Rupam Mahmood

Assistant Professor, Faculty of Science - Computing Science



Assistant Professor

Canada CIFAR AI Chair

PI RLAI; Fellow at Amii

Department of Computing Science

University of Alberta

Personal website:



My research objective is to understand scalable and general learning mechanisms for continually improving agents. To this end, I develop reinforcement and representation learning algorithms and real-time learning systems for controlling physical robots. Currently, I am working on the following new research program.

Continual robot learning with onboard computers

A crucial aspect of intelligence is the ability to learn continually and adapt to changes while interacting with the environment. Although natural intelligence shows this ability innately, current approaches to artificial intelligence (AI) exhibit it only partially. Current AI systems do not learn continually: they first learn from a large stored dataset by replaying samples repeatedly and are then deployed in the real world, where they interact and perform, typically without further learning. Continual learning entails simultaneously interacting and adapting to changes while retaining useful past knowledge. My research program aims to develop approaches, algorithms, and real-time systems that enable continual learning for real-world robots using only onboard computers.

Continual learning is a frontier challenge in contemporary AI research, and achieving it is vital for real-time systems such as robots. It is distinctly challenging on robots because learning must occur on their resource-constrained onboard computers. These computers cannot support large-scale computation or provide the memory necessary for storing large datasets, both of which are essential for current approaches. Continual robot learning with onboard computers calls for new learning algorithms and a richer understanding of real-time learning systems. This research program aims to understand the challenges that current learning approaches face in continual learning and to develop algorithms and real-time systems for physical robots that overcome them. Algorithms developed for this effort are designed to be scalable, general, and applicable beyond onboard learning, making them suitable for advancing general intelligence through large-scale computation.
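As a minimal illustrative sketch of the memory constraint described above (a toy example, not one of the algorithms from this research program), the Python snippet below tracks a drifting signal with a constant step-size incremental update. Each sample is used once and then discarded, so memory use stays constant and no dataset is stored or replayed. The function names, the step size, and the synthetic signal are all assumptions chosen for illustration.

```python
import random


def track_online(stream, step_size=0.1):
    """Incrementally track a drifting signal using O(1) memory (toy example)."""
    estimate = 0.0
    for observation in stream:
        # Move the estimate a fraction of the way toward the newest sample.
        # The sample is then discarded rather than stored for replay.
        estimate += step_size * (observation - estimate)
        yield estimate


def drifting_signal(n_steps, seed=0):
    """Hypothetical noisy signal whose mean jumps from 1.0 to 5.0 halfway through."""
    rng = random.Random(seed)
    for t in range(n_steps):
        target = 1.0 if t < n_steps // 2 else 5.0
        yield target + rng.gauss(0.0, 0.1)


estimates = list(track_online(drifting_signal(400)))
print(f"estimate before the change: {estimates[199]:.2f}")
print(f"estimate after the change:  {estimates[399]:.2f}")
```

Because the step size is constant, recent samples always carry weight, so the estimate follows the change in the signal instead of averaging over all past data. A larger step size adapts faster but is noisier.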


CMPUT 652: Reinforcement Learning with Robots (Fall 2019)

In this course, we will study the foundations of RL to be able to develop policy learning methods and learn about systematic ways of studying a real-time system to reveal the uncertainties involved in real-world tasks. This investigation will allow us to understand the differences between real-world and standard simulated tasks so that we can adapt task setups and algorithmic implementations to the real world as well as enhance the simulated tasks to incorporate the additional challenges in real-time systems. En route, we will learn about other promising approaches to learning in robotics that are not performed in real-time, such as learning from demonstration and simulation-to-reality transfer.

CMPUT 397: Reinforcement Learning (Winter 2020)

In this course, we study the design, analysis, and applications of reinforcement learning agents that interact with a complex, uncertain world to achieve a goal. We will emphasize agents that can make near-optimal decisions in a timely manner with incomplete information and limited computational resources. The course will cover Markov decision processes, reinforcement learning, planning, function approximation (online supervised learning), and real-world applications. The course will take an information-processing approach to the concept of mind and briefly touch on perspectives from artificial intelligence, psychology, neuroscience, philosophy, and robotics.

The course will use a recent MOOC on Reinforcement Learning, created by the instructors of this course. Much of the lecture material and many of the assignments will come from the MOOC. In-class time will be spent largely on discussing and thinking about the material, with some supplementary lectures.

Prerequisites: The course will use Python 3. You should either know Python 3 or be experienced enough with programming in other languages to learn Python 3 quickly. We will use elementary ideas of probability, calculus, and linear algebra, such as expectations of random variables, conditional expectations, partial derivatives, vectors, and matrices. Students should either be familiar with these topics or be ready to pick them up quickly as needed by consulting outside resources or the teaching assistants.