Unsupervised Video Object Segmentation for Deep Reinforcement Learning

Speaker:        Professor Pascal Poupart
                University of Waterloo

Title:          "Unsupervised Video Object Segmentation for Deep
                 Reinforcement Learning"

Date:           Tuesday, 20 August 2019

Time:           11:00am - 12 noon

Venue:          Room 4472 (via lift no. 25/26), HKUST

Abstract:

I will present a new technique for deep reinforcement learning that
automatically detects moving objects and uses the relevant information for
action selection. The detection of moving objects is done in an
unsupervised way by exploiting structure from motion. Instead of directly
learning a policy from raw images, the agent first learns to detect and
segment moving objects by exploiting flow information in video sequences.
The learned representation is then used to focus the policy of the agent
on the moving objects. Over time, the agent identifies which objects are
critical for decision making and gradually builds a policy based on
relevant moving objects. This approach, which we call Motion-Oriented
REinforcement Learning (MOREL), is demonstrated on a suite of Atari games
where the ability to detect moving objects reduces the amount of
interaction needed with the environment to obtain a good policy.
Furthermore, the resulting policy is more interpretable than policies that
directly map images to actions or values with a black box neural network.
We can gain insight into the policy by inspecting the segmentation and
motion of each object detected by the agent. This allows practitioners to
confirm whether a policy is making decisions based on sensible
information. Our code is available at https://github.com/vik-goel/MOREL


******************
Biography:

Pascal Poupart is a Full Professor in the David R. Cheriton School of
Computer Science at the University of Waterloo, Waterloo (Canada). He is
the Research Director of the Waterloo Borealis AI Research Lab funded by
the Royal Bank of Canada. He is a faculty member of the Waterloo AI
Institute and the Vector Institute. He serves as scientific advisor for
Huawei Technologies and ProNavigator. He received the B.Sc. in Mathematics
and Computer Science at McGill University, Montreal (Canada) in 1998, the
M.Sc. in Computer Science at the University of British Columbia, Vancouver
(Canada) in 2000 and the Ph.D. in Computer Science at the University of
Toronto, Toronto (Canada) in 2005. His research focuses on the development
of algorithms for reasoning under uncertainty and machine learning with
application to Assistive Technologies, Natural Language Processing and
Telecommunication Networks. He is best known for his contributions to
the development of approximate scalable algorithms for partially
observable Markov decision processes (POMDPs) and their applications in
real-world problems, including spoken dialog management and automated
handwashing prompts for people with dementia. Other
notable projects that his research team are currently working on include
deep learning with clear semantics, structure learning, personalized
transfer learning, conversational agents, adaptive satisfiability, sports
analytics and stress detection based on wearable devices.

Pascal Poupart received a Cheriton Faculty Fellowship (2015-2018), a best
student paper honourable mention (SAT-2017), a silver medal at the
SAT-2017 competition, an outstanding collaborator award from Huawei Noah's
Ark (2016), a top reviewer award (ICML-2016), a gold medal at the SAT-2016
competition, a best reviewer award (NIPS-2015), an Early Researcher Award
from the Ontario Ministry of Research and Innovation (2008), two Google
research awards (2007-2008), a best paper award runner-up (UAI-2008) and
the IAPR best paper award (ICVS-2007). He serves as associate editor of
the Journal of Artificial Intelligence Research (JAIR) (2017 - present),
member of the editorial board of the Journal of Machine Learning Research
(JMLR) (2009 - present) and guest editor for the Machine Learning Journal
(MLJ) (2012 - present). He routinely serves as area chair or senior
program committee member for NIPS, ICML, AISTATS, IJCAI, AAAI and UAI. His
research collaborators include Borealis AI, Huawei Technologies, Google,
Intel, Ford, ProNavigator, SportLogic, Scribendi, Kik Interactive, In the
Chat, Slyce, HockeyTech, the Alzheimer Association, the UW-Schlegel
Research Institute for Aging, Sunnybrook Health Science Centre and the
Toronto Rehabilitation Institute.