Deep Reinforcement Learning in Urban Computing

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Deep Reinforcement Learning in Urban Computing"

By

Miss Yexin LI


Abstract

Urban systems such as ride-sharing, express delivery, take-out food
delivery, and emergency medical services are now widely deployed in major
cities. Although they have modernized and greatly facilitated citizens'
daily life, these systems face severe operational challenges, for example
how to match passengers to drivers in a ride-sharing system, or how to
dispatch couriers in real time in an express system. Previously,
operational problems in urban systems were often tackled by methods from
operations research, e.g. optimization, or by
heuristic algorithms based on practical system settings. Because an urban
system often needs to generate a sequence of real-time actions that
maximizes the total reward over the long run, reinforcement learning is a
natural choice. Moreover, as such a system is often large and complex,
deep learning methods are needed to capture its rich, representative
features.

In this thesis, we investigate how Deep Reinforcement Learning (DRL) can
effectively learn operation policies for urban systems. Depending on how
an urban system operates, either Central-Agent Reinforcement Learning
(CARL) or Multi-Agent Reinforcement Learning (MARL) can be chosen to
describe its operation process. For a system whose operation is described
by CARL, we focus on how to properly formulate the problem and design each
component of the model, i.e. the state, action, and immediate reward, so
as to optimize the final objective of the system. We adopt the take-out
food delivery system as an example and propose a Deep Reinforcement Order
Packing model (DROP) to solve its operation problem. For a system whose
operation can be described
by MARL, besides designing each component of the model, we also try to
guarantee that agents in the system cooperate with each other properly.
We adopt the express system as an example, in which many couriers work,
and propose a Deep Reinforcement Courier Dispatching model (DRCD) to solve
its operation problem. DRCD can guarantee cooperation among couriers to
some extent, but not globally. We therefore further propose a Cooperative
Multi-Agent Reinforcement Learning model (CMARL), which guarantees
cooperation among couriers globally by incorporating another Markov
Decision Process along the agent sequence. Experiments based on real-world
data are conducted to confirm the superiority of DROP, DRCD, and CMARL
over the baselines. Besides cooperation, competition among agents also
exists in MARL, although it is not common in modern urban systems. We
briefly discuss this scenario at the end for completeness.


Date:			Wednesday, 11 November 2020

Time:			2:00pm - 4:00pm

Zoom Meeting: 
https://hkust.zoom.com.cn/j/98029257612?pwd=RTkvOVJSbGU2dnpseWtnQ2NndVJJZz09

Chairperson:		Prof. Kani CHEN (MATH)

Committee Members:	Prof. Qiang YANG (Supervisor)
 			Prof. Kai CHEN
 			Prof. Qiong LUO
 			Prof. Hai YANG (CIVL)
 			Prof. Jiannong CAO (PolyU)

**** ALL are Welcome ****