Exploring Deep Learning Architectures for Spatiotemporal Sequence Forecasting

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Exploring Deep Learning Architectures for Spatiotemporal Sequence Forecasting"


Mr. Xingjian SHI


Spatiotemporal systems are common in the real world. Forecasting the 
multi-step future of these systems based on past 
observations, known as Spatiotemporal Sequence Forecasting (STSF), is a 
significant and challenging problem. Due to the complex spatial and 
temporal relationships within the data and the potential long forecasting 
horizon, it is challenging to design appropriate Deep Learning (DL) 
architectures for STSF. In this thesis, we explore DL architectures for 
STSF. We first define the STSF problem and classify it into three 
subcategories: Trajectory Forecasting of Moving Point Cloud (TF-MPC), STSF 
on Regular Grid (STSF-RG), and STSF on Irregular Grid (STSF-IG). We then 
propose architectures for STSF-RG and STSF-IG problems.
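
As an illustrative formalization of the task described above, the 
multi-step forecast can be written as choosing the most likely future 
sequence given the recent past. The symbols here are assumptions 
introduced for this sketch and are not notation taken from the abstract: 
X_t is the observation at time t, J is the number of past observations 
used, and K is the forecasting horizon.

```latex
\hat{X}_{t+1:t+K}
  = \operatorname*{arg\,max}_{X_{t+1:t+K}}
    \; p\left(X_{t+1:t+K} \mid X_{t-J+1:t}\right)
```

Under this reading, the three subcategories differ in what each X_t is: 
the coordinates of a set of moving points (TF-MPC), values on a regular 
grid (STSF-RG), or values at irregularly placed locations (STSF-IG).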

For STSF-RG problems, we proposed the Convolutional Long Short-Term Memory 
(ConvLSTM) and the Trajectory Gated Recurrent Unit (TrajGRU). ConvLSTM 
uses convolution in both the input-to-state and state-to-state transitions 
of LSTM and is better at capturing spatiotemporal correlations than the 
Fully-connected LSTM (FC-LSTM). TrajGRU improves upon ConvLSTM by actively 
learning the recurrent connection structure, which achieves better 
prediction performance with fewer parameters. To better 
investigate the effectiveness of our proposed architectures and other DL 
models for STSF-RG, we chose to tackle the precipitation nowcasting 
problem, which is a representative STSF-RG problem with huge real-world 
impact. By incorporating ConvLSTM into an Encoder-Forecaster (EF) 
structure, we proposed the first machine learning based solution to 
precipitation nowcasting that outperforms the operational algorithm. To 
facilitate future studies for this problem and gauge the state of the art, 
we proposed the first large-scale benchmark for precipitation nowcasting: 
HKO-7. HKO-7 introduces new evaluation metrics, and its evaluation protocol 
covers both an offline and an online setting. We evaluated seven models in 
both settings. Experimental results show that 1) 
all deep learning models outperform the optical flow based models, 2) 
TrajGRU attains the best overall performance among deep learning models, 
and 3) models consistently perform better in the online setting.
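
To make the ConvLSTM idea above concrete, the following is a minimal 
sketch of one recurrent step in which both the input-to-state and the 
state-to-state transitions are convolutions. It is not the thesis 
implementation. It assumes a single channel, 3x3 kernels with zero 
padding, and no peephole terms; the names conv2d_same, gate, 
convlstm_step and random_params are hypothetical helpers written for this 
illustration, and the weights are random rather than trained.

```python
import math
import random


def conv2d_same(grid, kernel):
    """Single-channel 2-D cross-correlation with zero padding (same size out)."""
    h, w = len(grid), len(grid[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[i][j] * grid[rr][cc]
            out[r][c] = acc
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate(x, h_prev, w_x, w_h, bias, act):
    """Compute act(conv(x, w_x) + conv(h_prev, w_h) + bias) element-wise."""
    a = conv2d_same(x, w_x)        # input-to-state convolution
    b = conv2d_same(h_prev, w_h)   # state-to-state convolution
    rows, cols = len(x), len(x[0])
    return [[act(a[r][c] + b[r][c] + bias) for c in range(cols)]
            for r in range(rows)]


def convlstm_step(x, h_prev, c_prev, params):
    """One simplified ConvLSTM step (assumption: peephole terms omitted)."""
    i = gate(x, h_prev, *params["i"], sigmoid)    # input gate
    f = gate(x, h_prev, *params["f"], sigmoid)    # forget gate
    o = gate(x, h_prev, *params["o"], sigmoid)    # output gate
    g = gate(x, h_prev, *params["g"], math.tanh)  # candidate cell state
    rows, cols = len(x), len(x[0])
    c_new = [[f[r][c] * c_prev[r][c] + i[r][c] * g[r][c]
              for c in range(cols)] for r in range(rows)]
    h_new = [[o[r][c] * math.tanh(c_new[r][c])
              for c in range(cols)] for r in range(rows)]
    return h_new, c_new


def random_params(rng, k=3):
    """Hypothetical untrained weights: (input kernel, state kernel, bias)."""
    def kernel():
        return [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(k)]
    return {name: (kernel(), kernel(), 0.0) for name in ("i", "f", "o", "g")}


# Run three random 4x5 frames through the cell, starting from zero states.
rng = random.Random(0)
params = random_params(rng)
frames = [[[rng.random() for _ in range(5)] for _ in range(4)]
          for _ in range(3)]
h = [[0.0] * 5 for _ in range(4)]
c = [[0.0] * 5 for _ in range(4)]
for x in frames:
    h, c = convlstm_step(x, h, c, params)
```

Because every gate is computed by convolution, the hidden and cell states 
keep the spatial layout of the input grid, whereas a fully connected LSTM 
would flatten each frame into a vector.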

For STSF-IG problems, we converted the sparsely distributed observations 
into data on a spatiotemporal graph and utilized graph convolution 
operators, or graph aggregators, to build the model. We proposed a new 
graph aggregator called Gated Attention Network (GaAN). GaAN not only uses 
multiple attention heads to aggregate information from the neighborhoods 
but also uses another set of gates to control each attention head's 
importance. With experiments on two large-scale inductive node 
classification datasets, we showed that GaAN outperforms the baseline 
graph aggregators. We also proposed a unified framework called Graph 
GRU (GGRU), which transforms any valid graph aggregator into an RNN 
designed for STSF-IG. We compared GGRU with other state-of-the-art methods 
in traffic speed forecasting and found that it achieves the best overall 
performance.
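
The gated multi-head aggregation described above can be sketched as 
follows. This is an illustration under stated assumptions, not the GaAN 
implementation: it assumes dot-product attention with per-head query, key 
and value matrices, and it computes each head's gate from the center 
node's features concatenated with the mean of its neighbors' features. 
The function gated_attention_aggregate and all weights are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_attention_aggregate(center, neighbors, heads, gate_weights):
    """Aggregate neighbor features with one scalar gate per attention head.

    heads: list of dicts holding hypothetical 'q', 'k', 'v' matrices.
    gate_weights: one weight vector per head, applied to the concatenation
    of the center features and the mean neighbor features (an assumption).
    """
    dim = len(center)
    mean_nb = [sum(n[d] for n in neighbors) / len(neighbors)
               for d in range(dim)]
    gate_in = center + mean_nb
    gates = [sigmoid(dot(w, gate_in)) for w in gate_weights]

    out = []
    for head, g in zip(heads, gates):
        q = matvec(head["q"], center)
        keys = [matvec(head["k"], n) for n in neighbors]
        vals = [matvec(head["v"], n) for n in neighbors]
        attn = softmax([dot(q, k) for k in keys])
        agg = [sum(a * v[d] for a, v in zip(attn, vals))
               for d in range(len(vals[0]))]
        out.extend(g * x for x in agg)  # the gate scales the whole head
    return out, gates


# Two heads with identity projections; the second gate is pushed close to 1.
eye = [[1.0, 0.0], [0.0, 1.0]]
heads = [{"q": eye, "k": eye, "v": eye}, {"q": eye, "k": eye, "v": eye}]
gate_weights = [[0.0, 0.0, 0.0, 0.0], [10.0, 0.0, 0.0, 0.0]]
out, gates = gated_attention_aggregate(
    [1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]], heads, gate_weights)
```

In this toy run both heads compute the same attention output, so the only 
difference between the two halves of out is the per-head gate value, 
which is the extra control that the gates add on top of plain multi-head 
attention.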

Date:			Wednesday, 31 October 2018

Time:			10:00am - 12:00 noon

Venue:			Room 2408
 			Lifts 17/18

Chairman:		Prof. Tao Liu (PHYS)

Committee Members:	Prof. Dit-Yan Yeung (Supervisor)
 			Prof. Yangqiu Song
 			Prof. Raymond Wong
 			Prof. Weichuan Yu (ECE)
 			Prof. Michael Lyu (CUHK)

**** ALL are Welcome ****