Scaling Graph Neural Network Training on Large Graphs for Effectiveness and Efficiency

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Scaling Graph Neural Network Training on Large Graphs for 
Effectiveness and Efficiency"

By

Miss Jingshu PENG


Abstract

Graph data is prevalent thanks to its ability to model many real-life 
applications woven from huge and complex networks of relations and 
interactions. Recently, graph neural networks (GNNs) have come into the 
spotlight as a powerful machine learning tool, owing to their great 
success in modeling the tangled relations in graph data. However, despite 
their promising performance, the major challenge that limits the adoption 
and deployment of GNNs on large-scale real-world graphs lies in the 
inability to utilize all the data in finite time and in the limited 
scalability of the algorithms themselves.

To mitigate the memory requirement, tremendous effort has been devoted to 
sampling-based mini-batch GNN training. Various sampling strategies have 
been proposed to improve the training efficiency and scalability of 
current GNN models over large-scale graphs, such as node-wise sampling, 
layer-wise importance sampling, and the fundamentally different subgraph 
sampling. Although existing samplers help GNNs scale to large graphs, the 
information loss from insufficiently extracting higher-order neighbors 
inevitably limits the expressive power of these shallow architectures. 
Moreover, the size of the graph data and the complexity of the model can 
easily exceed the capacity of a single device.
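
As a rough, generic illustration of node-wise sampling (not the specific 
samplers studied in this thesis), the Python sketch below draws a bounded 
number of neighbors per node, layer by layer, for a mini-batch of target 
nodes; the adjacency-list format and the fanout values are illustrative 
assumptions.

    # Minimal sketch of node-wise neighbor sampling for mini-batch GNN
    # training (GraphSAGE-style); fanouts and depth are illustrative only.
    import random

    def sample_neighbors(adj, seeds, fanouts):
        """adj: dict node -> list of neighbors; seeds: mini-batch targets;
        fanouts: per-layer sample sizes, e.g. [10, 5] for a 2-layer GNN.
        Returns the sampled edges for each layer."""
        layers, frontier = [], list(seeds)
        for fanout in fanouts:
            edges, next_frontier = [], set()
            for v in frontier:
                nbrs = adj.get(v, [])
                picked = random.sample(nbrs, min(fanout, len(nbrs)))
                edges.extend((u, v) for u in picked)  # u aggregates into v
                next_frontier.update(picked)
            layers.append(edges)
            frontier = list(next_frontier)
        return layers

    # Toy usage: a tiny graph and a mini-batch of two target nodes.
    adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
    print(sample_neighbors(adj, seeds=[0, 3], fanouts=[2, 2]))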

To accommodate the ever-increasing graph data and model sizes, distributed 
GNN processing is the inevitable remedy. Yet a distributed implementation 
of GNN learning algorithms, especially an efficient one, is far from 
trivial. Aside from their substantial memory footprint, distributed GNNs 
are both memory-intensive and compute-intensive, owing to the coupling of 
irregular neighborhood access with an iterative learning procedure. 
Consequently, the intensity and volume of data communication, together 
with workload balancing, make distributed GNN training even more 
challenging. Some efforts have been made towards distributed GNN training, 
but at the price of heavy preprocessing, complex workflows, sampling 
overhead, information loss, and a lack of convergence guarantees.

In this thesis, we investigate the scaling of GNN training on large graphs 
in pursuit of effectiveness and efficiency. First, we further advance the 
effectiveness of sampling-based GNNs in a single-device environment by 
proposing an adaptive and structure-aware importance-based subgraph 
sampler that enables adaptive GNNs to benefit from higher-order neighbor 
information. Evaluation on eight benchmark datasets demonstrates the 
competitive performance of our approach against six state-of-the-art 
(SOTA) approaches.
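
Purely as background on how subgraph sampling differs from node-wise 
sampling (the sketch is not the adaptive, structure-aware sampler proposed 
here), one common recipe samples a small subgraph, e.g. via random walks, 
and then trains the GNN on that subgraph in full; the walk parameters 
below are illustrative assumptions.

    # Generic random-walk-based subgraph sampling sketch; NOT the sampler
    # proposed in this thesis, only an illustration of the paradigm.
    import random

    def random_walk_subgraph(adj, num_roots=4, walk_len=3):
        """Collect nodes touched by short random walks from random roots,
        then induce the subgraph on those nodes."""
        nodes = set()
        roots = random.sample(list(adj), min(num_roots, len(adj)))
        for root in roots:
            v = root
            nodes.add(v)
            for _ in range(walk_len):
                nbrs = adj.get(v, [])
                if not nbrs:
                    break
                v = random.choice(nbrs)
                nodes.add(v)
        # Keep only edges whose endpoints are both sampled.
        return {v: [u for u in adj[v] if u in nodes] for v in nodes}

    adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
    print(random_walk_subgraph(adj))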

Second, we study efficient distributed GNN processing over large graphs 
while preserving effectiveness. We identify that one major archenemy of 
efficient distributed GNN processing is cross-device data communication, 
incurred by the data movement required for GNN aggregation. To avoid this 
communication, we propose to abstract decentralized GNN processing as 
sequential matrix multiplications and to use historical embeddings via a 
cache. Theoretically, we show bounded approximation errors of embeddings 
and gradients with a convergence guarantee. Empirically, evaluation on 
five large-scale benchmark datasets with prevalent GNN models under four 
different system setups shows its generality and superior efficiency 
while preserving effectiveness, compared to five SOTA systems. Its 
performance with little or no accuracy loss is consistent with our 
theoretical findings.
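
To give a flavor of this abstraction (a minimal sketch only, with 
partitioning, cache-refresh policy, and tensor shapes assumed for 
illustration rather than taken from the proposed system), one GNN layer on 
a single worker can combine locally owned embeddings with cached 
historical embeddings of remote neighbors, so that no cross-device 
communication is needed inside the layer:

    # One GNN layer on one partition, reading remote neighbors from a
    # local cache of (stale) historical embeddings instead of communicating.
    import torch

    def local_layer(A_hat_local, H_local, H_remote_cache, W, act=torch.relu):
        """H' = act(A_hat [H_local; H_remote_cache] W).
        A_hat_local:    (n_local, n_local + n_remote) normalized adjacency
                        rows owned by this worker.
        H_local:        (n_local, d) embeddings of locally owned nodes.
        H_remote_cache: (n_remote, d) cached historical embeddings of
                        remote neighbors (no communication).
        W:              (d, d_out) layer weight."""
        H_full = torch.cat([H_local, H_remote_cache], dim=0)
        return act(A_hat_local @ H_full @ W)

    # Toy usage: 3 local nodes, 2 cached remote neighbors, 4-dim features.
    torch.manual_seed(0)
    A_hat_local = torch.rand(3, 5)
    H_local, H_remote_cache = torch.rand(3, 4), torch.rand(2, 4)
    W = torch.rand(4, 8)
    print(local_layer(A_hat_local, H_local, H_remote_cache, W).shape)  # (3, 8)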

Finally, we conclude this thesis with future directions and challenges.


Date:			Thursday, 18 August 2022

Time:			2:00pm - 4:00pm

Zoom Meeting: 
https://hkust.zoom.us/j/98633747394?pwd=UVY0NFlkYmw0UmlXelJVdXVsTkNyQT09

Chairperson:		Prof. Wai Ho MOW (ECE)

Committee Members:	Prof. Lei CHEN (Supervisor)
 			Prof. Lionel NI (Supervisor)
 			Prof. Qiong LUO
 			Prof. Xiaofang ZHOU
 			Prof. Can YANG (MATH)
 			Prof. Jianliang XU (HKBU)


**** ALL are Welcome ****