Towards Efficient, Secure and Cost-effective Large-Scale Systems for Machine Learning

PhD Thesis Proposal Defence


Title: "Towards Efficient, Secure and Cost-effective Large-Scale Systems for
Machine Learning"

by

Mr. Chengliang ZHANG


Abstract:

Machine learning techniques have advanced in leaps and bounds in the past 
decade. As ML's success critically relies on the abundant computing power 
and the availability of big data, it is impractical to host ML 
applications on a single machine. By distributing ML workload and training 
data across multiple machines, we are able to substantially improve the 
productivity of ML applications. As large-scale ML applications are 
increasingly deployed in production systems, how to improve efficiency, 
protect data security, and reduce cost have become pressing needs in the 
deployment of large-scale ML applications. Specifically, there are three 
unique challenges that must be addressed. First, how to efficiently train 
an ML model in a cluster in the presence of heterogeneity? Second, once 
the model is trained, how do we serve it with minimal cost while 
maintaining service-level objectives (SLOs)? Lastly, with federated 
learning (FL) proposed to protect data privacy, how to practice it 
without compromising training speed and model quality becomes a pressing 
problem.

Unfortunately, existing works do not provide satisfactory solutions to the 
three challenges. First, traditional ML systems often conduct asynchronous 
training to improve resource utilization. While it maximizes the rate of 
updates, the price paid is degraded training quality. Second, ML serving 
is much more computation-intensive and harder to scale; applying generic 
cloud scaling methods to ML serving can lead to high resource wastage and 
poor latency performance. Third, Homomorphic Encryption (HE) can be
conveniently adopted to preserve data privacy in FL without sacrificing 
model accuracy. However, HE induces prohibitively high computation and 
communication overheads which make it impractical for state-of-the-art 
models. To answer the above three unique challenges in large-scale ML 
systems, we profile, analyze, and propose new strategies to achieve 
efficiency, security, and cost-effectiveness.

To address the first problem, we propose a new distributed ML scheme, 
termed speculative synchronization. Our scheme allows workers to speculate 
about the recent parameter updates from others on the fly, and if necessary,
the workers abort the ongoing computation, pull fresher parameters, and 
start over to improve the quality of training. We implement our scheme and 
demonstrate that speculative synchronization achieves substantial speedups 
over the asynchronous parallel scheme with minimal communication overhead.
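The abort-and-restart logic can be sketched as a toy simulation (a hypothetical illustration with assumed names and a made-up staleness threshold, not the thesis implementation):

```python
STALENESS_LIMIT = 2  # abort and re-pull when this many versions behind (assumed threshold)

class ParameterServer:
    """Toy parameter server: a version counter plus a scalar parameter."""
    def __init__(self):
        self.version = 0
        self.params = 0.0

    def push(self, update):
        self.params += update
        self.version += 1

    def pull(self):
        return self.version, self.params

def speculative_worker(ps, updates, interference):
    """Apply `updates` one by one; `interference[i]` simulates pushes by
    other workers that arrive while step i is being computed."""
    aborts = 0
    for update, n_other in zip(updates, interference):
        base_version, params = ps.pull()
        for _ in range(n_other):              # concurrent pushes from other workers
            ps.push(0.1)
        # Speculation: cheaply peek at the server's version mid-computation.
        current_version, _ = ps.pull()
        if current_version - base_version >= STALENESS_LIMIT:
            aborts += 1                       # parameters too stale: discard work,
            base_version, params = ps.pull()  # pull fresher parameters, start over
        ps.push(update)                       # commit the update
    return aborts
```

The key point is that the version peek is far cheaper than a full parameter pull, so a worker only pays the restart cost when its view has actually drifted too far.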

Second, to tackle the dual challenge of SLO compliance and 
cost-effectiveness, we propose a general-purpose ML serving system called MArk
(Model Ark). To start, MArk dynamically batches requests and 
opportunistically serves them using accelerators for improved 
performance-cost ratio. Then, instead of relying on over-provisioning, 
MArk employs predictive autoscaling to hide the provisioning latency at 
low cost. Last, MArk exploits the stateless nature of inference serving by 
utilizing flexible, yet costly serverless instances to cover unexpected 
load spikes. We show that MArk can greatly reduce the serving cost while 
achieving better latency performance than popular industry solutions.
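The provisioning-and-spillover idea can be sketched as follows (a minimal sketch with an assumed per-VM capacity and headroom factor; the names and numbers are ours, not MArk's):

```python
import math

VM_CAPACITY = 100  # requests/s one VM-backed model server can sustain (assumed)

def plan_provisioning(predicted_load, headroom=1.1):
    """Predictive autoscaling: launch VMs ahead of time for the predicted
    load plus a small headroom, instead of over-provisioning for the peak."""
    return math.ceil(predicted_load * headroom / VM_CAPACITY)

def route(actual_load, n_vms):
    """Serve what the provisioned VMs can handle; spill the unexpected
    excess to serverless instances (flexible but costlier per request)."""
    vm_served = min(actual_load, n_vms * VM_CAPACITY)
    serverless_served = actual_load - vm_served
    return vm_served, serverless_served
```

Because inference is stateless, any overflow request can be sent to a serverless instance with no coordination, so the VM fleet only needs to track the predictable part of the load.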

Last, we present BatchCrypt, a system solution for cross-silo FL that 
significantly reduces the encryption and communication overhead caused by 
HE. Instead of encrypting individual gradients with full precision, we 
encode a batch of quantized gradients into a long integer and encrypt it 
in one go. To allow gradient-wise aggregation to be performed on 
ciphertexts of the encoded batches, we develop new quantization and 
encoding schemes along with a novel gradient clipping technique. Our 
evaluations confirm that BatchCrypt can effectively reduce the computation 
and communication overhead.
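The quantize-pack-decode pipeline can be illustrated with plain integers standing in for HE ciphertexts (all constants and helper names here are our own illustration, not BatchCrypt's actual code or encoding):

```python
BITS = 16                  # quantization width per gradient (assumed)
PAD = 4                    # carry bits: safe for sums over up to 2**PAD parties
SLOT = BITS + PAD          # bits reserved per gradient in the packed integer
CLIP = 1.0                 # gradients are clipped to [-CLIP, CLIP] beforehand
SCALE = (2 ** (BITS - 1) - 1) / CLIP
OFFSET = 2 ** (BITS - 1)   # shift so every quantized value is non-negative

def quantize(grads):
    """Map clipped float gradients to non-negative BITS-bit integers."""
    return [round(g * SCALE) + OFFSET for g in grads]

def encode(qgrads):
    """Pack a batch of quantized gradients into one long integer, SLOT bits
    each; in BatchCrypt this long integer is what gets encrypted in one go."""
    packed = 0
    for q in qgrads:
        packed = (packed << SLOT) | q
    return packed

def decode(packed, count, parties):
    """Unpack the sum of `parties` encoded batches back into float gradients."""
    mask = (1 << SLOT) - 1
    qsums = []
    for _ in range(count):
        qsums.append(packed & mask)
        packed >>= SLOT
    qsums.reverse()
    # subtract the accumulated offsets, then rescale
    return [(q - parties * OFFSET) / SCALE for q in qsums]
```

Plain integer addition of two encoded batches stands in for the additive homomorphic operation on ciphertexts: `decode(encode(a) + encode(b), ...)` recovers the element-wise gradient sum, which is exactly the aggregation FL needs, while the PAD carry bits keep one slot's sum from overflowing into its neighbor.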


Date:                   Friday, 12 June 2020

Time:                   3:00pm - 5:00pm

Zoom Meeting:           https://hkust.zoom.us/j/99778106038

Committee Members:      Dr. Wei Wang (Supervisor)
                        Prof. Bo Li (Chairperson)
                        Dr. Kai Chen
                        Prof. Qian Zhang


**** ALL are Welcome ****