Chen Li (陈力) is currently a Ph.D. student in the Department of Computer Science and Engineering at The Hong Kong University of Science and Technology, working on data center networking under the supervision of Prof. Kai Chen. He is a Microsoft Research Asia Ph.D. Fellow and has published 10+ peer-reviewed papers in top journals and conferences. His network acceleration subsystems and scheduling algorithms have been deployed in big data systems at Huawei and Tencent.
Prior to the current program, Li received his Master of Philosophy and Bachelor of Engineering (with First Class Honors and Minor in Mathematics) in Electronic and Computer Engineering from HKUST in 2013 and 2011, respectively. Full CV and references are available upon request.
Ph.D. in Computer Science
HKUST (2013 - 2018)
M.Phil in Electronic & Computer Engineering
HKUST (2011 - 2013)
B.Eng in Electronic & Computer Engineering
HKUST (2007 - 2011)
With the rapid growth of model complexity and data volume, deep learning systems require more and more servers for parallel training. Today, multi-server, multi-GPU deep learning systems are usually deployed within a single cluster, which typically employs an InfiniBand fabric to support Remote Direct Memory Access (RDMA) and thereby achieve high throughput and low latency for inter-server transfers. As models and data keep growing, deep learning systems will have to scale across multiple network clusters, which requires a highly efficient inter-cluster networking stack with RDMA support. Since InfiniBand is only suited for clusters of at most a few thousand servers, we believe RDMA over Converged Ethernet (RoCE) is the more appropriate networking technology for multi-cluster, datacenter-scale deep learning. We therefore endeavor to incorporate RoCE as the networking technology for deep learning systems such as TensorFlow and Tencent's Amber.
Angel is Tencent's in-house large-scale machine learning framework. In cooperation with the Technology Engineering Group (TEG), we developed a network accelerator for it: via algorithm-specific flow scheduling, we achieved a 70x reduction in job completion time compared to vanilla Apache Spark.
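To give a flavor of why flow scheduling helps, the sketch below (illustrative only, not the actual Angel/TEG accelerator; all function names are hypothetical) compares average flow completion time under arrival-order service versus size-aware, shortest-first service on a single bottleneck link. When one large flow arrives ahead of many small ones, serving the small flows first dramatically lowers the average, which is the core intuition behind size- and algorithm-aware scheduling.

```python
# Toy model: flows share one link and are served to completion, one at a
# time. This is a simplification for illustration, not a datacenter model.

def avg_fct_fifo(flow_sizes, bandwidth=1.0):
    """Average flow completion time when flows are served in arrival order."""
    now, total = 0.0, 0.0
    for size in flow_sizes:
        now += size / bandwidth   # this flow finishes after its transfer time
        total += now              # accumulate its completion time
    return total / len(flow_sizes)

def avg_fct_sjf(flow_sizes, bandwidth=1.0):
    """Average flow completion time under shortest-flow-first ordering."""
    return avg_fct_fifo(sorted(flow_sizes), bandwidth)

if __name__ == "__main__":
    # One large flow (100 units) arrives ahead of nine small flows (1 unit).
    flows = [100.0] + [1.0] * 9
    print(avg_fct_fifo(flows))  # 104.5 -- the large flow blocks everyone
    print(avg_fct_sjf(flows))   # 15.4  -- small flows finish first
```

Shortest-first is optimal for average completion time on a single link; real datacenter schedulers must additionally handle unknown flow sizes, multiple links, and application-level dependencies, which is where algorithm-specific knowledge comes in.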
Datacenters exist because a standalone server or rack can no longer meet the requirements of modern applications: web search, ad recommendation, online commerce, machine learning, and more. Unlike traditional networks, data center networks offer high bandwidth, low latency, and minimal packet loss. These features are not fully utilized today, however, because application developers are usually unfamiliar with the datacenter environment and/or the networking stack and its tuning. We aim to design a system that lets application developers access networking functions in datacenters and unlock their full potential.
Certified Instructor at NVIDIA Deep Learning Institute (Oct 2017 - Now):
Teaching Assistant at HKUST:
MSRA Ph.D. Fellowship (2011 - Now)
HKUST Postgraduate Studentship (2013)
HKUST Research Travel Grant (2010)
Meritorious Winner, Mathematical Contest in Modeling (2010)
The Commercial Radio 50th Anniversary Scholarship (2007 - 2011)
HKUST Scholarship for Continuing UG Students