Flow Scheduling for Parallel Computing Applications in Datacenters

PhD Thesis Proposal Defence


Title: "Flow Scheduling for Parallel Computing Applications in Datacenters"

by

Mr. Li CHEN


Abstract:

Distributed and parallel computing systems are cornerstones of this era of
Big Data, machine learning, and artificial intelligence. Such systems span
hundreds or thousands of machines in one or more datacenters, so as to cope
with the ever-expanding data volume and the increasing complexity of
models and problems. For example, many recent applications, such as web
search, recommendation systems, and deep networks, run on clusters of
thousands of machines at both small companies and large enterprises. At
such scale, the communication between machines becomes a bottleneck, and
the scheduling of communication sessions within applications (or network
flows) is a key factor in the acceleration of these applications.

This thesis focuses on flow scheduling problems in datacenters.
Specifically, we look at three important scheduling problems in real-world
datacenter applications:

  1. Scheduling general flows for applications without knowledge of flow
size, such as database query/response. We adapted the Multi-Level Feedback
Queue from operating systems to network flows, and developed a queueing
theory model to determine the optimal parameter settings.

  2. Scheduling flows with or without completion time constraints
(deadlines), which affect user-facing applications, such as web search.
We constructed a systematic solution for this problem, and derived a
congestion window update function using Lyapunov optimization techniques
from control theory.

  3. Scheduling groups of flows with semantic relationships (coflows) with
application transparency. Coflows emerge from data processing pipelines,
such as Hadoop MapReduce. We used unsupervised learning to identify
relationships between flows, and we designed an error-tolerant scheduling
scheme to mitigate the impact of misidentification.
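
As a rough illustration of the first problem, the sketch below shows how a
multi-level feedback queue can prioritize flows without knowing their
sizes: a flow starts in the highest-priority queue and is demoted as the
number of bytes it has sent crosses each threshold. This is a minimal
sketch and not the thesis implementation. The function name and the
threshold values are hypothetical; the thesis determines the parameter
settings from a queueing theory model.

```python
from bisect import bisect_right

# Hypothetical demotion thresholds in bytes (assumed values for illustration
# only; they are not taken from the thesis).
THRESHOLDS = [100_000, 1_000_000, 10_000_000]


def queue_for(bytes_sent, thresholds=THRESHOLDS):
    """Return the queue index for a flow that has sent `bytes_sent` bytes.

    Queue 0 has the highest priority. A flow moves down one queue each time
    its sent-byte count reaches the next threshold, so short flows finish in
    high-priority queues while long flows drift to low-priority ones.
    """
    return bisect_right(thresholds, bytes_sent)


if __name__ == "__main__":
    for sent in (0, 50_000, 100_000, 5_000_000, 20_000_000):
        print(sent, "bytes sent -> queue", queue_for(sent))
```

With K thresholds there are K + 1 queues, and a scheduler would serve
lower-numbered queues first.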

We present the proposed solution for each problem, and demonstrate its
effectiveness via extensive simulations and experiments.


Date:                   Friday, 2 December 2016

Time:                   2:00pm - 4:00pm

Venue:                  Room 3494
                        (lifts 25/26)

Committee Members:      Dr. Kai Chen (Supervisor)
                        Prof. Gary Chan (Chairperson)
                        Dr. Wei Wang
                        Prof. Qian Zhang


**** ALL are Welcome ****