PhD Thesis Proposal Defence



Mr. Shiyao MA


Fair allocation of network resources for data-parallel applications is a
challenging undertaking. For one thing, the conflict between the increasing
volume of communications and limited link bandwidth is intensifying with the
popularization of big data. Moreover, the distributed nature of data-parallel
tasks gives rise to a correlated traffic pattern: a job is considered
completed only when its coflow (the flows of all its constituent tasks) has
finished, which renders per-flow fairness schemes inapplicable. In the face of
these challenges, this thesis presents a systematic study of how to ensure the
progress of network communications in data-parallel applications.

Our first insight is that data locality should be exploited to reduce network
transfers, thus accelerating application progress and alleviating network
contention. This is of critical importance to data-processing applications such
as Hadoop and Spark, which spend a large amount of time reading input blocks
scattered across data servers. We propose Custody, a cluster management
framework that transparently retrieves locality information of input data
blocks and allocates machines with local data to applications in a fair
fashion by solving the data-aware resource sharing problem.

Even when data locality is exploited, network transfers remain inevitable and
are oftentimes enormous, e.g., in the shuffle phase of services such as web
search, video analytics and graph processing. Therefore, network isolation
should be provided so that the worst-case performance of each service is
assured. We observe that such an isolation guarantee can be maximized by
careful placement of tasks. We propose a two-step allocation scheme that
first coordinates the placement of tasks based on access link status and the
bandwidth demands of each application, and then enforces the bandwidth
allocation of tasks within an application.

While per-application network isolation is an ideal pursuit that ensures the
progress of each application, it nonetheless drags down overall performance.
The situation becomes more severe when applications run under hard deadline
requirements. Our next endeavor is to share the network links in a fair
fashion so as to meet the deadlines of as many applications as possible.
Existing flow-level scheduling schemes are insufficient to guarantee
coflow-level application performance, since a coflow can meet its deadline
only when all its constituent flows finish on time. We present Chronos, a
scheduling framework that captures the correlation of flows belonging to the
same coflow and allocates network resources among multiple concurrent coflows
with deadlines in mind.

Date:                   Wednesday, 28 February 2018

Time:                   3:30pm - 5:30pm

Venue:                  Room 1504
                        (lifts 25/26)

Committee Members:      Prof. Bo Li (Supervisor)
                        Prof. Lei Chen (Chairperson)
                        Dr. Brahim Bensaou
                        Dr. Kai Chen

**** ALL are Welcome ****