Zero Copy Transport in Distributed Dataflow Applications

MPhil Thesis Defence


Title: "Zero Copy Transport in Distributed Dataflow Applications"

By

Mr. Bairen YI


Abstract

Dataflow is a common programming paradigm for processing data in a 
distributed fashion. When programming dataflow applications, a data 
processing task is expressed as a dataflow graph, with its vertices as 
specific operations and its edges as input/output relations or dataflow 
dependencies between operations. When deployed inside a datacenter in 
which a large number of processors are available, the dataflow graph is 
partitioned and placed onto different processors for improved processing 
throughput. For graph edges that cross the partition boundaries, data 
chunks need to be transferred between different processors. Increasingly 
higher data volume and greater processing power of individual processors 
often create a communication bottleneck on the inter-processor links, 
resulting in serious performance degradation for distributed dataflow 
applications.
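
As a minimal illustration of the partitioning described above, the sketch 
below finds the edges of a small dataflow graph that cross partition 
boundaries. The operation names and the placement are hypothetical and are 
not taken from the thesis.

```python
# Hypothetical dataflow graph: vertices are operations, edges are
# dataflow dependencies (producer, consumer).
edges = [
    ("read", "parse"),
    ("parse", "matmul"),
    ("matmul", "reduce"),
    ("parse", "reduce"),
]

# Hypothetical placement of each operation onto a processor id.
placement = {"read": 0, "parse": 0, "matmul": 1, "reduce": 1}


def cross_partition_edges(edges, placement):
    """Return the edges whose endpoints sit on different processors.

    Only these edges require data chunks to be transferred over the
    inter-processor links.
    """
    return [(src, dst) for src, dst in edges if placement[src] != placement[dst]]


cut = cross_partition_edges(edges, placement)
print(cut)
```

In this toy placement only the two edges leaving "parse" for processor 1 
require network transfers; the other edges stay local to one processor.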

In recent years, Remote Direct Memory Access (RDMA) has become widely 
deployed in datacenters as an alternative to the Transmission Control 
Protocol (TCP). RDMA offers ultra-low latency and CPU bypass networking to 
application programmers. Existing applications are often designed around a 
socket-based software stack that manages application buffers separately 
from networking buffers and copies memory between them when sending 
and receiving data. With large (up to hundreds of MB) application 
buffers, the cost of such copies adds non-trivial overhead to the 
end-to-end communication pipeline. In this work, we attempt to 
design a zero copy transport for distributed dataflow applications that 
unifies application and networking buffer management and completely 
eliminates unnecessary memory copies. Our prototype on top of TensorFlow 
shows a 2.43x performance improvement over a gRPC-based transport and a 
1.21x performance improvement over an alternative RDMA transport with 
private buffers and memory copies.
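
The difference between a transport with private buffers and a zero copy 
transport can be sketched in plain Python. This is only an analogy under 
stated assumptions: it uses no RDMA or TensorFlow API, and the function 
names are hypothetical. It shows a send path that copies the application 
buffer into a separate networking buffer next to one that hands out a view 
of the same memory.

```python
def send_with_copy(app_buffer: bytearray) -> bytearray:
    """Hypothetical socket-style path: stage data in a private buffer."""
    net_buffer = bytearray(len(app_buffer))
    net_buffer[:] = app_buffer  # one full memory copy per send
    return net_buffer


def send_zero_copy(app_buffer: bytearray) -> memoryview:
    """Hypothetical zero copy path: expose the application buffer itself."""
    return memoryview(app_buffer)  # no bytes are copied


tensor = bytearray(1024)          # stand-in for a large application buffer
copied = send_with_copy(tensor)   # independent copy of the data
shared = send_zero_copy(tensor)   # view onto the same memory

tensor[0] = 0xFF                  # mutate the application buffer afterwards

# The private copy no longer reflects the application buffer, while the
# view does, because it refers to the same underlying memory.
print(copied[0], shared[0])
```

The copying path costs time proportional to the buffer size on every 
transfer, which is the overhead the abstract attributes to buffers of up 
to hundreds of MB; the view-based path avoids that cost but requires the 
application and the transport to agree on who owns the buffer.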


Date:			Wednesday, 12 June 2019

Time:			4:00pm - 6:00pm

Venue:			Room 5501
 			Lifts 25/26

Committee Members:	Dr. Kai Chen (Supervisor)
 			Dr. Ke Yi (Chairperson)
 			Dr. Wei Wang


**** ALL are Welcome ****