Communication-Efficient Algorithms for Tracking Distributed Data Streams

PhD Thesis Proposal Defence


Title: "Communication-Efficient Algorithms for Tracking Distributed Data Streams"

by

Mr. Qin ZHANG


Abstract:

We investigate several basic problems in the distributed streaming model. 
In the this model, we have k sites, each receiving a stream of elements 
over time. There is a designated coordinator who would like to track, that 
is, maintain continuously at all times, some function f of all the 
elements received from the k sites. There is a two-way communication 
channel between each site and the coordinator, and the goal is to track f 
with minimum communication. This model is motivated by applications in 
distributed databases, network monitoring and sensor networks.  In this 
thesis, we design algorithms to track some fundamental and useful 
functions, including random sampling, heavy hitters and quantiles. We also 
show that our algorithms have optimal communication costs in the worst 
case (up to some polylogarithmic factors in a few cases). In addition, we 
observed that for some problems considering the worst-case communication 
cost is meaningless, so we propose to use competitive analysis and give 
online tracking algorithms whose performance is competitive against the 
optimal offline algorithm that knows the entire stream in advance.

Although this thesis is primarily concerned with the theory of distributed 
streaming, we expect that our study will also have a significant impact on 
real-world distributed streaming systems.


Date:  			Wednesday, 16 December 2009

Time:                   10:00am - 12:00noon

Venue:                  Room 1504
 			lifts 25/26

Committee Members:      Prof. Mordecai Golin (Supervisor)
 			Dr. Ke Yi (Supervisor)
 			Dr. Sunil Arya (Chairperson)
 			Prof. Siu-Wing Cheng
 			Dr. Raymond Wong


**** ALL are Welcome ****