Data Management for High Performance Computing

PhD Qualifying Examination


Title: "Data Management for High Performance Computing"

Mr. Mian LU


Abstract:

High Performance Computing (HPC) applications usually involve intensive 
computing over large amounts of data. These applications seldom adopt 
off-the-shelf database systems due to their custom data models, complex 
computation methods, and high performance requirements. Although HPC
platforms are mostly supercomputers, recent commodity hardware, e.g.,
graphics processors, has also become a promising alternative.

This survey presents data management issues in HPC, in particular for
scientific computing applications. First, we give an overview of storage
and file systems, I/O access patterns, data types, data structures, data
parallel primitives, and methods for data transfer, distribution, and
replication. Then, we study three major I/O optimization techniques for
these access patterns: data sieving, collective I/O, and prefetching.
Next, we discuss scatter, gather, prefix scan, and sort, four data
parallel primitives commonly used in HPC, and study their implementations
on commodity hardware. Finally, we outline a few potential research
directions in data management for HPC.


Date:                   Thursday, 15 January 2009

Time:                   10:30 a.m. - 12:30 p.m.

Venue:                  Room 3501
                        Lifts 25-26

Committee Members:      Dr. Qiong Luo (Supervisor)
                        Prof. Frederick Lochovsky (Chairperson)
                        Prof. Dik-Lun Lee
                        Prof. Lionel Ni


**** ALL are Welcome ****