Accelerating Site Frequency Spectrum Estimation with Graphics Processors

MPhil Thesis Defence


Title: "Accelerating Site Frequency Spectrum Estimation with Graphics Processors"

By

Mr. Jiuxin Zhao


Abstract

Estimating the Site Frequency Spectrum (SFS) from gene sequences is an 
important task in population genetic analysis. The SFS is a statistical 
summary that describes the distribution of minor allele frequency (MAF) at 
a set of gene sites from a group of individuals. Because MAF computation 
is so intensive, current SFS estimation is limited to a population size 
of a few hundred individuals. To scale up MAF computation for a 
larger population, we explore the use of graphics processors, or GPUs, as 
a hardware accelerator. Specifically, we develop a software package named 
GAMA (GPU-Accelerated Minor Allele Frequency computation). In GAMA, we 
design a new MAF computation algorithm, which has a lower time complexity 
than the state-of-the-art algorithm, and is suitable for parallelization 
on the GPU. We also use the local memory and warp synchronization 
mechanism on the GPU to further improve performance. Finally, we adopt 
a logarithm transformation to avoid floating-point underflow in the 
computation. With GAMA, we are able to compute MAF for up to a 
thousand individuals for the first time. On a server equipped with an 
NVIDIA Tesla C2070 GPU and an Intel Xeon E5520 2.27 GHz CPU, GAMA achieves 
a speedup of 47 times in MAF computation time over realSFS, an advanced, 
single-threaded, CPU-based SFS estimation program, and is 3.5 times 
faster than our optimized, 16-thread parallel implementation on the CPU.
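
The logarithm transformation mentioned above can be illustrated with a 
small sketch. This is not GAMA's code, and the abstract does not say how 
the transformation is applied inside GAMA; the likelihood values and the 
helper name log_sum_exp below are hypothetical, chosen only to show why 
multiplying many small probabilities underflows in double precision while 
the same computation on logarithms does not.

```python
import math

def log_sum_exp(log_values):
    """Return log(sum(exp(v))) for values that are stored as logarithms."""
    m = max(log_values)
    if m == float("-inf"):
        return m  # every term is log(0), so the sum is also log(0)
    # Factor out the largest term so the remaining exponentials are <= 1.
    return m + math.log(sum(math.exp(v - m) for v in log_values))

# Hypothetical per-site likelihoods, picked only to trigger underflow.
likelihoods = [1e-5] * 100

# Direct multiplication: the true value 1e-500 is below the smallest
# representable double, so the running product collapses to 0.0.
naive_product = 1.0
for p in likelihoods:
    naive_product *= p

# Log space: the product becomes a sum of logarithms and stays finite.
log_product = sum(math.log(p) for p in likelihoods)

# Adding two probabilities that are both stored as logarithms.
log_total = log_sum_exp([log_product, log_product])
```

In this sketch naive_product ends up as exactly 0.0, whereas log_product 
remains an ordinary finite number that later steps can keep using.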


Date:			Monday, 21 May 2012

Time:			2:00pm – 4:00pm

Venue:			Room 1504
 			Lifts 25/26

Committee Members:	Dr. Qiong Luo (Supervisor)
 			Dr. Raymond Wong (Chairperson)
 			Dr. Weichuan Yu (ECE)


**** ALL are Welcome ****