Accelerating Exact Similarity Search on GPUs: Strategies for Distance Calculation and K-selection

MPhil Thesis Defence


Title: "Accelerating Exact Similarity Search on GPUs: Strategies for Distance 
Calculation and K-selection"

By

Mr. Ding TANG


Abstract

Similarity search is a basic operation in database systems and is widely used 
in industrial applications to handle complex data such as images and user 
information, which are commonly represented as numerical feature vectors. This 
thesis studies how to better utilize GPUs for this task. We decompose 
similarity search into two phases, distance calculation and k-selection, and 
analyze the bottlenecks of each phase on GPUs and the corresponding solutions. 
For each phase, we explore several mainstream solutions and re-implement most 
of them efficiently. Additionally, we propose and implement several new 
optimizations to accelerate similarity search on GPUs: SMML-S and SMML-L, two 
matrix multiplication kernel designs, and BucketSelect-Opt, a k-selection 
method. We conduct extensive experiments in different settings to investigate 
the performance of both existing methods and our proposed ones. The results 
show that our proposed methods perform satisfactorily in their target domains. 
Furthermore, based on these experimental results, we provide guidelines on how 
to choose the right strategy for a given situation in each phase.
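
For readers unfamiliar with the two-phase decomposition mentioned in the 
abstract, the following is a minimal CPU-side sketch in plain Python. It is 
not the thesis's GPU implementation and does not reproduce SMML-S, SMML-L, or 
BucketSelect-Opt. The function names, the sample data, and the choice of 
squared Euclidean distance are all assumptions made for illustration only.

```python
import heapq
from typing import List, Sequence, Tuple


def squared_l2_distances(query: Sequence[float],
                         database: Sequence[Sequence[float]]) -> List[float]:
    """Phase 1 (distance calculation): distance from the query to every vector.

    Squared Euclidean distance is an assumed metric for this sketch.
    """
    return [sum((q - x) ** 2 for q, x in zip(query, vec)) for vec in database]


def k_select(distances: Sequence[float], k: int) -> List[Tuple[float, int]]:
    """Phase 2 (k-selection): the k smallest distances with their indices."""
    # heapq.nsmallest returns the k smallest (distance, index) pairs in
    # ascending order without fully sorting the list.
    return heapq.nsmallest(k, ((d, i) for i, d in enumerate(distances)))


def exact_knn(query: Sequence[float],
              database: Sequence[Sequence[float]],
              k: int) -> List[Tuple[float, int]]:
    """Exact k-nearest-neighbour search as the composition of the two phases."""
    return k_select(squared_l2_distances(query, database), k)


# Hypothetical toy data: four 2-dimensional feature vectors.
database = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [0.5, 0.0]]
neighbours = exact_knn([0.0, 0.0], database, k=2)
print(neighbours)
```

Keeping the two phases as separate functions mirrors the structure described 
in the abstract: each phase can be swapped for a different strategy without 
changing the other.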


Date:  			Thursday, 16 June 2022

Time:			10:00am - 12:00noon

Zoom Meeting:
https://hkust.zoom.us/j/94911108012?pwd=YStOa3JPTDRVcC9zWGNISWtlb3RaZz09

Committee Members:	Dr. Kai Chen (Supervisor)
 			Prof. Qiong Luo (Chairperson)
 			Dr. Yangqiu Song


**** ALL are Welcome ****