Multidimensional Partitioning for Histogram Using Reinforcement Learning

MPhil Thesis Defence


Title: "Multidimensional Partitioning for Histogram Using Reinforcement 
Learning"

By

Mr. Weizhen DING


Abstract

Histograms, as a crucial tool in database management and big data 
analytics, are used for selectivity estimation to get approximate query 
results efficiently. Minskew is a state-of-the-art multi-dimensional 
histogram algorithm that offers decent accuracy in selectivity estimation, 
but it has difficulty in achieving good performance for queries of 
different sizes. To solve this problem, we propose Grid Adaptive Minskew 
and Reinforcement Learning (RL) Histogram. While Minskew generates a set 
of global grid cells that are used for splitting at all levels, Grid 
Adaptive Minskew generates grid cells for each bucket independently, 
resulting in histograms with buckets of more flexible sizes and improved 
accuracy for various selectivities. RL Histogram shares the same data 
structure with Grid Adaptive Minskew. It first learns the knowledge from 
Grid Adaptive Minskew and then explores histograms with policy gradient 
learning algorithm. The experiments demonstrate that Grid Adaptive Minskew 
outperforms Minskew in all aspects, and RL Histogram can further improve 
the performance.


Date:  			Monday, 30 January 2023

Time:			3:00pm - 5:00pm

Venue:			Room 5501
 			lifts 25/26

Committee Members:	Prof. Dimitris Papadias (Supervisor)
 			Prof. Xiaofang Zhou (Chairperson)
 			Prof. Nevin Zhang


**** ALL are Welcome ****