Analyzing Data with Regret Minimization Query

PhD Thesis Proposal Defence


Title: "Analyzing Data with Regret Minimization Query"

by

Miss Min XIE


Abstract:

Extracting interesting tuples from a large database is an important 
problem in multi-criteria decision making. Two representative queries were 
proposed in the literature: top-k queries and skyline queries. A top-k 
query requires users to specify their utility functions beforehand and 
then returns k tuples to the users. A skyline query does not require any 
utility function from users but it puts no control on the number of tuples 
returned to users. Recently, a regret minimization query was proposed and 
received attention from the community because it does not require any 
utility function from users and the output size is controllable, and thus 
it avoids those deficiencies of top-k queries and skyline queries. 
Specifically, it returns a small set of tuples such that any user's 
favorite tuple in this subset is guaranteed to be not much worse than 
his/her favorite tuple in the whole database. In a sense, this query finds 
a small set of tuples that makes the user happy (i.e., not regretful) even 
if s/he gets the best tuple in the selected set but not the best tuple 
among all tuples in the database.

In this paper, we study two variations of the regret minimization query: 
(1) the min-error variation: we want to minimize the regret level of each 
user while guaranteeing the output size is at most k where k is the 
maximum output size. This problem is also known as the k-regret query, and 
(2) the min-size variation: we want to determine the least tuples needed 
to keep users happy at a given level. We term this problem as the 
alpha-happiness query where alpha represents a happiness threshold. In 
both variations, we quantify the user's regret level (and happiness level) 
by a criterion, called the regret ratio (and the happiness ratio). A small 
regret ratio (i.e., a large happiness ratio) indicates the user is 
satisfied with the set returned.

As this is an NP-hard problem, we derive efficient approximate solutions 
for both variations. We performed extensive experiments using both real 
and synthetic datasets. Our evaluations show that our algorithms 
outperform the best-known previous approaches in two ways: (i) they answer 
the regret minimization queries by guaranteeing a smaller regret ratio and 
returning fewer tuples to users and, (ii) they do so much faster (up to 
two orders of magnitude times improvement).


Date:			Monday, 25 February 2019

Time:                  	12:30pm - 2:30pm

Venue:                  Room 4472
                         (lifts 25/26)

Committee Members:	Dr. Raymond Wong (Supervisor)
 			Prof. Shing-Chi Cheung (Chairperson)
 			Prof. Lei Chen
 			Dr. Wilfred Ng


**** ALL are Welcome ****