Frequent Item Finding When Obtaining Support is Costly

MPhil Thesis Defence


Title: "Frequent Item Finding When Obtaining Support is Costly"

By

Mr. Wing Ho LIN


Abstract

Suppose that there are n users and m items, and that the preference of each 
user for the items is revealed only upon probing, which takes time and is 
therefore costly. How can we quickly discover all the frequent items that are 
favored individually by at least a given number of users? This new problem not 
only has strong connections with several well-known problems, including the 
frequent item mining problem and the bichromatic reverse nearest neighbor 
problem, it also finds applications in fields as diverse as sponsored search, 
marketing surveys, and crowdsourcing. Although this problem can be settled 
naively by probing the preferences of all n users, the number of users is 
typically enormous, and probing the preference of a user itself can also incur 
a prohibitive cost. Consequently, a lack of a more efficient algorithm would 
severely limit the practicality of the problem.

To overcome the deficiency above, we present a sampling algorithm that 
drastically reduces the number of users needed to probe to O(log 
m)---regardless of the number of users---as long as slight inaccuracy in the 
output is permitted. For reasonably sized input, our algorithm needs to probe 
only 0.5% of the users, whereas the naive approach needs to probe all of them. 
The experimental results also fully agree with our theoretical analysis: our 
algorithm is faster than both the adapted algorithms and the naive approach by 
up to three orders of magnitude.


Date:			Friday, 4 August 2017

Time:			10:00am - 12:00noon

Venue:			Room 2611
 			Lifts 31/32

Committee Members:	Dr. Raymond Wong (Supervisor)
 			Prof. Dit-Yan Yeung (Chairperson)
 			Prof. Cunsheng Ding


**** ALL are Welcome ****