MPhil Thesis Defence


"Mining User Preference using Spy Voting for Search Engine
Personalization"

By

Mr. Lin Deng


Abstract

The World Wide Web (the Web) is serving an increasingly large and
diversified user community. The diversity of user interests makes it
difficult for a general Web search engine to meet the needs of an
individual user. This thesis addresses the problem of Web search engine
personalization. The main objectives of personalization are to understand
a user's preference and to provide search results that satisfy that
preference. We present a new approach that mines
users' preferences on the search results from clickthrough data and adapts
the search engine's ranking function to improve search quality.

Existing preference mining algorithms are typically based on strong
assumptions about how users scan the search results, so the preferences
they derive are often incorrect. In this thesis, we develop a new preference
mining technique called SpyNB, which is based on a more reasonable
assumption: the search results clicked by a user reflect the user's
preference, but no conclusion is drawn about the results that the user
did not click. As such, SpyNB remains valid even if the user does not
follow any order in reading the search results or has not clicked on all
relevant results.

We develop a spying process to infer negative examples. We first treat
the result items clicked by the user as sure positive examples and those
not clicked as unlabelled data. We then plant the sure positive examples
(the spies) into the unlabelled set of result items and apply Naive Bayes
classification to identify reliable negative examples (thus the name
"SpyNB"). These positive and negative examples allow us to discover highly
accurate user preferences. Finally, we employ a ranking SVM to build a
metasearch engine optimizer, which gradually adapts our metasearch engine
to the user's preference.
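
The spying step can be illustrated with a minimal sketch. It is not the
thesis implementation: it assumes each result is a list of tokens, that each
clicked result takes one turn as the spy, and that an unclicked result is
kept as a reliable negative when at least a fraction vote_threshold of the
spies score above it. The function names, the voting rule and the threshold
are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_nb(pos_docs, neg_docs):
    """Fit a two-class multinomial Naive Bayes model with add-one smoothing."""
    vocab = {tok for doc in pos_docs + neg_docs for tok in doc}
    total_docs = len(pos_docs) + len(neg_docs)
    model = {}
    for label, docs in (("pos", pos_docs), ("neg", neg_docs)):
        counts = Counter(tok for doc in docs for tok in doc)
        denom = sum(counts.values()) + len(vocab)
        model[label] = (
            math.log(len(docs) / total_docs),
            {tok: math.log((counts[tok] + 1) / denom) for tok in vocab},
        )
    return model


def pos_posterior(model, doc):
    """Return P(pos | doc); tokens outside the training vocabulary are ignored."""
    score = {}
    for label, (log_prior, log_like) in model.items():
        score[label] = log_prior + sum(log_like.get(tok, 0.0) for tok in doc)
    top = max(score.values())  # subtract the max for numerical stability
    exp_pos = math.exp(score["pos"] - top)
    exp_neg = math.exp(score["neg"] - top)
    return exp_pos / (exp_pos + exp_neg)


def spy_nb(clicked, unclicked, vote_threshold=0.5):
    """Return indices of unclicked results judged to be reliable negatives.

    Hypothetical voting rule (an assumption, not taken from the abstract):
    each clicked result is planted in turn among the unclicked ones, and an
    unclicked result receives a vote whenever it scores below that spy.
    """
    if len(clicked) < 2:
        raise ValueError("need at least two clicked results to plant a spy")
    votes = [0] * len(unclicked)
    for i, spy in enumerate(clicked):
        positives = clicked[:i] + clicked[i + 1:]
        # The spy is hidden in the unlabelled set, which is treated as negative.
        model = train_nb(positives, unclicked + [spy])
        spy_score = pos_posterior(model, spy)
        for j, doc in enumerate(unclicked):
            if pos_posterior(model, doc) < spy_score:
                votes[j] += 1
    needed = vote_threshold * len(clicked)
    return [j for j, v in enumerate(votes) if v >= needed]


clicked = [["python", "tutorial"], ["python", "guide"],
           ["python", "tutorial", "guide"]]
unclicked = [["snake", "zoo"], ["snake", "pet"], ["python", "tutorial"]]
print(spy_nb(clicked, unclicked))
```

In this toy input, the unclicked result that resembles the clicked ones
collects too few votes and stays unlabelled, which matches the idea that an
unclicked result is not automatically a negative example.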

To verify the effectiveness of SpyNB for preference mining, we conduct
both offline and online experiments. Our extensive offline experiments
demonstrate that SpyNB discovers much more accurate preferences than
existing algorithms. Moreover, the adaptive ranking function derived from
SpyNB improves retrieval quality by 20% compared to the case without
learning. The interactive online experiments further confirm that SpyNB
and our personalization approach are effective in practice. We also show
that the efficiency of SpyNB is comparable to that of existing simple
preference mining algorithms.


Date:				Wednesday, 18 January 2006

Time:				2:30 p.m. - 4:30 p.m.

Venue:				Room 4480
				Lifts 25-26

Committee Members:		Prof. Dik-Lun Lee (Supervisor)
				Dr. Wilfred Ng (Supervisor)
				Dr. Nevin Zhang (Chairperson)
				Dr. Lei Chen


**** ALL are Welcome ****