PhD Thesis Proposal Defence

"Web Query Classification"

by Mr. Dou Shen

Abstract:

Web query classification (QC) aims to classify Web users' queries, which are often short and ambiguous, into a set of target categories. QC has many applications, including improving page ranking in Web search, targeted advertisement in response to search queries, and personalization. However, the sparsity and ambiguity of Web queries render QC fundamentally different from traditional document classification.

In this proposal, we present two solutions that solve the QC problem effectively, and a third potential solution. In the first solution, we enrich both Web queries and target categories through Web search engines and existing taxonomies, and on that basis develop two kinds of classifiers as well as an ensemble classifier. The second solution improves on the first by introducing a bridging classifier, which uses existing taxonomies to bridge Web queries and target categories. The third solution, which is applicable to the general data sparsity problem in text mining, calculates term relationships from an existing text corpus and then augments Naive Bayes classifiers with the discovered term relationships.

Preliminary results on an open benchmark dataset from KDD Cup 2005 validate the effectiveness of the first two solutions. In the future, we hope to study the first two solutions further and to test the third solution empirically.

Date: Monday, 12 February 2007
Time: 10:00 a.m. - 12:00 noon
Venue: Room 3501, lifts 25-26

Committee Members:
Dr. Qiang Yang (Supervisor)
Dr. Brian Mak (Chairperson)
Dr. Lei Chen
Dr. James Kwok

**** ALL are Welcome ****