A Survey on Modeling Word Burstiness

PhD Qualifying Examination


Title: "A Survey on Modeling Word Burstiness"

by

Mr. Di JIANG


Abstract:

Multinomial distributions are often used to model text documents. However,
they do not capture well the phenomenon that words in a document tend to
appear in bursts: if a word appears once, it is more likely to appear
again. This failure to capture burstiness hinders the wide application of
conventional models in information retrieval and natural language
processing. We recognize that a critical review of existing models is
needed in order to design and develop better paradigms that are able to
address the diverse challenges that arise in burstiness-aware
document modeling. Within a unifying set of notations and terminologies,
we describe in this paper the efforts and main techniques for modeling word
burstiness and present a comprehensive survey of a number of the
state-of-the-art approaches. We classify the burstiness models into three
major categories based on the techniques they adopt in order to provide
insights into how and why the techniques are effective. We also discuss
several real-world applications in which the burstiness-aware models
demonstrate superior performance compared to the multinomial
distributions.


Date:                   Thursday, 2 August 2012

Time:                   2:00pm - 4:00pm

Venue:                  Room 3501
                         lifts 25/26

Committee Members:      Dr. Wilfred Ng (Supervisor)
                        Prof. Shing-Chi Cheung (Chairperson)
                        Prof. Dik-Lun Lee
                        Dr. Raymond Wong


**** ALL are Welcome ****