Visual Summarization and Exploration of Text Streams

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Visual Summarization and Exploration of Text Streams"

By

Mr. Weiwei Cui


Abstract

We are in the midst of a data explosion. Text data, such as digitized textual 
documents and content from new social media like blogs and Twitter, is being 
generated at an unprecedented rate. For example, Google Books has scanned and 
digitized 15 million books, greatly increasing the accessibility of 
information all around the world. Twitter publishes more than 300 new 
messages every second, and the numbers keep increasing. However, exploring and 
analyzing this enormous amount of data becomes increasingly difficult. 
Information visualization can help analyze huge and complex data by turning 
it into visual representations that exploit the tremendous pattern-recognition 
capability of the human visual system.

In this thesis, we propose three advanced text visualization techniques for 
summarizing and exploring various relation patterns existing in large 
time-varying text document collections. This thesis is composed of three main 
parts, each of which addresses an important problem in text visualization. In 
the first part, we present an enhanced word cloud layout that keeps the 
semantic relations between the displayed words in a sequence of word clouds 
generated over time for dynamic document data. In the second part, TextWheel is 
introduced to visualize complex micro-macro relations within news streams. In 
the last part, we deal with the splitting/merging patterns between topics that 
are extracted from text streams. We propose TextFlow, which is inspired by 
river flows, to show various topic evolution patterns at different 
granularities. The effectiveness of these methods has been demonstrated through 
extensive experiments using both synthetic data and data from real 
applications.


Date:			Wednesday, 3 August 2011

Time:			4:00pm – 6:00pm

Venue:			Room 5507
 			Lifts 25/26

Chairman:		Prof. Limin Zhang (CIVL)

Committee Members:	Prof. Huamin Qu (Supervisor)
 			Prof. Long Quan
 			Prof. Chiew-Lan Tai
 			Prof. Kai Tang (MECH)
 			Prof. Han-Wei Shen (Ohio State Univ.)


**** ALL are Welcome ****