Visual Analysis of Relational Patterns in Multidimensional Data

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Visual Analysis of Relational Patterns in Multidimensional Data"

By

Mr. Nan CAO


Abstract

Multidimensional data are commonly used to represent both structured and 
unstructured information. Understanding the innate relations among 
different dimensions and data items is one of the most important tasks for 
multidimensional data analysis. However, relational data patterns such as 
correlations, co-occurrences, and many semantic relations such as 
causality, topics and clusters are usually difficult for users to detect 
because the data are often heterogeneous in nature and large in volume, 
and contain various statistical features. Although many fundamental data 
analysis techniques such as clustering and correlation analysis have been 
widely used in various application domains, it is still difficult for 
users to understand, interpret, compare, and evaluate analysis results 
given the lack of context information. Information visualization can be of 
great value for multidimensional data analysis as it can represent the 
data in intuitive ways with rich context over multiple dimensions and also 
support explorative visual analysis that keeps humans in the loop.

In this thesis, we introduce advanced visual analysis techniques for 
uncovering relational patterns in complicated multidimensional datasets, 
including structured multivariate data, unstructured text documents, 
and heterogeneous datasets like social media data that contain both 
structured and unstructured information. Multiple visualizations are 
designed for these three data types to represent relational patterns 
within the same or across different information facets. First, for 
multivariate data, we introduce DICON, an icon-based cluster 
visualization that embeds statistical information into a multi-attribute 
display to facilitate cluster interpretation, evaluation, and comparison. 
Then, for unstructured documents, we design a set of visual analysis 
systems, ContexTour, FacetAtlas, and Solarmap, for topic analysis based on 
our proposed multifaceted entity relational data model. These systems 
respectively represent the multifaceted topic patterns among named 
entities, the multi-relational patterns within topics inside the same 
information facet, and the semantic relational patterns within topics 
across different information facets. Finally, for heterogeneous data such 
as Twitter datasets, we introduce Whisper for visualizing dynamic 
relationships between users in the context of the information diffusion 
processes of a given event. These relations contain information from 
three key aspects: temporal trend, social-spatial extent, and community 
response of a topic of interest.

To the best of our knowledge, the above techniques are cutting-edge 
studies on the visual analysis of relational patterns in structured, 
unstructured, and heterogeneous multidimensional datasets. To show the 
power and usefulness of our study, all the proposed visual analysis 
systems and corresponding techniques have been applied to real datasets 
and formally evaluated by domain experts or general users.


Date:			Wednesday, 22 August 2012

Time:			3:00pm – 5:00pm

Venue:			Room 3501
 			Lifts 25/26

Chairman:		Prof. Jang-Kyo Kim (MECH)

Committee Members:	Prof. Huamin Qu (Supervisor)
 			Prof. Long Quan
 			Prof. Qiang Yang
 			Prof. Weichuan Yu (ECE)
 			Prof. Klaus Mueller (Comp. Sci., Stony Brook Univ.)


**** ALL are Welcome ****