Mining Order-Preserving Submatrices Under Uncertainty: A Possible World Approach

PhD Thesis Proposal Defence


Title: "Mining Order-Preserving Submatrices Under Uncertainty: A Possible World 
Approach"

by

Mr. Ji CHENG


Abstract:

Given a data matrix D, a submatrix S of D is an order-preserving submatrix 
(OPSM) if there is a permutation of the columns of S, under which the entry 
values of each row in S are strictly increasing. OPSM mining is widely used in 
real-life applications such as identifying coexpressed genes and finding 
customers with similar preferences. However, noise is ubiquitous in real data 
matrices due to variable experimental conditions and measurement errors, which 
makes conventional OPSM mining algorithms inapplicable. No previous work has 
addressed uncertain value intervals using the possible world semantics.
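
To make the OPSM definition above concrete, the following is a minimal 
sketch for the noise-free case only. The function names (supports, is_opsm) 
and the toy matrix D are illustrative assumptions and do not come from the 
thesis.

```python
from typing import Sequence


def supports(row: Sequence[float], permutation: Sequence[int]) -> bool:
    """Return True if the row's values strictly increase along the column order."""
    values = [row[c] for c in permutation]
    return all(a < b for a, b in zip(values, values[1:]))


def is_opsm(matrix: Sequence[Sequence[float]],
            rows: Sequence[int],
            permutation: Sequence[int]) -> bool:
    """Check whether the chosen rows and ordered columns form an OPSM."""
    return all(supports(matrix[r], permutation) for r in rows)


# Hypothetical 3 x 4 data matrix used only for illustration.
D = [
    [3.0, 1.0, 2.0, 9.0],
    [8.0, 4.0, 6.0, 0.5],
    [5.0, 7.0, 1.0, 2.0],
]

# Rows 0 and 1 both increase along the column order (1, 2, 0):
# 1 < 2 < 3 and 4 < 6 < 8. Row 2 gives 7, 1, 5 and does not.
print(is_opsm(D, rows=[0, 1], permutation=[1, 2, 0]))
print(is_opsm(D, rows=[0, 1, 2], permutation=[1, 2, 0]))
```

The check requires strict inequality, matching the "strictly increasing" 
condition in the definition.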

We establish two different definitions of significant OPSMs based on the 
possible world semantics: (1) expected support based and (2) probabilistic 
frequentness based. An optimized dynamic programming approach is proposed to 
compute the probability that a row supports a particular column permutation, 
and several effective pruning rules are introduced to efficiently prune 
insignificant OPSMs. These techniques are integrated into our two OPSM mining 
algorithms, based on prefix-projection and Apriori, respectively. Extensive 
experiments on real microarray data demonstrate that the OPSMs found by our 
algorithms are of much higher quality than those found by existing approaches.
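
As a rough illustration of the expected support idea, the sketch below 
estimates the probability that a row supports a column permutation and sums 
these probabilities over all rows. It is not the dynamic programming approach 
of the thesis: it substitutes plain Monte Carlo sampling, and it assumes that 
each entry is an interval (low, high) whose true value is independent and 
uniformly distributed over that interval. The function names and the toy 
matrix U are hypothetical.

```python
import random
from typing import Optional, Sequence, Tuple

Interval = Tuple[float, float]  # assumed (low, high) bounds of an uncertain entry


def support_probability(row: Sequence[Interval],
                        permutation: Sequence[int],
                        samples: int = 20000,
                        rng: Optional[random.Random] = None) -> float:
    """Monte Carlo estimate of P(row is strictly increasing under the permutation)."""
    rng = rng or random.Random(0)
    hits = 0
    for _ in range(samples):
        # Draw one possible world for this row (uniform model is an assumption).
        drawn = [rng.uniform(*row[c]) for c in permutation]
        if all(a < b for a, b in zip(drawn, drawn[1:])):
            hits += 1
    return hits / samples


def expected_support(matrix: Sequence[Sequence[Interval]],
                     permutation: Sequence[int],
                     samples: int = 20000,
                     seed: int = 0) -> float:
    """Sum of per-row support probabilities (expected number of supporting rows)."""
    rng = random.Random(seed)
    return sum(support_probability(row, permutation, samples, rng)
               for row in matrix)


# Hypothetical uncertain matrix: one interval per entry.
U = [
    [(0.0, 1.0), (2.0, 3.0), (4.0, 5.0)],  # always increasing
    [(4.0, 5.0), (2.0, 3.0), (0.0, 1.0)],  # never increasing
    [(0.0, 1.0), (0.0, 1.0), (2.0, 3.0)],  # increasing about half the time
]

print(expected_support(U, permutation=[0, 1, 2]))  # close to 1.5
```

Sampling is easy to write but converges slowly, which is one reason an exact 
method is attractive when many candidate permutations must be evaluated.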


Date:               Tuesday, 28 July 2020

Time:               2:00pm - 4:00pm

Zoom Meeting:       https://hkust.zoom.us/j/8426056228

Committee Members:  Dr. Wilfred Ng (Supervisor)
                    Prof. Ke Yi (Chairperson)
                    Prof. Dik-Lun Lee
                    Dr. Qiong Luo


**** ALL are Welcome ****