Assistant Professor
Department of Computer Science and Engineering (CSE)
The Hong Kong University of Science and Technology (HKUST)
Office:
Rm 3542 (via Lift No. 25-26)
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
Clear Water Bay, Kowloon, Hong Kong
He is now looking for postgraduate students interested in the research areas of
data mining, databases and privacy. If you are interested, please contact him
with your resume, including your academic background and your score in each
subject of your undergraduate study.
Raymond Chi-Wing Wong is an Assistant Professor in the Department of Computer
Science and Engineering (CSE) at The Hong Kong University of Science and
Technology (HKUST). He received the BSc, MPhil and PhD degrees in Computer
Science and Engineering from the Chinese University of Hong Kong (CUHK) in
2002, 2004 and 2008, respectively. In 2004-2005, he worked as a research and
development assistant on an R&D project funded by ITF and a local industrial
company called Lifewood.
He has received 19 awards. Within 6 years, he published
24 conference papers (e.g., in SIGKDD, VLDB, ICDE and ICDM) and 11
journal/chapter papers (e.g., in TODS, DAMI, TKDE and VLDB Journal). He has reviewed
papers for conferences and journals related to data mining and databases,
including the VLDB conference, SIGMOD, VLDB Journal, TKDE, TKDD, ICDE, SIGKDD, ICDM, DAMI, DaWaK, PAKDD,
EDBT and IJDWM. He is a program committee member of conferences including
VLDB, CIKM, DASFAA, SDM and APWeb-WAIM, and a referee for journals including
TODS, TKDE, TKDD, DAMI and
KAIS. He has also given presentations at international conferences such as
VLDB09, VLDB08, VLDB07, SIGKDD07, SIGKDD06, ICDM05, SDM05, PAKDD04 and ICDM03.
His research interests include databases, data
mining, security, data mining for business applications, customer-centric
data analysis, data warehouses, decision making, data streams, video compression and computer music. The research topics
he has worked on include the following.
Privacy Preservation
(published in ICDE10, TODS09, ICDE09, VLDB08, TKDE08, VLDB07, SIGKDD06, ICDM05 and DaWaK06)
Skyline in Spatial Database
(published in VLDB09, EDBT09, VLDB08, TKDE09 and SIGKDD07)
Nearest Neighbor
(published in VLDB09 and VLDB07)
Customer-Centric Data Mining
(published in ICDE10, VLDB09, TODS09, EDBT09, ICDE09, VLDB08, SIGKDD07, DAMI06, SDM05, ICDM05, DAMI05, PAKDD04 and ICDM03)
Data Warehouse
(published in VLDB08, TKDE09 and SIGKDD07)
Data Stream
(published in DAMI06 and SDM05)
Association Rule Mining
(published in ICDM03, DAMI05 and PAKDD04)
Frequent Pattern Mining
(published in ICDM03, DAMI05 and PAKDD04)
Social Network Mining
(published in PAKDD04)
Clustering/Outlier
(published in TKDE05)
Time Series
(published in VLDB Journal)
Wireless Network
(published in INFOCOM08)
Rate Control over Video Streams
(published in ICIP07)
Entropy Coding
(published in ICASSP07)
He has also worked at the following universities and research centers.
Visiting Scholar at University of Waterloo
(Jun 2008-Jul 2008)
To conduct research with Prof. Tamer Ozsu at the University of Waterloo
Summer Intern at IBM T.J. Watson Research
Center (Aug 2007-Sept 2007)
To conduct research with Prof. Philip S. Yu at IBM T.J. Watson
Research Center in New York
Research Visitor at SFU (Simon Fraser
University) (May 2006-Aug 2006)
To conduct research with two professors, Prof. Jian Pei and Prof. Ke Wang,
at Simon Fraser University in Canada
Research Assistant at HKU ETI (E-Business
Technology Institute) (May 2005-July 2005)
To conduct research on "Drill Process Monitoring System - Drill Depth
Calculation"
Research and Development Assistant at CUHK CSE (Computer
Science and Engineering) (Aug 2004-April 2005)
To do R&D work on data mining with a local company called Lifewood in
industry-funded projects
Summer Research Assistant at HKU ETI (E-Business
Technology Institute) (July-Aug 2001)
To do a research project on e-business and computer security
During his master's and doctoral studies at the Chinese University of
Hong Kong, he served as a teaching assistant (TA) for undergraduate
courses such as "Introduction to Database Systems" and postgraduate courses
such as "Advanced Topics in Database Systems". His course evaluation score
was 5.57/6.0. He also received the Excellent TA award of CSE for two
courses, "Introduction to Database Systems" and "Advanced Topics in Database
Systems".
His teaching experience is as follows.
University Course Tutor at CUHK CSE (Computer Science and
Engineering) (Aug 2002-July 2004, Aug 2005-July 2007)
To assist in teaching university courses
Introduction to Database Systems (Basic Concept in Database) (2002 Fall,
2003 Fall, 2005 Fall and 2006 Fall)
Advanced Topics in Database Systems (Advanced Topics in Database) (2003
Spring, 2004 Spring, 2006 Spring and 2007 Spring) (Course Evaluation =
5.57/6.0)
Summer Teaching Assistant and Mentor at CUHK CSE ITTC (Computer
Science and Engineering Department's IT Training Centre) (July-Aug 2002)
To help primary and secondary students learn IT skills in two
courses:
Multimedia (Dreamweaver, Flash, CorelDRAW, Sound Forge and PowerPoint)
Computer Music (Cakewalk, Noteworthy, Band-in-a-box, Finale, AmazingMIDI,
ENCORE and PrintMusic)
Violin Tutor (2000-2001)
To teach children basic violin skills
Secondary Private Tutor (1997-1999)
To help secondary students prepare for the HKCEE exams in Mathematics,
Physics and Chemistry.
Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Ke Wang and Jian Pei.
"Anonymization-based
Attacks in Privacy Preserving Data Publishing",
ACM Transactions on Database Systems (TODS), Volume 34, Issue 2, Jun., 2009 (pdf)
Raymond Chi-Wing Wong, Jian Pei, Ada
Wai-Chee Fu, and Ke Wang
"Online Skyline Analysis with Dynamic Preferences on Nominal
Attributes".
IEEE Transactions on Knowledge and Data Engineering (TKDE), pp1-15, Vol. 21,
No. 1, Jan., 2009
(pdf)(code)
Raymond Chi-Wing Wong, Jiuyong Li, Ada Wai-Chee Fu and Ke Wang,
"(alpha, k)-Anonymous Data Publishing",
Journal of Intelligent Information Systems, pp209-234, Vol. 33, No. 2, Oct.,
2009 (pdf)
Jiuyong Li, Raymond Chi-Wing Wong, Ada
Wai-Chee Fu, and Jian Pei.
"Anonymisation by Local Recoding in Data with Attribute Hierarchical
Taxonomies".
IEEE Transactions on Knowledge and Data Engineering (TKDE), pp1181-1194,
Vol. 20, No. 9, Sept., 2008 (pdf)
Yingyi Bu, Raymond Chi-Wing Wong, Ada Wai-Chee Fu,
"Query by Humming",
Encyclopedia of Database Systems (accepted)
Ada Wai-Chee Fu, Eamonn Keogh, Leo Yung-Hang Lau, Chotirat Ann
Ratanamahatana, Raymond Chi-Wing Wong, "Scaling and Time Warping in Time
Series Querying",
VLDB Journal 17(4): 899-921 (2008)
(pdf)
Raymond Chi-Wing Wong and Ada Wai-Chee Fu,
"Mining Top-K Frequent Itemset
from Data Streams",
Journal of Data Mining and Knowledge Discovery,
Volume 13, Number 2, 2006, pp193-217 (DOI: 10.1007/s10618-006-0042-x)
(pdf)(code)(data)
Raymond Chi-Wing Wong, Ada Wai-Chee Fu and Ke Wang,
"Data Mining for
Inventory Item Selection with Cross-Selling Considerations",
Journal of
Data Mining and Knowledge Discovery, Volume 11, 2005, pp81-112
(pdf)(code)
Sze-Chung Ngan, Tsang Lam, Raymond Chi-Wing Wong and Ada Wai-Chee Fu,
"Mining N-most Interesting Itemsets without support threshold by the COFI-tree",
International Journal of Business Intelligence and Data Mining, Vol. 1,
No. 1, pp.88-106, 2005
(pdf)
Eric Ka Ka Ng, Ada Wai-Chee Fu and
Raymond Chi-Wing Wong,
"Projective Clustering by Histograms",
IEEE
Transactions on Knowledge and Data Engineering (TKDE), pp 369-383, Vol. 17, No.
3, March, 2005
(pdf)
Raymond Chi-Wing Wong and Ada Wai-Chee Fu,
"Association Rule Mining
and Its Applications to MPIS",
Chapter in Encyclopedia of Data Warehousing
and Mining, Information Science Publishing (an imprint of Idea Group
Inc.) in the Spring of 2005.
(pdf)
Conference
Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Jia Liu, Ke Wang and Yabo Xu
"Global Privacy Guarantee in Serial Data Publishing",
the 26th International Conference on Data Engineering (ICDE), Long Beach,
California on 1-6 March, 2010
Raymond Chi-Wing Wong, M. Tamer Ozsu, Philip S. Yu, Ada Wai-Chee Fu and
Lian Liu
"Efficient Method for Maximizing Bichromatic Reverse Nearest Neighbor",
the 35th International
Conference on Very Large Data Bases (VLDB'09), Lyon, France on 24-28 Aug, 2009
(Acceptance 97/562 = 17.26%) (pdf)(ppt) (code)
Qian Wan, Raymond Chi-Wing Wong, Ihab F. Ilyas, M. Tamer Ozsu and Yu Peng
"Creating Competitive Products",
the 35th International
Conference on Very Large Data Bases (VLDB'09), Lyon, France on 24-28 Aug, 2009
(Acceptance 97/562 = 17.26%)
(pdf)(pptx)
Xiaobing Wu, Yufei Tao, Raymond Chi-Wing Wong, Ling Ding and Jeffrey Xu Yu
"Finding the Influence Set through Skylines",
the 12th International Conference on Extending Database Technology (EDBT),
Saint-Petersburg, Russia on 23-26 March, 2009
(Acceptance 92/283 = 32.51%) (pdf)
Yabo Xu, Ke Wang, Ada Wai-Chee Fu, and Raymond Chi-Wing Wong
"FF-Anonymity: When Quasi-Identifiers Are Missing",
the 25th International Conference on Data Engineering (ICDE), Shanghai, China on 29
Mar-4 Apr, 2009
(Acceptance 147/554 = 26.53%)
(pdf)
Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Jian Pei, Yip Sing Ho, Tai Wong
and Yubao Liu
"Efficient Skyline Querying with Variable User Preferences on Nominal
Attributes",
the 34th International
Conference on Very Large Data Bases (VLDB'08), Auckland, New Zealand on 24-30
Aug, 2008
(Acceptance 46/273 = 16.8%) (pdf)(ppt)(code)
Yingyi Bu, Ada Wai-Chee Fu, Raymond Chi-Wing Wong, Lei Chen and Jiuyong
Li
"Privacy Preserving Serial Data Publishing By Role Composition",
the 34th International
Conference on Very Large Data Bases (VLDB'08), Auckland, New Zealand on 24-30
Aug, 2008
(Acceptance 46/273 = 16.8%) (pdf)(ppt)(code)(source data link)
Hong-Ning Dai, Kam-Wing Ng, Raymond Chi-Wing Wong and Min-You Wu,
"On the Capacity of Multi-Channel Wireless Networks Using Directional Antennas",
The 27th IEEE International Conference on Computer Communications (INFOCOM
2008), Phoenix, Arizona on April 13-18, 2008
(Acceptance 236/1152 = 20.5%)
(pdf)
Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Ke Wang and Jian Pei.
"Minimality
Attack in Privacy Preserving Data Publishing",
the 33rd International
Conference on Very Large Data Bases (VLDB'07), Vienna, Austria on 23-28 Sept,
2007
(Acceptance 46/263 = 17.5%) (pdf)(ppt)(code)(data)
Raymond Chi-Wing Wong, Yufei Tao, Ada Wai-Chee Fu and Xiaokui Xiao.
"On
Efficient Spatial Matching",
the 33rd International Conference on Very Large
Data Bases (VLDB'07), Vienna, Austria on 23-28 Sept, 2007
(Acceptance 46/263 = 17.5%)
(pdf)(ppt)(code)
Raymond Chi-Wing Wong, Jian Pei, Ada Wai-Chee Fu and Ke Wang,
"Mining
Favorable Facets",
the Thirteenth ACM SIGKDD international conference
on knowledge discovery and data mining (KDD), San Jose, California, USA on 12-15
Aug, 2007
(Acceptance 92/513 = 17.93%)
(pdf)(ppt)(code)
Chi-Wah Wong, Oscar C. Au, Raymond Chi-Wing Wong, Hong-Kwai Lam,
"Linear
Real-time Rate Control",
HKIE Transactions, Vol. 14, Issue 1, Mar 2007
Chi-Wah Wong, Oscar C. Au, Raymond Chi-Wing Wong,
"Advanced Real-time
Rate Control in H.264",
the International Conference on Image Processing (ICIP),
San Antonio, Texas on Sept 16-19 2007
Raymond Chi-Wing Wong, Yubao Liu, Jian Yin, Zhilan Huang, Ada Wai-Chee Fu
and Jian Pei,
"(alpha, k)-anonymity Based Privacy Preservation by Lossy join",
the 8th International Conference on Web-Age Information Management, Huangshan
(Yellow Mountains), China on June 16-18 2007
(Acceptance 49/554 = 8.84%)
(pdf)
Chi-Wah Wong, Oscar C. Au, Raymond Chi-Wing Wong,
"Advanced
Macro-block Entropy Coding in H.264",
ICASSP 2007, Honolulu, Hawaii on April
15-20, 2007
Jiuyong Li, Raymond Chi-Wing Wong, Ada Wai-Chee Fu and Jian Pei,
"Achieving k-Anonymity by Clustering in Attribute Hierarchical Structures",
the 8th International Conference on Data Warehousing and Knowledge Discovery (DaWaK),
Krakow, Poland on 4-8 Sept, 2006
(Acceptance 52/145 = 35.9%)
(Selected as one of the 6 best papers to appear in the special
issue of International Journal of Data Warehousing and Mining)
(pdf)(code)(data)
Raymond Chi-Wing Wong, Jiuyong Li, Ada Wai-Chee Fu and Ke Wang,
"(alpha, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy-Preserving Data
Publishing",
the twelfth ACM SIGKDD international conference on knowledge
discovery and data mining (KDD), Philadelphia, USA on 20-23 Aug, 2006
(Acceptance 105/457 = 23%)
(pdf)(code)(data)
Ada Wai-Chee Fu,
Raymond Chi-Wing Wong and Ke Wang,
"Privacy-Preserving Frequent Pattern
Mining Across Private Databases",
the 2005 IEEE International Conference on
Data Mining (ICDM), Houston, Texas on November 27-30, 2005
(Acceptance 141/630 = 22.38%)
(pdf)
Raymond Chi-Wing Wong and Ada Wai-Chee Fu,
"Mining Top-K Itemsets over
a Sliding Window Based on Zipfian Distribution",
SIAM International
Conference on Data Mining, on April 21-23, 2005
(Acceptance 79/218 = 36.24%)
(pdf)(code)
Chi-Wah Wong, Oscar C. Au, Raymond Chi-Wing Wong and Hong-Kwai Lam,
"Piecewise
Linear Model for Real-Time Rate Control",
2005 IEEE International Conference
on Acoustics, Speech, and Signal Processing, Philadelphia, PA, USA, on March
19-23, 2005
Chi-Wah Wong, Oscar C. Au, Raymond Chi-Wing Wong and Hong-Kwai Lam,
"Real-Time
Rate Control Via Variable Frame Rate and Quantization Parameters",
Advances
in Multimedia Information Processing - PCM 2004: 5th Pacific Rim Conference on
Multimedia, Tokyo, Japan, on November 30 - December 3, 2004
Raymond Chi-Wing Wong and Ada Wai-Chee Fu,
"ISM: Item Selection for
Marketing with Cross-Selling Considerations",
The Eighth Pacific-Asia
Conference on Knowledge Discovery and Data Mining (PAKDD), Sydney, Australia on
May 26-28, 2004
(Acceptance 50/235 = 21.3%)
(pdf)(code)
Raymond Chi-Wing Wong, Ada Wai-Chee Fu and Ke Wang,
"Choosing
Best Items by Considering Relationships Between Items",
5th ACM Postgraduate Research Day, the Chinese University of Hong Kong on 31
Jan, 2004
Raymond Chi-Wing Wong, Ada Wai-Chee Fu and Ke Wang,
"MPIS: Maximal-Profit Item Selection
with Cross-Selling Considerations",
The 2003 IEEE International Conference on Data Mining (ICDM), Melbourne, Florida
on November 19-22, 2003
(Acceptance 58/501 = 11.6%)
(pdf)(code)
Reviewer for conferences: ICDE10, ICDM09, SIGKDD09, SDM09, DASFAA09, the ACM SIGSOFT
Symposium on the Foundations of Software Engineering (FSE) 2009, the first International
Workshop on Mobile Business Collaboration (MBC'09) to be held in conjunction
with DASFAA'09, ICDM08, PinKDD08, KDD08, SIGMOD08, SDM08, PAKDD08, EDBT08, ICDE08,
ICDM07, VLDB07, DBMAN07, PAKDD07, ICDE07, ICDM06, DaWaK06, KDD06, WAIM06,
PAKDD06, EDBT06, the IASTED International Conference on Databases and
Applications (DBA) 2005, VLDB05, ICDM04, ICDE04, ICDM03
Reviewer for journals: Proceedings of the VLDB Endowment (PVLDB) 2008, IEEE Transactions on Knowledge and Data
Engineering (TKDE) 2008, Very Large Data Bases Journal (VLDBJ) 2008, ACM Transactions on Knowledge Discovery from
Data (TKDD) 2008, Journal of Knowledge and Information Systems (KAIS) 2007, Journal of Data Mining and Knowledge Discovery (DAMI) 2007,
IEEE Transactions on Knowledge and Data Engineering (TKDE) 2006, Journal of Data
Mining and Knowledge Discovery (DAMI) 2006, TKDE 2005, International Journal
of Data Warehousing and Mining (IJDWM) 2005, Very Large Data Bases Journal (VLDBJ)
2005, Journal of Parallel and Distributed Computing 2005, Systems & Control
Letters 2003
Referee: ACM Transactions on Database Systems (TODS) 2009, Very Large
Data Bases Journal (VLDBJ) 2009, IEEE Transactions on Knowledge and Data Engineering (TKDE)
2009, ACM Transactions on Knowledge Discovery from Data (TKDD) 2009, Data & Knowledge Engineering (DKE) 2009, Distributed and Parallel Databases (DAPD) 2009,
International Journal of Information Technology & Decision Making 2009, Journal of Computer Science and Technology (JCST) 2009, Technology Transfer Center (HKUST)
2009, IEEE Transactions on Knowledge and Data Engineering (TKDE)
2008, IEEE Transactions on Semiconductor Manufacturing 2008, International
Journal of Services Sciences (IJSSCI) 2008, Journal of Data Mining and Knowledge Discovery (DAMI) 2007, Journal of
Knowledge and Information Systems (KAIS) 2007, Systems & Control
Letters 2003
Program Committee: SDM 2010, DASFAA 2010, ISI10, VLDB 2009, CIKM 2009, SDM 2009, APWeb-WAIM 2009, The 10th
International Conference on Web Information Systems Engineering (WISE 2009),
The 2009 International Conference on Web Information Systems and Mining (WISM),
The 2009 International Workshop on Web Information and Data Management (WIDM) to be
held in conjunction with CIKM 2009, the first International
workshop on Web-based Contents Management Techniques (WCMT'09) to be held in
conjunction with WAIM'09/APWeb'09, the 1st International Workshop on Knowledge
Discovery in Web 2.0 Environments (KDWeb2 2009) in conjunction with NCM 2009
conference, Post-Mining of Association Rules: Techniques for
Effective Knowledge Extraction 2008
Chair: Poster Co-chair of CIKM 2009, Co-chair of the first International Workshop on Privacy-Preserving Data
Analysis (PPDA'09) to be held in conjunction with DASFAA'09,
Activity Chair of IEEE (Hong Kong) Computational Intelligence Chapter
Editorial Review Board: International
Journal of Systems and Service-Oriented Engineering (IJSSOE)
Session Chair: VLDB 2009, CIKM 2009, WIDM 2009
Poster Boaster Session Chair: CIKM 2009
Student Helper at conferences: ICDM06, KDD06, VLDB02
Administrative Work for conferences: ICDM03
Topic: "CIKM 2009 Poster Boaster (IR Track)"
Venue: CIKM'09, Hong Kong
Date: Nov 2-6, 2009
Topic: "Efficient Method for Maximizing Bichromatic Reverse Nearest Neighbor"
Venue: VLDB'09, Lyon, France
Date: Aug 24-28, 2009
Topic: "Efficient Skyline Querying with Variable User Preferences on Nominal
Attributes"
Conference: VLDB'08, Auckland, New Zealand
Date: Aug 24-30, 2008
Topic: "Privacy Preserving Serial Data Publishing By Role Composition"
Conference: VLDB'08, Auckland, New Zealand
Date: Aug 24-30, 2008
Topic: "Minimality Attack in Privacy Preserving Data Publishing"
University: University of Waterloo, Waterloo, Canada
Date: July 18, 2008
Topic: "On Efficient Spatial Matching"
University: Simon Fraser University, Vancouver, Canada
Date: Jun 13, 2008
Topic: "Minimality Attack in Privacy Preserving Data Publishing"
Conference: VLDB07, Vienna, Austria
Date: Sept 23-28, 2007
Topic: "Mining Favorable Facets"
Conference: KDD07, San Jose, California, USA
Date: Aug 12-15, 2007
Topic: "(alpha,k)-Anonymity: An Enhanced k-Anonymity Model for Privacy
Preserving"
Conference: SIGKDD 2006, Philadelphia, USA
Date: Aug 20-23, 2006
Topic: "Privacy-Preserving Frequent Pattern Mining Across Private Databases"
Conference: ICDM05, Houston, Texas
Date: November 27-30, 2005
Topic: "Mining Top-K Itemsets over a Sliding Window Based on Zipfian
Distribution"
Conference: SDM05, Newport Beach, California, USA
Date: April 21-23, 2005
Topic: "ISM: Item Selection for Marketing with Cross-Selling Considerations"
Conference: PAKDD04, Sydney, Australia
Date: May 26-28, 2004
Topic: "Choosing Best Items by Considering Relationships Between Items"
Research Day: 5th ACM Postgraduate Research Day, the Chinese University of
Hong Kong
Date: 31 Jan, 2004
Topic: "MPIS: Maximal-Profit Item Selection with Cross-Selling
Considerations"
Conference: ICDM03, Melbourne, Florida
Date: November 19-22, 2003
I have various research interests. The following is a brief summary of my
research statement, covering some of my research topics together with
their contributions and impact.
1. Knowledge Discovery
Applications of Data Mining
During my master's degree, I focused on studying the "utility" of data mining.
Most traditional data mining techniques find many different "patterns";
association rules and clusters are typical examples.
However, these patterns cannot be used directly for decision making, which is
the main objective of data mining. My research on knowledge discovery
focuses on how to utilize these traditional patterns. More specifically,
association rule mining, which has been studied extensively in the data mining
literature, aims at understanding the relationships among items in basket
analysis. In our work published in ICDM03, we use these traditional patterns
to find a set of items that maximizes the profit of a company in inventory
control and marketing applications. The concept of the "utility" of data mining
gives researchers many opportunities to work on new problems that
utilize traditional data mining patterns.
Data Streams
After I completed my master's degree, I continued working on top-K itemset
mining over data streams and published papers in the Journal of Data Mining and
Knowledge Discovery. In the data mining literature, association rule mining
is very popular because it makes the relationships among items easy to
understand. However, it suffers from a major drawback: the user must set a
"magic" number, a user-defined threshold that determines whether an
association rule is "interesting" or not. The right value depends on
the characteristics of the data. Setting the threshold too high gives no
association rules at all, while setting it too low gives an overwhelming
number of them. Finding a reasonable threshold is troublesome because it may
involve many "trial" steps. Such trial steps are possible on static data but
not over data streams because, in a data stream, each data item can be read
only once and cannot be read again. From a human perspective, the number
of "interesting" association rules should be roughly equal to a certain size K
given by the user. The user can then simply supply a parameter K, and K
"interesting" association rules are returned as desired. My work focuses on
mining top-K association rules (or simply frequent patterns) over data
streams. It differs from all techniques for association rule mining over data
streams that depend on a "magic" number. The use of "top-K" is not
only meaningful to humans but also useful for data streams, where the data
characteristics are unknown.
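The contrast between a support threshold and a top-K parameter can be illustrated with a small sketch. It is only an illustration under simplifying assumptions, not the algorithm from the papers mentioned above: it keeps an exact counter for every itemset in memory, which a real data stream setting would generally not allow. The function names (count_itemsets, frequent_by_threshold, top_k) and the sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(stream, size):
    """Count every itemset of the given size in a single pass over the stream."""
    counts = Counter()
    for transaction in stream:
        # Sort so that the same itemset always maps to the same tuple.
        for itemset in combinations(sorted(set(transaction)), size):
            counts[itemset] += 1
    return counts


def frequent_by_threshold(counts, num_transactions, min_support):
    """Threshold-based selection: keep itemsets with support >= min_support."""
    return {s for s, c in counts.items() if c / num_transactions >= min_support}


def top_k(counts, k):
    """Top-K selection: the k most frequent itemsets, ties broken by itemset order."""
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [itemset for itemset, _ in ranked[:k]]


# Hypothetical sample data, read once through an iterator.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk", "tea"],
]
pair_counts = count_itemsets(iter(transactions), size=2)

too_high = frequent_by_threshold(pair_counts, len(transactions), 0.9)  # no itemsets
too_low = frequent_by_threshold(pair_counts, len(transactions), 0.1)   # every itemset
best_two = top_k(pair_counts, 2)                                       # exactly two
```

With a threshold, the size of the answer depends on the data; with K, the user controls it directly, which is the property the paragraph above relies on.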
2. Database Queries
Skyline Queries
One of my recent research works, published in KDD07, is on skyline queries.
Traditional skyline queries are based on totally-ordered numeric attributes.
However, many existing applications involve categorical attributes
for which the ordering or preference of values differs from user to
user. For example, when we select flights, the airline and the transit
airport are categorical attributes. Since existing
techniques cannot solve this new problem, we propose some methods to handle it.
Since our proposed problem is much more general than the traditional problem, all
variations of the traditional problem, such as finding skylines over data
streams and finding skylines with respect to any subspace, become
potential research problems when categorical attributes are considered.
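The effect of user-specific preferences on a categorical attribute can be shown with a naive sketch. It is not the method proposed in the paper: it compares every pair of points, and it assumes that each user's preference is given as a set of (better, worse) pairs that is already transitively closed. The flight data, attribute names and function names are hypothetical.

```python
def at_least_as_good(a, b, prefers):
    """Categorical comparison: a is as good as b if equal or explicitly preferred."""
    return a == b or (a, b) in prefers


def dominates(p, q, prefers):
    """p dominates q if p is no worse on price and airline and strictly better on one."""
    no_worse = p["price"] <= q["price"] and at_least_as_good(
        p["airline"], q["airline"], prefers
    )
    strictly_better = p["price"] < q["price"] or (p["airline"], q["airline"]) in prefers
    return no_worse and strictly_better


def skyline(flights, prefers):
    """Return the ids of the flights that no other flight dominates."""
    return [
        p["id"]
        for p in flights
        if not any(dominates(q, p, prefers) for q in flights)
    ]


# Hypothetical flights: lower price is better; airline has no built-in order.
flights = [
    {"id": "A", "price": 300, "airline": "X"},
    {"id": "B", "price": 350, "airline": "Y"},
    {"id": "C", "price": 400, "airline": "X"},
    {"id": "D", "price": 320, "airline": "Z"},
]

skyline_user1 = skyline(flights, {("X", "Y")})  # user 1 prefers airline X over Y
skyline_user2 = skyline(flights, {("Y", "X")})  # user 2 prefers airline Y over X
```

Flight B is dominated by A for the first user but belongs to the skyline of the second user, so the same data yields different skylines under different preferences.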
Spatial Matching Queries
In my recent work published in VLDB07, I proposed a spatial matching problem,
which generalizes reverse nearest neighbor, an extensively studied problem in
the spatial database literature. My major contribution in this
paper is to draw researchers' attention to the
maximum serving capacity of each service in customer-service applications,
which is not considered in existing work related to reverse nearest
neighbor. We prove that our spatial matching problem is a generalization of (bichromatic)
reverse nearest neighbor. Thus, all existing work related to reverse nearest
neighbors can be extended to consider the capacities of
services. Some examples are finding reverse nearest neighbors in real time and
finding reverse nearest neighbors of moving objects with the consideration of
the capacities.
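The role of capacities can be sketched as follows. The assignment rule used here, repeatedly taking the globally closest customer-service pair among those still available, is an assumption made for illustration and is not stated in the text above; the quadratic enumeration of all pairs is likewise a naive choice, not the efficient algorithm of the paper. All names and coordinates are hypothetical.

```python
import math


def nearest_service(customer_xy, services):
    """Capacity-unaware baseline: the closest service to one customer."""
    return min(services, key=lambda s: math.dist(customer_xy, services[s]))


def greedy_capacity_matching(customers, services, capacity):
    """Assign customers to services, closest pairs first, respecting capacities.

    customers, services: dict name -> (x, y)
    capacity: dict service name -> maximum number of customers it can serve
    """
    pairs = sorted(
        (math.dist(c_xy, s_xy), c, s)
        for c, c_xy in customers.items()
        for s, s_xy in services.items()
    )
    remaining = dict(capacity)
    assignment = {}
    for _, c, s in pairs:
        # Skip pairs whose customer is already served or whose service is full.
        if c not in assignment and remaining[s] > 0:
            assignment[c] = s
            remaining[s] -= 1
    return assignment


# Hypothetical data: S1 can serve one customer, S2 can serve two.
services = {"S1": (0, 0), "S2": (10, 0)}
capacity = {"S1": 1, "S2": 2}
customers = {"c1": (1, 0), "c2": (2, 0), "c3": (9, 0)}

unconstrained = {c: nearest_service(xy, services) for c, xy in customers.items()}
matched = greedy_capacity_matching(customers, services, capacity)
```

Without capacities, both c1 and c2 choose S1; once S1 can serve only one customer, c2 is pushed to S2, which is the kind of effect that nearest-neighbor reasoning alone does not capture.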
3. Privacy Issues
One of my research topics is privacy. Publishing sensitive data is an
important topic in the privacy literature. The major task of data
publishing is to release data in a way that protects individual privacy. At the
same time, the "utility" of the released data should be kept as high as possible,
and thus the "distortion" or "information loss" of the released data should be
minimized. My recent work published in VLDB07
points out that individual privacy can be breached when the minimality
principle is used in the anonymization. Since all existing works rely on this
principle, all published tables produced by the existing works suffer from
privacy breaches. Thus, their methods should be
redone with the minimality principle taken into account during data
publishing in order to protect individual privacy.
Raymond Wong, Raymond C.-W. Wong,
Raymond C. W. Wong,
Raymond C. Wong, R. C.-W. Wong, R. C. W. Wong, R. C. Wong, Chi-Wing Wong, C.-W. Wong,
Chi Wing Wong, C. W. Wong, Raymond Chi-Wing Wong