Bayesian Co-Training: Concepts and Extensions

Speaker:	Dr. Shipeng YU
		Siemens Medical Solutions USA, Inc.

Title: 		"Bayesian Co-Training: Concepts and Extensions"

Date:		Monday, 31 March 2008

Time:		11:00 am - 12 noon

Venue:		Room 3315 (via lifts 17/18)
		HKUST

Abstract:

Co-training is a popular algorithm for semi-supervised classification and
has been applied to many real-world problems. When the input data have
multiple representations or views (e.g. each web page has the text
representation as one view and the hyperlinks from other pages as another
view), co-training works by iteratively labeling some unlabeled data using
a classifier trained on each view, and enlarging the training set. In this
talk we present our recent work on Bayesian co-training, which is an
undirected graphical model for co-training. The model clarifies some
previously unclear assumptions about co-training, and includes standard
co-training and many of its extensions (e.g. co-regularization) as special
cases. A co-training kernel will also be introduced in a Gaussian process
(GP) framework, which allows efficient learning with a one-step, globally
optimal solution. Two extensions of Bayesian co-training will also be
discussed: 1) the Bayesian co-training framework with missing view
information; and 2) active view acquisition, where we are allowed to
select a previously unobserved (data, view) pair to acquire so that
overall performance is optimized. Experiments on web page
classification and some medical applications will be presented at the end
of the talk.
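
For readers unfamiliar with the standard (non-Bayesian) co-training loop
described at the start of the abstract, the following is a minimal Python
sketch. It is not the Bayesian model presented in the talk. The
nearest-centroid classifier, the margin-based confidence score, the toy
data, and all function and parameter names (fit_centroids, predict,
co_train, rounds, per_round) are assumptions made for illustration only.

```python
import math

def fit_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for k, v in enumerate(x):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(model, x):
    """Return (label, margin); margin is the gap between the two nearest centroids."""
    dists = sorted((math.dist(x, mu), c) for c, mu in model.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin

def co_train(view1, view2, labeled, rounds=5, per_round=1):
    """Iteratively enlarge the labeled set using one classifier per view.

    view1, view2: lists of feature vectors, aligned by example index.
    labeled: dict mapping example index -> known label.
    """
    labels = dict(labeled)
    for _ in range(rounds):
        for view in (view1, view2):
            unlabeled = [i for i in range(len(view)) if i not in labels]
            if not unlabeled:
                return labels
            # Train on this view only, using every label gathered so far.
            idx = sorted(labels)
            model = fit_centroids([view[i] for i in idx], [labels[i] for i in idx])
            # Score the unlabeled examples and keep the most confident ones.
            candidates = []
            for i in unlabeled:
                label, margin = predict(model, view[i])
                candidates.append((margin, i, label))
            candidates.sort(key=lambda t: t[0], reverse=True)
            for margin, i, label in candidates[:per_round]:
                labels[i] = label  # the other view trains on this label next
    return labels

# Toy data (hypothetical): examples 2 and 3 are clear in view 1,
# examples 4 and 5 are clear in view 2. Only examples 0 and 1 start labeled.
view1 = [[0.0], [10.0], [1.0], [8.0], [4.0], [6.0]]
view2 = [[0.0], [10.0], [4.0], [6.0], [1.0], [8.0]]
result = co_train(view1, view2, {0: 0, 1: 1})
print(result)
```

In this sketch each view labels only the example it is most confident
about, so an example that is ambiguous in one view can still be labeled
by the other view and then reused as training data by both.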

**********************
Biography:

Shipeng YU is currently a staff scientist at Siemens Medical Solutions
USA, Inc. He received his B.Sc. and M.Sc. degrees in mathematics from
Peking University in 2000 and 2003, respectively, and finished his Ph.D.
in computer science at the University of Munich in Germany in 2006. He has
worked in many areas of statistical machine learning, such as
Gaussian processes, Dirichlet processes, probabilistic dimensionality
reduction, ordinal regression and semi-supervised learning. He is also
interested in machine learning applications in data mining, information
and image retrieval, and user modeling.