Nonlinear Embedding as a Tool for Big Data Algorithm Design

                        Joint Seminar
Big Data Institute and Department of Computer Science & Engineering

Speaker:        Dr. Le Song
                Associate Professor
                Department of Computational Science and Engineering
                College of Computing
                Associate Director of the Center for Machine Learning
                Georgia Institute of Technology

Title:          "Nonlinear Embedding as a Tool for Big Data Algorithm

Date:           Tuesday, 21 March 2017

Time:           11:15am to 12:15pm

Venue:          LTG (near lifts 25/26), HKUST


Many large scale data analytics problems are intrinsically hard and
complex, making the design of effective and scalable algorithms very
challenging. Domain experts have to perform extensive research, and
experiment with many trial-and-errors, in order to craft approximation or
heuristic schemes that meet the dual goals of effectiveness and
scalability. Very often, restricted assumptions about the data, which are
likely to be violated by real world scenarios, need to be made in order
for the designed algorithms to work and have performance guarantees.
Furthermore, previous algorithm design paradigms seldom systematically
exploit a common trait of real-world algorithms: instances of the same
type of problem are solved repeatedly on a regular basis using the same
algorithm, maintaining the same problem structure, but differing mainly in
their data.

In this talk, he will present a framework for scalable algorithm design
based on the idea of embedding algorithm steps into nonlinear spaces, and
learn these embedded algorithms from data via direct supervision or
reinforcement learning. In contrast to traditional algorithm design where
every steps in an algorithm is prescribed by experts, the embedding design
will delegate some difficult algorithm choices to nonlinear learning
models so as to avoid large memory requirement, restricted assumption on
the data, or limited design space exploration. I will illustrate the
benefit of this embedding framework using three examples: a materials
discovery problem, a recommendation system problem over dynamic social
networks, and a problem of learning combinatorial algorithms over graphs.
In some cases, the learned algorithms require thousands of times less
memory to run, finish hundreds of times faster, and obtain drastic
improvement in predictive performance.


Le Song is an Associate Professor in the Department of Computational
Science and Engineering, College of Computing, and an Associate Director
of the Center for Machine Learning, Georgia Institute of Technology. He
received his Ph.D. in Machine Learning from University of Sydney and NICTA
in 2008, and then conducted his post-doctoral research in the Department
of Machine Learning, Carnegie Mellon University, between 2008 and 2011.
Before he joined Georgia Institute of Technology in 2011, he was a
research scientist at Google briefly. His principal research direction is
machine learning, especially nonlinear models, such as kernel methods and
deep learning, and probabilistic graphical models for large scale and
complex problems, arising from artificial intelligence, network analysis,
computational biology and other interdisciplinary domains. He is the
recipient of the Recsys'16 Deep Learning Workshop Best Paper Award,
AISTATS'16 Best Student Paper Award, IPDPS'15 Best Paper Award, NSF CAREER
Award'14, NIPS'13 Outstanding Paper Award, and ICML'10 Best Paper Award.
He has also served as the area chair or senior program committee for many
leading machine learning and AI conferences such as ICML, NIPS, AISTATS
and AAAI, and the action editor for JMLR.