Agenda in Review

The Big Data and AI Day is inaugurated by the HKUST School of Engineering and Big Data Institute (BDI), to provide a platform to showcase BDI’s research and educational efforts, and to facilitate interaction and strengthen exchange among industries and researchers.

Friday, 26 May 2017, 9:00am – 5:30pm

Venue: Lecture Theatre J, HKUST

Morning session

09:00 - 09:10am      Opening remarks
09:15 - 09:45am      Big Visual Data Analysis
Dr. Tieniu Tan, Vice Minister of Liaison Office of the Central People’s Government in the Hong Kong S.A.R.

     Prof Tieniu Tan is an expert on image processing, computer vision and pattern recognition.
     He is currently Vice Minister at the Liaison Office of the Central People’s Government in the HK S.A.R. and Director of the Centre for Research on Intelligent Perception and Computing, CAS Institute of Automation (CASIA).
     He is a member of the Chinese Academy of Sciences, International Fellow of the Royal Academy of Engineering, Fellow of The World Academy of Sciences for the advancement of sciences in developing countries (TWAS) and Corresponding Member of the Brazilian Academy of Sciences. He is also a Fellow of the IEEE and IAPR.
     He received his BSc degree in electronic engineering from Xi'an Jiaotong University in 1984, and later gained MSc and PhD degrees in electronic engineering from Imperial College London.
     In 1989, he joined the Computer Science Department at the University of Reading, before returning to China in 1998 to join the National Laboratory of Pattern Recognition (NLPR), CASIA. He was the Director General of CASIA from 2000-2007, the Director of the NLPR from 1998-2013, Deputy Secretary-General of CAS from 2007-2015, and Vice President of CAS from 2015-2016. He has been a professor at CASIA since 1998.
     He has published more than 500 research papers in refereed international journals and conferences, and has authored or edited 11 books.

09:45 - 10:15am      Big Data at DiDi Chuxing

     Didi Chuxing is the world’s leading mobile transportation platform that offers a full range of mobile tech-based mobility options for nearly 400 million users across more than 400 Chinese cities. Every day, Didi's platform generates over 70TB worth of data, processes more than 9 billion routing requests, and produces over 13 billion location points. This talk is about how AI technologies have been applied to analyze such big transportation data to improve the travel experience for millions of people in China.

Dr. Jieping Ye, Vice President of Didi Research, Didi Chuxing

     Dr. Jieping Ye is the Vice President of DiDi Research, and is also an associate professor at the University of Michigan. His research interests include big data, machine learning, and data mining with applications in transportation and biomedicine. He has served as a Senior Program Committee/Area Chair/Program Committee Vice Chair of many conferences including NIPS, ICML, KDD, IJCAI, ICDM, SDM, ACML, and PAKDD. He serves as an Associate Editor of Data Mining and Knowledge Discovery, IEEE Transactions on Knowledge and Data Engineering, and IEEE Transactions on Pattern Analysis and Machine Intelligence. He won the NSF CAREER Award in 2010. His papers have been selected for the outstanding student paper at ICML in 2004, the KDD best research paper honorable mention in 2010, the KDD best research paper nomination in 2011 and 2012, the SDM best research paper runner up in 2013, the KDD best research paper runner up in 2013, and the KDD best student paper award in 2014.

10:45 - 11:15am      Big Data Software: What’s Next?

     The Big Data revolution has been enabled in part by a wealth of innovation in software platforms for data storage, analytics, and machine learning. The first wave of Big Data platforms such as Hadoop and Spark focused on scalability, fault-tolerance and performance. As these and other systems increasingly become part of the mainstream, the next set of challenges are becoming clearer. Requirements for performance are changing as workloads evolve to include techniques such as hardware-accelerated deep learning. But more fundamentally, other issues are moving to the forefront. These include ease of use for a wide range of users, security, concerns about privacy and potential bias in results, and the perennial problem of data integration from heterogeneous sources.
     In this talk, I will give a quick overview of how we got here, with an emphasis on the development of the Apache Spark system. I will then focus on these emerging issues and approaches towards tackling them.

Prof Michael Franklin, Liew Family Chair of Computer Science, University of Chicago

     MICHAEL J. FRANKLIN is the Liew Family Chair of Computer Science and Sr. Advisor to the Provost for Computation and Data at the University of Chicago, where his research focuses on database systems, data analytics, data management and distributed computing systems. Franklin previously was the Thomas M. Siebel Professor and Chair of the Computer Science Division of the EECS Department at the University of California, Berkeley. He co-founded and directed Berkeley’s Algorithms, Machines and People Laboratory (AMPLab), which created industry-changing open source Big Data software such as Apache Spark and BDAS, the Berkeley Data Analytics Stack. At Berkeley he also served as an executive committee member for the Berkeley Institute for Data Science. He currently serves as a Board Member of the Computing Research Association and on the NSF CISE Advisory Committee. Franklin is an ACM Fellow, a two-time recipient of the ACM SIGMOD “Test of Time” award, and a recipient of the Outstanding Advisor award from Berkeley’s Computer Science Graduate Student Association.

11:15 - 11:40am      Overview and achievements of Big Data Institute
11:40am - 12:00nn Speech and Language for AI and Data Analytics
Prof Pascale Fung, Professor of Department of Electronic and Computer Engineering, HKUST

     Pascale Fung is a professor in the Department of Electronic and Computer Engineering at HKUST and the founding director of InterACT@HKUST, a joint research and education center with Carnegie Mellon University. Pascale is a leading researcher in the fields of statistical speech, language, and music processing. She cofounded a company that launched the first Chinese natural language search engine in 2000, and her entrepreneurial story was featured in the Wall Street Journal, CNBC, and other magazines. Her second company launched the first Chinese virtual personal assistant on a smartphone in 2009 and the first Chinese language automobile infotainment system with a 3G connection in 2010. Her company also invented a world-leading intelligent music technology engine that powers digital music services for more than fifty million Chinese users. She has published more than 130 papers and book chapters and holds fifteen world-wide and Chinese patents and software copyrights. In addition to core engineering subjects, Pascale also teaches technology entrepreneurship to engineering students at HKUST. In 2011 she cofounded the Women Faculty Association at HKUST to push for diversity in academia. Pascale received her PhD in computer science from Columbia University in 1997.

12:00nn - 12:40pm Student Presentations
WeChat Lab

  • Yuxiang WU, CSE (Machine Reading System)
  • Kaixiang MO and Mingfei SUN, CSE (Dialog System)
  • Quan LI, CSE (Moments Articles Real Time Propagation and Visualization System)
  • Kejie QIU, ECE (Model-based Global Localization for Aerial Robots using Edge Alignments)
  • Li CHEN, CSE (Enabling Datacenter-Scale Deep Learning)

Smart City

  • Jiahang CHEN (Hong Kong Housing Data Portal and Analysis)
  • Ka Wai TONG, Dennis (Hong Kong Traffic Analysis and Prediction)
  • Zechun WU (Hong Kong Listed Company Analysis and Ranking)
  • Xu GENG (Ridesourcing Car Detection by Transfer Learning)


  • Xiong Feng (Video crowd counting via Convolutional LSTM)
  • Zhu Chengkai (Transfer Visual Sentiment via Emotion Distribution Recognition from Facial Expression)
  • ZHANG Chong (Learning Multisensory Cue Integration on Mobile Robots)
  • XI Wei (Sensorless Sensing for Crowd Counting and Flows Prediction using WiFi)

MSc BDT Students

  • GOU Renjie (Car Sales Prediction)
  • Chung, Kin Fung (SAIC Car Type Prediction)
  • GAO Cong (House Price Prediction)
  • XU Yujie (US Presidential Prediction)
  • LIU Wei (Blockchain)

Civil Group

  • Dynamic IoT, Big Data and AI: Data-driven Predictive Maintenance for Smart Resilient Cities (OOI Ghee Leng, TAN Pin Siang, LAU Yun Man Oscar, SU Zhaoyu Tony, Jimmy WU, LEUNG Mei Ling, LUI Hoi Lun Henry)

Afternoon session

2:00 – 2:20pm        Human-Powered Machine Learning

     Recently, machine learning has become quite popular and attractive, not only to academia but also to industry. The success stories of machine learning in AlphaGo and Texas hold 'em games have raised significant interest in machine learning. The question is whether machine learning can do everything perfectly. In this talk, I will first give several examples in which current machine learning techniques have difficulty performing well. Then, I will show that by putting humans in the machine-learning loop, the results can be significantly improved. After that, I will discuss the challenges and opportunities for this human-powered machine learning paradigm.

Prof Lei Chen, Professor of Department of Computer Science & Engineering, Associate Director of Big Data Institute, HKUST

     Lei Chen received the BS degree in computer science and engineering from Tianjin University, Tianjin, China, in 1994, the MA degree from Asian Institute of Technology, Bangkok, Thailand, in 1997, and the PhD degree in computer science from the University of Waterloo, Canada, in 2005. He is currently a full professor in the Department of Computer Science and Engineering, Hong Kong University of Science and Technology. His research interests include human-powered machine learning, crowdsourcing, social media analysis, probabilistic and uncertain databases, and privacy-preserved data publishing. The system developed by his team won the excellent demonstration award at VLDB 2014. He received the SIGMOD Test-of-Time Award in 2015. He has served as PC track chair for SIGMOD 2014, VLDB 2014, ICDE 2012, CIKM 2012, and SIGMM 2011, and as a PC member for SIGMOD, VLDB, ICDE, SIGMM, and WWW. Currently, he serves as Editor-in-Chief of VLDB Journal and an associate editor-in-chief of IEEE Transactions on Knowledge and Data Engineering. He is a member of the VLDB Endowment.

2:20 – 2:50pm        Anomaly detection in large graphs

     Given a large graph, like who-calls-whom, or who-likes-whom, what behavior is normal and what should be surprising, possibly due to fraudulent activity? How do graphs evolve over time?
     We focus on these topics:
          (a) anomaly detection in large static graphs; and
          (b) patterns and anomalies in large time-evolving graphs.
     For the first, we present a list of static and temporal laws, including advanced patterns like 'eigenspokes'; we show how to use them to spot suspicious activities in on-line buyer-and-seller settings, in Facebook, and in Twitter-like networks.
     For the second, we show how to handle time-evolving graphs as tensors, as well as some surprising discoveries in such settings.

Prof Christos Faloutsos, Professor in Department of Computer Science, Carnegie Mellon University

     Christos Faloutsos is a Professor at Carnegie Mellon University. He has received the Presidential Young Investigator Award from the National Science Foundation (1989), the Research Contributions Award in ICDM 2006, the SIGKDD Innovations Award (2010), 24 "best paper" awards (including 5 "test of time" awards), and four teaching awards.
     Six of his advisees have attracted KDD or SCS dissertation awards. He is an ACM Fellow and has served as a member of the executive committee of SIGKDD; he has published over 350 refereed articles, 17 book chapters and two monographs. He holds seven patents (and 2 pending), and he has given over 40 tutorials and over 20 invited distinguished lectures.
     His research interests include large-scale data mining with emphasis on graphs and time sequences; anomaly detection, tensors, and fractals.

2:50 – 3:20pm        AI-Powered Information Creation, Distribution and Interaction

     In the mobile era, we are being presented with an exciting opportunity to shape the way people acquire and consume information. We believe that AI will fundamentally change the way people connect with information, and we can use AI to improve the effectiveness and efficiency of the entire process of content creation, moderation, dissemination, consumption, and interaction. By closing the human feedback loop in this entire process, we can also enable humans and AI algorithms to evolve and improve collaboratively. Based on this vision, Toutiao was started 5 years ago, and it recommends information tailored to users’ likes and interests. To date, it serves 100M daily active users and their average use time is over 76 minutes per day. In this talk, we will introduce the roles of AI technologies in information consumption platforms. We will share several recent research results at Toutiao AI Lab towards more efficient information creation and interaction. We will introduce a robot writer, Xiaomingbot, which has produced 5000 articles since August 2016. We will present a deep-learning based system that answers factoid questions with state-of-the-art accuracy.

Dr. Hongjiang Zhang (on behalf of Dr. Lei Li at Toutiao AI Lab)

     Dr. HongJiang ZHANG, who retired on Dec 1, 2016, was an Executive Director and the chief executive officer (“CEO”) of Kingsoft (a Hong Kong listed public company) from November 2011 to November 2016. He was also a director and the CEO of Kingsoft Cloud, and a director of Cheetah Mobile Inc. (NYSE: CMCM), a subsidiary of Kingsoft. Dr. ZHANG was also a director of Xunlei Limited (NASDAQ: XNET) and a director of 21Vianet Group, Inc. (NASDAQ: VNET).
     Before Kingsoft, he was the chief technology officer for Microsoft Asia-Pacific Research and Development Group (ARD) and the managing director of the Microsoft Advanced Technology Center (ATC). In his dual role, Dr. ZHANG led Microsoft’s research and development agenda in China, including strategy, planning, R&D and incubation for products, services and solutions. Dr. ZHANG was also a member of Executive Management Committee of Microsoft (China) Limited, a committee that defines and leads Microsoft’s strategy and business development in the Greater China region.
     Dr. ZHANG was an Assistant Managing Director and a founding member of Microsoft Research Asia. His outstanding leadership and achievement, illustrated by the high impact he made in academia and on Microsoft’s products, were critical in establishing Microsoft Research Asia as a world class basic research center in computer science and a technology powerhouse in Microsoft, and made him one of the 10 Microsoft Distinguished Scientists.
     As a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and the Association for Computing Machinery (ACM), Dr. ZHANG is well known in the research community for his leadership in media computing and his pioneering work in video and image content analysis and search. He was the recipient of the 2010 IEEE Computer Society Technical Achievement Award and the 2012 ACM SIGMM Outstanding Technical Achievement Award, and the winner of the 2008 "Asian-American Engineer of the Year" award. He holds close to 200 US and international patents, and has authored four books and over 400 scientific papers, many of which have become classic references in their respective research areas. He is one of the most cited researchers in computer science (H-Index=112).
     Dr. ZHANG received a Ph.D. in Electrical Engineering from the Technical University of Denmark, and a Bachelor of Science degree from Zhengzhou University, China. Prior to joining Microsoft, Dr. ZHANG was a research manager at Hewlett-Packard Labs at Palo Alto, CA. He also worked at the Institute of Systems Science, National University of Singapore.

3:40 – 4:00pm        Impact of Social Media and AI on the Financial Market
Prof Michael Zhang, Associate Professor, Department of Information Systems, Business Statistics and Operations Management, HKUST

     Professor Michael Zhang is an Associate Professor of Information Systems, Business Statistics and Operations Management at the Hong Kong University of Science and Technology, and an affiliated faculty member at the MIT Center for Digital Business. He holds a PhD in Management from MIT Sloan School of Management, an MSc in Management, a BE in Computer Science and a BA in English from Tsinghua University. Before joining academia, he worked as an analyst for an investment bank, and as an international marketing manager for a high-tech company. He holds a US patent, and cofounded a social-network company.
     Professor Zhang’s research interests are on issues related to creation, dissemination and processing of information in business and management contexts. His works study pricing of information goods, online word-of-mouth, online advertising, incentives of creation in open source and open content projects, and use of information in financial markets. His research has appeared in American Economic Review, Management Science, Journal of Marketing, MIS Quarterly, Information Systems Research, Journal of MIS, Decision Support Systems, and Journal of Interactive Marketing.
     He has been actively involved in professional services, including serving as a Senior Editor for Information Systems Research, an Associate Editor for Management Science, and a Guest Associate Editor for MIS Quarterly. He was on the editorial boards of Production and Operations Management and Electronic Commerce Research and Applications. Professor Zhang also actively contributes to society by assuming positions such as advisor for Hong Kong Cyberport Entrepreneurship Center, Alibaba Group’s Lakeside (Hupan) University, Huawei, China Mobile, China Merchants Securities, and Radica Systems.

4:00 – 4:20pm        Looking for something that leads exponential growth with big data and AI

     While Moore's Law is coming to an end, can big data and AI support sustained economic development next? To do that, you need to create something that grows exponentially with big data and AI.

Dr. Masayuki Mizuno, Deputy General Manager, Data Science Research Laboratories, NEC Corp.

     Dr. Mizuno has had a 20-year R&D career in the semiconductor industry at NEC and Renesas Electronics. He has created an array of products based on R&D, e.g. high-speed, low-power, highly reliable, and safety IPs for vector supercomputers, the world's first MPEG-2 encoder chips, ADAS image recognition chips and HEV/EV isolation driver chips for automotive use.
     He came in 2nd place among industry researchers for the total number of technical papers at the 60th anniversary of ISSCC (ISSCC2013). He was the program chair of the 2009 Symposium on VLSI Circuits and the symposium chair in 2011. He has 3 years of business incubation and corporate technology strategic planning and management experience, reporting directly to the CEO and CTO at Renesas Electronics.
     He returned to NEC in November 2016 and now serves as Deputy GM of Data Science Research Laboratories and as a visiting professor at the University of Tokyo.

5:20 - 5:30pm        Closing Remarks by Prof Yang Wang