Tutorial Page for Advanced Data Mining
COMP 4332 / RMBI 4310

Thursday 6:00pm-6:50pm

Rm 2302 (Lift 17/18)

Instructor: Prof. Qiang Yang

TA: Yin Zhu (yinz@cse)

 

NEW [May 28, 2012] Final Score

 

Submitting Project/Assignment reports: send an email to yinz@cse.ust.hk with a subject line in this format:

[4332 Project #/Assignment #] Student Number, e.g. [4332 Project 1] 07539999.

 

 

 

Week of ...

Topics

Slides & Readings

Feb 1

Tutorial Overview and Learning Python

Feb 6

Learning NumPy/SciPy and simple gradient descent
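As a rough illustration of the gradient descent topic listed for this week, here is a minimal NumPy sketch. It is not taken from the tutorial slides: the function name `gradient_descent`, the step size `lr`, the step count `n_steps`, and the quadratic objective are all assumptions chosen for illustration.

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.1, n_steps=100):
    """Plain fixed-step gradient descent (hypothetical helper, not from the slides).

    grad    -- function returning the gradient of the objective at x
    x0      -- starting point
    lr      -- fixed step size (assumed small enough to converge)
    n_steps -- number of update steps
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        # Move against the gradient direction.
        x = x - lr * grad(x)
    return x

# Illustrative objective: f(x) = ||x - target||^2, whose gradient is 2 * (x - target).
target = np.array([3.0, -1.0])
x_min = gradient_descent(lambda x: 2.0 * (x - target), x0=[0.0, 0.0])
print(x_min)
```

On this objective each step shrinks the distance to `target` by a constant factor, so the result ends up very close to `target`. A real problem would usually need a tuned step size and a stopping criterion.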

Feb 13

Introducing Project 1:

     KDDCup 2009

 

Project 1 Deadline:

     15 Mar 2012

Assignment 1 Deadline:

     25 Feb 2012

Feb 20

Classification tools

·         Slide: T4-classify.pptx

·         Code: T4-experiment.py

·         Kernel SVM: libsvm, svmlight

·         Linear SVM/LogReg: liblinear

·         LogReg: BBK

·         Naïve Bayes, Decision Trees: Weka

·         Bagging, Boosting on trees: FEST

·         Other classifiers: KNN, Neural Networks…

·         Caruana et al.: An empirical evaluation of supervised learning in high dimensions, ICML’08. Slide

·         Caruana et al.: An empirical comparison of supervised learning algorithms, ICML’06. Slide

Feb 27

Classification and evaluation

·         Slide: T5-eval.pptx

·         Code: T5-eval.py

·         Compute AUC: perf (C, note), yard (Python), AUCCalculator (Java)
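For readers who want to see what the AUC tools above compute, the following is a minimal pure-Python sketch. It does not use perf, yard, or AUCCalculator. The function name `auc` is hypothetical, and the pairwise formulation (the fraction of positive/negative pairs ranked correctly, with ties counted as half) is one standard way to define AUC, chosen here for clarity rather than speed.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (illustrative helper).

    labels -- iterable of class labels; 1 is treated as positive, anything else as negative
    scores -- iterable of classifier scores, higher meaning "more positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y != 1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))

# Toy data, made up for illustration.
print(auc([1, 0, 1, 0], [0.9, 0.8, 0.35, 0.1]))
```

This runs in time proportional to the number of positive/negative pairs, so it is fine for small checks. For large datasets a sort-based implementation, such as the ones in the listed tools, is the better choice.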

Mar 5

Ensemble

 

Project 1 Deadline:

     18 Mar 2012

Project report on:

     25 Mar 2012

Assignment 2 Deadline:

     19 Mar 2012

·         Slide: Assignment2.pptx

·         Slide: T6-ensemble.pptx

 

·         Caruana et al.: Ensemble Selection from Libraries of Models, ICML’04.

·         D. H. Wolpert: Stacked generalization, 1990.

Mar 12

 

Reading and writing reports

 

·         Slide: T7-report.pptx

Mar 19

 

Summary of Assignment 2

 

Assignment 3:

Deadline: 4 Apr 2012

 

·         Doc: Assignment3.docx

·         Slide: T8-ass2_3.pptx

 

 

Mar 26

Introducing Project 2

 

KDDCup 2012 Task 1

 

·         Slide: T9-proj2-kddcup2012.pptx

·         KDDCup 2012 Website

Apr 2

Spring break

 

Apr 9

 

Playing with collaborative filtering (CF) tools

Discuss Assignment 3

 

Assignment 4

Deadline: 26 Apr 2012

 

·         Slide: T10-CF.pptx

Recommended packages:

·         libFM, its manual and paper

·         SVDFeature, note

·         MyMediaLite, paper

Other packages:

·         Probabilistic Matrix Factorization

·         LensKit, paper

·         GraphLab