This course homepage is accessible from http://www.cs.ust.hk/~dlee/336/

COMP 336 Search Engines for Web and Enterprise Data [3-0-1:3]

 

Fall 2009

Course and Instructor/TA Information

Instructor: Prof. Dik Lun Lee
Email: dlee@cse.ust.hk
Office: 3534 (Lift 25/26)
Office Hours: Mon-Thur 1:30pm-2pm, or by email appointment

Lectures: Mon, Wed 12:00-1:20pm  [University class schedule]
Lecture Room: 3412

TA: Victor Cheung (csvictor@cse.ust.hk)
TA Office: 4212 (Lab Area)
TA Office Hours: 2-3pm Tuesday

Lab 1A: Thu 11:30am-12:20pm, Room 4214

Course Outline [Detailed Course Topics]

  1. Introduction and course overview
  2. Business models
  3. Information retrieval models and Inverted Files
  4. Web-based information retrieval
  5. Pattern matching and extended Boolean model
  6. Retrieval effectiveness, benchmarking
  7. Document preprocessing
  8. Query expansion and relevance feedback
  9. Document clustering
  10. Signature files

Labs and Lab Schedule

Homework, Exams, and Grading Scheme  [University Calendar]

Term Project

Text and Reference Materials

Course Description

Text retrieval models, vector space model, document ranking, performance evaluation; indexing, pattern matching, relevance feedback, clustering; web search engines, authority-based ranking; enterprise data management, content creation, metadata, taxonomy, ontology; semantic web, digital libraries and knowledge management applications.

Course Objective

After completing the course, students will have acquired:

  1. Core techniques for building search engines
  2. Technologies and business models employed in modern web-based search engines
  3. Hands-on experience in building a complete web-based search engine, including the spider, data storage, and search modules
  4. Knowledge of future trends in information retrieval and of its applications in web and enterprise systems and digital libraries

Pre-requisites/Background needed: COMP 151, 152 or 171

Policy on Academic Misconduct

Homework/lab assignments must be done individually. Collaboration between students is strictly forbidden. Any violation will be referred to the Department's Undergraduate/Postgraduate Studies Committee for assessment, and the outcome may include dismissal from the University.

The term project must be done by each group on its own. Sharing code and copying code from previous projects are not allowed.