PhD Qualifying Examination

"Word sense disambiguation vs. statistical machine translation"

Miss Marine Carpuat

Abstract:

In this survey, we review the word sense disambiguation (WSD) and statistical machine translation (SMT) literature in light of the recent WSD vs. SMT debate.

WSD, the task of resolving sense ambiguity to identify the right translation of a word, is one of the major challenges faced by language translation systems. If the English word "drug" translates into French as either "drogue" (a narcotic) or "medicament" (a medicine), then an English-French MT system needs to disambiguate every use of "drug" in order to produce the correct translation. Considerable effort has been put into designing and evaluating dedicated WSD models, in particular through the Senseval series of workshops. This is partly motivated by the often unstated assumption that any full translation system, to achieve full performance, will sooner or later have to incorporate individual WSD components.

However, in most machine translation architectures, and in SMT in particular, the WSD problem is typically not addressed explicitly; instead, the translation engine already implicitly factors many contextual features into lexical choice. In this context, a question energetically debated at conferences over the past year is whether even the new state-of-the-art WSD models actually have anything to offer to full-scale SMT systems.

We will show that dedicated WSD has led to several useful insights for SMT, and present how typical SMT models perform WSD. Finally, we will discuss the main challenges for the integration of state-of-the-art dedicated WSD models in current SMT architectures.

Date: Wednesday, 21 September 2005

Time: 12:00 noon - 2:00 p.m.

Venue: Room 4480, lifts 25-26

Committee Members:
Dr. Dekai Wu (Supervisor)
Dr. Brian Mak (Chairperson)
Dr. Dit-Yan Yeung
Dr. Pascale Fung (ELEC)

**** ALL are Welcome ****