A Survey on Canonicalization of Open Knowledge Bases

PhD Qualifying Examination

Title: "A Survey on Canonicalization of Open Knowledge Bases"


Miss Xueling LIN


Nowadays, Open Information Extraction (Open IE) approaches, which extract
triples from unstructured text, contribute to the construction of large
Open Knowledge Bases (Open KBs). However, one crucial problem for Open IE
approaches is that the noun phrases and relation phrases in the extracted
triples are not well canonicalized, i.e., there is a large number of
redundant and ambiguous
facts. For example, two triples that state the same fact using different
noun phrases or relation phrases will both be extracted and stored in the
Open KB.

In this survey, we provide a detailed overview of the various approaches 
that are proposed to perform canonicalization over triples in Open KBs. 
These approaches convert the triples into a canonicalized form, where 
entity and relation names are mapped to canonical clusters. We present the 
categories and evolution of these approaches over time and describe
the specific issues they address. In addition, we introduce the commonly 
applied evaluation metrics for assessing the performance of the 
canonicalization over Open KB triples. Finally, we highlight some 
directions for future work.

Date:                   Tuesday, 11 December 2018

Time:                   2:00pm - 4:00pm

Venue:                  Room 5501
                        Lifts 25/26

Committee Members:      Prof. Lei Chen (Supervisor)
                        Prof. Bo Li (Chairperson)
                        Dr. Yangqiu Song
                        Dr. Tao Wang

**** ALL are Welcome ****