Corroborate and learn facts from the web

11 years 6 months ago
Corroborate and learn facts from the web
The web contains lots of interesting factual information about entities, such as celebrities, movies or products. This paper describes a robust bootstrapping approach to corroborate facts and learn more facts simultaneously. This approach starts with retrieving relevant pages from a crawl repository for each entity in the seed set. In each learning cycle, known facts of an entity are corroborated first in a relevant page to find fact mentions. When fact mentions are found, they are taken as examples for learning new facts from the page via HTML pattern discovery. Extracted new facts are added to the known fact set for the next learning cycle. The bootstrapping process continues until no new facts can be learned. This approach is language-independent. It demonstrated good performance in experiment on country facts. Results of a large scale experiment will also be shown with initial facts imported from wikipedia. Categories and Subject Descriptors I.5.1 [Pattern Recognition]: models--st...
Shubin Zhao, Jonathan Betz
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2007
Where KDD
Authors Shubin Zhao, Jonathan Betz
Comments (0)