BTW
2007
Springer

YAWN: A Semantically Annotated Wikipedia XML Corpus

The paper presents YAWN, a system to convert the well-known and widely used Wikipedia collection into an XML corpus with semantically rich, self-explaining tags. We introduce algorithms to annotate pages and links with concepts from the WordNet thesaurus. This annotation process exploits categorical information in Wikipedia, which is a high-quality, manually assigned source of information, extracts additional information from lists, and utilizes the invocations of templates with named parameters. We give examples of how such annotations can be exploited for high-precision queries.
Ralf Schenkel, Fabian M. Suchanek, Gjergji Kasneci
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where BTW