: The paper presents YAWN, a system to convert the well-known and widely used Wikipedia collection into an XML corpus with semantically rich, self-explaining tags. We introduce alg...
Ralf Schenkel, Fabian M. Suchanek, Gjergji Kasneci
The Live Memories corpus is an Italian corpus annotated for anaphoric relations. This annotation effort aims to contribute to two significant issues for the CL research: the lack ...
This paper describes SW1, the first version of a semantically annotated snapshot of the English Wikipedia. In recent years Wikipedia has become a valuable resource for both the Na...
Jordi Atserias, Hugo Zaragoza, Massimiliano Ciaram...
WikiWoods is an ongoing initiative to provide rich syntacto-semantic annotations for English Wikipedia. We sketch an automated processing pipeline to extract relevant textual cont...
Dan Flickinger, Stephan Oepen, Gisle Ytrestø...
This article presents a new freely available trilingual corpus (Catalan, Spanish, English) that contains large portions of the Wikipedia and has been automatically enriched with l...
Samuel Reese, Gemma Boleda, Montse Cuadros, Llu&ia...