Sciweavers

735 search results - page 124 / 147
» Corpora and data preparation
LREC
2010
Djangology: A Light-weight Web-based Tool for Distributed Collaborative Text Annotation
Manual text annotation is a resource-consuming endeavor necessary for NLP systems when they target new tasks or domains for which there are no existing annotated corpora. Distribu...
Emilia Apostolova, Sean Neilan, Gary An, Noriko To...
LREC
2010
Efficient Minimal Perfect Hash Language Models
The recent availability of large collections of text such as the Google 1T 5-gram corpus (Brants and Franz, 2006) and the Gigaword corpus of newswire (Graff, 2003) has made it po...
David Guthrie, Mark Hepple, Wei Liu
LREC
2010
mwetoolkit: a Framework for Multiword Expression Identification
This paper presents the Multiword Expression Toolkit (mwetoolkit), an environment for type and language-independent MWE identification from corpora. The mwetoolkit provides a targ...
Carlos Ramisch, Aline Villavicencio, Christian Boi...
LREC
2010
Annotation Tool for Extended Textual Coreference and Bridging Anaphora
We present an annotation tool for the extended textual coreference and the bridging anaphora in the Prague Dependency Treebank 2.0 (PDT 2.0). After we very briefly describe the an...
Jiří Mírovský, Petr Pajas, An...
LREC
2008
Portuguese-English Word Alignment: some Experiments
In this paper we describe some studies of Portuguese-English word alignment, focusing on (i) measuring the importance of the coupling between dictionaries and corpus; (ii) assessi...
Diana Santos, Alberto Simões