The Berkeley FrameNet Project (BFN) is making an English lexical database called FrameNet, which describes syntactic and semantic properties of an English lexicon extracted from l...
When implementing computational lexicons, it is important to keep in mind the texts that an NLP system must deal with. Words relate to each other in many different, often queer, way...
Roberto Basili, Maria Teresa Pazienza, Paola Velar...
Two of the main corpora available for training discourse relation classifiers are the RST Discourse Treebank (RST-DT) and the Penn Discourse Treebank (PDTB), which are both based ...
Hugo Hernault, Danushka Bollegala, Mitsuru Ishizuk...
It has frequently been observed that most of the world’s data lies outside database systems. The reason is that database systems focus on structured data, leaving the unstructur...
Alon Y. Halevy, Oren Etzioni, AnHai Doan, Zachary ...
This article describes the preparation, recording and orthographic transcription of a new speech corpus, the Nijmegen Corpus of Casual French (NCCFr). The corpus contains a total ...
Francisco Torreira, Martine Adda-Decker, Mirjam Er...