The paper describes the project held within Russian National Corpus (http://www.ruscorpora.ru). Beside such obligatory constituents of a linguistic corpus as POS (parts of speech)...
The InFile project (INformation, FILtering, Evaluation) is a cross-language adaptive filtering evaluation campaign, sponsored by the French National Research Agency. The campaign ...
This paper describes the use of the CasSys platform in order to achieve the chunking of conversational speech transcripts by means of cascades of Unitex transducers. Our system is...
The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a communitywide annotation effort of a subset of the American National Corp...
Nancy Ide, Collin F. Baker, Christiane Fellbaum, R...
Large corpora are essential to modern methods of computational linguistics and natural language processing. In this paper, we describe an ongoing project whose aim is to build a l...