Sciweavers

5 search results - page 1 / 1
CLIN
2001
13 years 6 months ago
Tagging the Dutch PAROLE Corpus
We discuss the annotation with part of speech and lemma of the Dutch PAROLE Internet Corpus. The PAROLE PoS tagger is a combination of statistical taggers. It includes the Markov ...
Jesse de Does, John van der Voort van der Kleij
CLIN
2000
13 years 6 months ago
Syntactic Annotation for the Spoken Dutch Corpus Project (CGN)
Of the ten million words of contemporary standard Dutch in the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN), a selection of one million words of natural spoken language ...
Heleen Hoekstra, Michael Moortgat, Ineke Schuurman...
LREC
2010
13 years 6 months ago
Interacting Semantic Layers of Annotation in SoNaR, a Reference Corpus of Contemporary Written Dutch
This paper reports on the annotation of a corpus of 1 million words with four semantic annotation layers, including named entities, coreference relations, semantic roles and spati...
Ineke Schuurman, Véronique Hoste, Paola Mon...
LREC
2008
13 years 6 months ago
From D-Coi to SoNaR: a reference corpus for Dutch
The computational linguistics community in The Netherlands and Belgium has long recognized the dire need for a major reference corpus of written Dutch. In part to answer this need...
Nelleke Oostdijk, Martin Reynaert, Paola Monachesi...
CLIN
2003
13 years 6 months ago
Methods for the Extraction of Hungarian Multi-Word Lexemes
This paper describes an experiment on extracting Hungarian multi-word lexemes from a corpus, using statistical methods. Corpus preparation—the addition of POS tags and stems—w...
Balázs Kis, Begoña Villada, Gosse Bo...