Sciweavers

LREC
2010
13 years 5 months ago
Building a Web Corpus of Czech
Large corpora are essential to modern methods of computational linguistics and natural language processing. In this paper, we describe an ongoing project whose aim is to build a l...
Drahomíra "johanka" Spoustová, Miros...
IJCNLP
2004
Springer
13 years 9 months ago
Building a Parallel Bilingual Syntactically Annotated Corpus
This paper describes a process of building a bilingual syntactically annotated corpus, the PCEDT (Prague Czech-English Dependency Treebank). The corpus is being created at Charles...
Jan Cuřín, Martin Čmejrek, Jiří Have...
LREC
2010
13 years 5 months ago
Evaluating Utility of Data Sources in a Large Parallel Czech-English Corpus CzEng 0.9
CzEng 0.9 is the third release of a large parallel corpus of Czech and English. For the current release, CzEng was extended by a significant amount of text from various types of so...
Ondřej Bojar, Adam Liška, Zdeněk Žabokrtský
LREC
2010
13 years 5 months ago
Ways of Evaluation of the Annotators in Building the Prague Czech-English Dependency Treebank
In this paper, we present several ways to measure and evaluate the annotation and annotators, proposed and used during the building of the Czech part of the Prague Czech-English D...
Marie Mikulová, Jan Štěpánek
LREC
2008
13 years 5 months ago
Dialogue, Speech and Images: the Companions Project Data Set
This paper describes part of the corpus collection efforts underway in the EC-funded Companions project. The Companions project is collecting substantial quantities of dialogue a ...
Yorick Wilks, David Benyon, Christopher Brewster, ...