LREC
2010
Balancing SoNaR: IPR versus Processing Issues in a 500-Million-Word Written Dutch Reference Corpus
In the Low Countries, a major reference corpus for written Dutch is currently being built. In this paper, we discuss the interplay between data acquisition and data processing dur...
Martin Reynaert, Nelleke Oostdijk, Orphée D...
LREC
2010
Efficient Minimal Perfect Hash Language Models
The recent availability of large collections of text such as the Google 1T 5-gram corpus (Brants and Franz, 2006) and the Gigaword corpus of newswire (Graff, 2003) has made it po...
David Guthrie, Mark Hepple, Wei Liu
LREC
2010
Building a Generative Lexicon for Romanian
In this paper we present ongoing research: the construction and annotation of a Romanian Generative Lexicon (RoGL). Our system follows the specifications of the CLIPS project for ...
Anca Dinu
LREC
2010
The Indiana "Cooperative Remote Search Task" (CReST) Corpus
This paper introduces a novel corpus of natural language dialogues obtained from humans performing a cooperative, remote, search task (CReST) as it occurs naturally in a variety o...
Kathleen M. Eberhard, Hannele Nicholson, Sandra K...
LREC
2010
Corpus Aligner (CorAl) Evaluation on English-Croatian Parallel Corpora
An increasing demand for new language resources of recent EU members and acceding countries has in turn initiated the development of different language tools and resources, such ...
Sanja Seljan, Marko Tadic, Zeljko Agic, Jan Snajde...
LREC
2010
The Kachna L1/L2 Picture Replication Corpus
This paper presents the Kachna Corpus of Spontaneous Speech, in which ten Czech and ten Norwegian speakers were recorded both in their native language and in English. The dialogue...
Helena Spilková, Daniel Brenner, Anton Ö...
LREC
2010
Medefaidrin: Resources Documenting the Birth and Death Language Life-cycle
Language resources are typically defined and created for application in speech technology contexts, but the documentation of languages which are unlikely ever to be provided with ...
Dafydd Gibbon, Moses Ekpenyong, Eno-Abasi Urua
LREC
2010
ELAN as Flexible Annotation Framework for Sound and Image Processing Detectors
Annotation of digital recordings in humanities research is still, to a large extent, a process that is performed manually. This paper describes the first pattern recognition based...
Eric Auer, Albert Russel, Han Sloetjes, Peter Witt...
LREC
2010
A Tool for Linking Stems and Conceptual Fragments to Enhance Word Access
Electronic dictionaries offer many possibilities unavailable in paper dictionaries to view, display or access information. However, even these resources fall short when it comes t...
Nuria Gala, Véronique Rey, Michael Zock
LREC
2010
DutchParl. The Parliamentary Documents in Dutch
A corpus called DutchParl is created which aims to contain all digitally available parliamentary documents written in the Dutch language. The first version of DutchParl contains d...
Maarten Marx, Anne Schuth