Sciweavers

LREC
2008
119views Education» more  LREC 2008»
13 years 7 months ago
Corpus and Voices for Catalan Speech Synthesis
In this paper we describe the design and production of a Catalan database for building synthetic voices. Two speakers have each recorded 10 hours of speech. T...
Antonio Bonafonte, Jordi Adell, Ignasi Esquerra, S...
LREC
2008
Glossa: a Multilingual, Multimodal, Configurable User Interface
We describe a web-based corpus query system, Glossa, which combines the expressiveness of regular query languages with the user-friendliness of a graphical interface. Since corpus...
Lars Nygaard, Joel Priestley, Anders Nøkles...
LREC
2008
Annotating an Arabic Learner Corpus for Error
This paper describes an ongoing project in which we are collecting a learner corpus of Arabic, developing a tagset for error annotation and performing Computer-aided Error Analysi...
Ghazi Abuhakema, Reem Faraj, Anna Feldman, Eileen ...
LREC
2008
Creating a Research Collection of Question Answer Sentence Pairs with Amazon's Mechanical Turk
Each year NIST releases a set of (question, document id, answer) triples for the factoid questions used in the TREC Question Answering track. While this resource is widely used and ...
Michael Kaisser, John Lowe
LREC
2008
An eRulemaking Corpus: Identifying Substantive Issues in Public Comments
We describe the creation of a corpus that supports a real-world hierarchical text categorization task in the domain of electronic rulemaking (eRulemaking). Features of the task an...
Claire Cardie, Cynthia Farina, Matt Rawding, Adil ...
LREC
2008
A Taxonomy of Lexical Metadata Categories
Metadata registries comprising sets of categories to be used in data collections exist in many fields. The purpose of a metadata registry is to facilitate data exchange and intero...
Bodil Nistrup Madsen, Hanne Erdman Thomsen
LREC
2008
Unsupervised and Domain Independent Ontology Learning: Combining Heterogeneous Sources of Evidence
Acquiring knowledge from the Web to build domain ontologies has become a common practice in the Ontological Engineering field. The vast amount of freely available information allo...
David Manzano-Macho, Asunción Gómez-...
LREC
2008
The BNC Parsed with RASP4UIMA
We have integrated the RASP system with the UIMA framework (RASP4UIMA) and used this to parse the XML-encoded version of the British National Corpus (BNC). All original annotation...
Øistein E. Andersen, Julien Nioche, Ted Bri...
LREC
2008
The MoveOn Motorcycle Speech Corpus
A speech and noise corpus dealing with the extreme conditions of the motorcycle environment is developed within the MoveOn project. Speech utterances in British English are record...
Thomas Winkler, Theodoros Kostoulas, Richard Adder...
LREC
2008
Harvesting Multi-Word Expressions from Parallel Corpora
The paper presents a set of approaches to extend the automatically created Slovene wordnet with nominal multiword expressions. In the first approach multiword expressions from Pri...
Špela Vintar, Darja Fišer