Sciweavers

Corpora and data preparation
LREC
2008
Spock - a Spoken Corpus Client
Spock is an open source tool for the easy deployment of time-aligned corpora. It is fully web-based, and has very limited server-side requirements. It allows the end-user to searc...
Maarten Janssen, Tiago Freitas
NAACL
1994
Multilingual Speech Databases at LDC
As multilingual products and technology grow in importance, the Linguistic Data Consortium (LDC) intends to provide the resources needed for research and development activities, e...
John J. Godfrey
ECIR
2010
Springer
Extracting Multilingual Topics from Unaligned Comparable Corpora
Topic models have been studied extensively in the context of monolingual corpora. Though there are some attempts to mine topical structure from cross-lingual corpora, they require ...
Jagadeesh Jagarlamudi, Hal Daumé III
LREC
2010
Corpora for Automatically Learning to Map Natural Language Questions into SQL Queries
Automatically translating natural language into machine-readable instructions is one of the major interesting and challenging tasks in Natural Language (NL) Processing. This problem c...
Alessandra Giordani, Alessandro Moschitti
ACL
2008
Unsupervised Translation Induction for Chinese Abbreviations using Monolingual Corpora
Chinese abbreviations are widely used in modern Chinese texts. Compared with English abbreviations (which are mostly acronyms and truncations), the formation of Chinese abbreviati...
Zhifei Li, David Yarowsky