Sciweavers

» Large Linguistically-Processed Web Corpora for Multiple Lang...
KCAP
2005
ACM
13 years 11 months ago
Collecting paraphrase corpora from volunteer contributors
Extensive and deep paraphrase corpora are important for a variety of natural language processing and user interaction tasks. In this paper, we present an approach which i) collect...
Timothy Chklovski
ACL
2004
13 years 7 months ago
Multi-Engine Machine Translation with Voted Language Model
The paper describes a particular approach to multiengine machine translation (MEMT), where we make use of voted language models to selectively combine translation outputs from mul...
Tadashi Nomoto
LREC
2008
13 years 7 months ago
Subdomain Sensitive Statistical Parsing using Raw Corpora
Modern statistical parsers are trained on large annotated corpora (treebanks). These treebanks usually consist of sentences addressing different subdomains (e.g. sports, politics,...
Barbara Plank, Khalil Sima'an
EMNLP
2009
13 years 3 months ago
Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora
A significant portion of the world's text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages h...
Daniel Ramage, David Hall, Ramesh Nallapati, Chris...
AIME
2003
Springer
13 years 11 months ago
Learning Derived Words from Medical Corpora
Morphological knowledge (inflection, derivation, compounds) is useful for medical language processing. Some is available for medical English in the UMLS Specialist Lexic...
Pierre Zweigenbaum, Natalia Grabar