Sciweavers

EMNLP
2007
13 years 6 months ago
Learning to Merge Word Senses
It has been widely observed that different NLP applications require different sense granularities in order to best exploit word sense distinctions, and that for many applications ...
Rion Snow, Sushant Prakash, Daniel Jurafsky, Andre...
EMNLP
2007
13 years 6 months ago
Learning Structured Models for Phone Recognition
We present a maximally streamlined approach to learning HMM-based acoustic models for automatic speech recognition. In our approach, an initial monophone HMM is iteratively refin...
Slav Petrov, Adam Pauls, Dan Klein
EMNLP
2007
13 years 6 months ago
A Topic Model for Word Sense Disambiguation
We develop latent Dirichlet allocation with WORDNET (LDAWN), an unsupervised probabilistic topic model that includes word sense as a hidden variable. We develop a probabilistic po...
Jordan L. Boyd-Graber, David M. Blei, Xiaojin Zhu
EMNLP
2007
13 years 6 months ago
Adapting the RASP System for the CoNLL07 Domain-Adaptation Task
We describe our submission to the domain adaptation track of the CoNLL07 shared task in the open class for systems using external resources. Our main finding was that it was very...
Rebecca Watson, Ted Briscoe
EMNLP
2007
13 years 6 months ago
Improving Query Spelling Correction Using Web Search Results
Traditional research on spelling correction in the natural language processing and information retrieval literature mostly relies on pre-defined lexicons to detect spelling errors. Bu...
Qing Chen, Mu Li, Ming Zhou
EMNLP
2007
13 years 6 months ago
Factored Translation Models
We present an extension of phrase-based statistical machine translation models that enables the straightforward integration of additional annotation at the word level, may it ...
Philipp Koehn, Hieu Hoang
EMNLP
2007
13 years 6 months ago
Bayesian Document Generative Model with Explicit Multiple Topics
In this paper, we propose a novel probabilistic generative model to deal with explicit multiple-topic documents: the Parametric Dirichlet Mixture Model (PDMM). PDMM is an expansion of...
Issei Sato, Hiroshi Nakagawa
EMNLP
2007
13 years 6 months ago
Unsupervised Part-of-Speech Acquisition for Resource-Scarce Languages
This paper proposes a new bootstrapping approach to unsupervised part-of-speech induction. In comparison to previous bootstrapping algorithms developed for this problem, our appro...
Sajib Dasgupta, Vincent Ng
EMNLP
2007
13 years 6 months ago
A Systematic Comparison of Training Criteria for Statistical Machine Translation
We address the problem of training the free parameters of a statistical machine translation system. We show significant improvements over a state-of-the-art minimum error rate tr...
Richard Zens, Sasa Hasan, Hermann Ney
EMNLP
2007
13 years 6 months ago
Improving Statistical Machine Translation Using Word Sense Disambiguation
We show for the first time that incorporating the predictions of a word sense disambiguation system within a typical phrase-based statistical machine translation (SMT) model cons...
Marine Carpuat, Dekai Wu