Sciweavers

Corpora and data preparation
ACL
2004
14 years 11 months ago
Finding Predominant Word Senses in Untagged Text
In word sense disambiguation (WSD), the heuristic of choosing the most common sense is extremely powerful because the distribution of the senses of a word is often skewed. The pro...
Diana McCarthy, Rob Koeling, Julie Weeds, John A. ...
MT
2002
14 years 9 months ago
Translation with Scarce Bilingual Resources
Machine translation of human languages is a field almost as old as computers themselves. Recent approaches to this challenging problem aim at learning translation knowledge automat...
Yaser Al-Onaizan, Ulrich Germann, Ulf Hermjakob, K...
ACL
2011
14 years 1 months ago
Rare Word Translation Extraction from Aligned Comparable Documents
We present a first known result of high-precision rare word bilingual extraction from comparable corpora, using aligned comparable documents and supervised classification. We in...
Emmanuel Prochasson, Pascale Fung
ACL
2011
14 years 1 months ago
An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment
We propose a language-independent method for the automatic extraction of transliteration pairs from parallel corpora. In contrast to previous work, our method uses no form of supe...
Hassan Sajjad, Alexander Fraser, Helmut Schmid
ACL
2008
14 years 11 months ago
Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation
In statistical language modeling, one technique to reduce the problematic effects of data sparsity is to partition the vocabulary into equivalence classes. In this paper we invest...
Jakob Uszkoreit, Thorsten Brants