In morphologically rich languages such as Arabic, the abundance of word forms resulting from increased morpheme combinations is significantly greater than for languages with fewer...
We present a robust parser which is trained on a treebank of ungrammatical sentences. The treebank is created automatically by modifying Penn treebank sentences so that they conta...
Jennifer Foster, Joachim Wagner, Josef van Genabit...
In this paper we propose a domainindependent text segmentation method, which consists of three components. Latent Dirichlet allocation (LDA) is employed to compute words semantic ...
We present a method for learning bilingual translation lexicons from monolingual corpora. Word types in each language are characterized by purely monolingual features, such as con...
Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatric...
We will demonstrate a novel graphical interface for correcting search errors in the output of a speech recognizer. This interface allows the user to visualize the word lattice by ...