Compounded words are a challenge for NLP applications such as machine translation (MT). We introduce methods to learn splitting rules from monolingual and parallel corpora. We eva...
This paper describes enhancements made within a unification framework based on typed feature structures to support linking of lexical entries to their translation ...
Alicia Ageno, Francesc Ribas, German Rigau, Horaci...
The class of Linear Inversion Transduction Grammars (LITGs) is introduced, and used to induce a word alignment over a parallel corpus. We show that alignment via Stochastic Bracke...
We propose a simple generative, syntactic language model that conditions on overlapping windows of tree context (or treelets) in the same way that n-gram language models condition...
We introduce XMLVM, a Turing-complete XML-based programming language based on a stack-based virtual machine. We show how XMLVM can automatically be created from Java cla...