There has been recent interest in the problem of decoding letter substitution ciphers using techniques inspired by natural language processing. We consider a different type of cla...
We describe Akamon, an open source toolkit for tree and forest-based statistical machine translation (Liu et al., 2006; Mi et al., 2008; Mi and Huang, 2008). Akamon implements all...
With a few exceptions, discriminative training in statistical machine translation (SMT) has been content with tuning weights for large feature sets on small development data. Evid...
We propose a simple generative, syntactic language model that conditions on overlapping windows of tree context (or treelets) in the same way that n-gram language models condition...
In this paper, we propose a web-based bilingual concordancer, DOMCAT, for domain-specific computer assisted translation. Given a multi-word expression as a query, the system in...
The problem addressed in this paper is to segment a given multilingual document into segments for each language and then identify the language of each segment. The problem was mot...
In this paper, we present a structural learning model for joint sentiment classification and aspect analysis of text at various levels of granularity. Our model aims to identify ...
In this paper, we propose a computational approach to generate neologisms consisting of homophonic puns and metaphors based on the category of the service to be named and the prop...
The language MIX consists of all strings over the three-letter alphabet {a, b, c} that contain an equal number of occurrences of each letter. We prove Joshi's (1985) conjecture ...
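The definition of MIX given in this abstract can be illustrated with a minimal membership check. This is only a sketch of the stated definition (equal numbers of a, b, and c), not anything taken from the paper itself; the names `in_mix` and `ALPHABET` are hypothetical and introduced here for illustration.

```python
from collections import Counter

# The three-letter alphabet from the definition of MIX.
ALPHABET = ("a", "b", "c")


def in_mix(s: str) -> bool:
    """Return True if s is a string over {a, b, c} in which
    each of the three letters occurs the same number of times."""
    counts = Counter(s)
    # Reject any string containing a symbol outside the alphabet.
    if any(ch not in ALPHABET for ch in counts):
        return False
    # Counter returns 0 for missing letters, so the empty string qualifies.
    return counts["a"] == counts["b"] == counts["c"]


if __name__ == "__main__":
    for candidate in ["", "abc", "cabbac", "aab", "abcd"]:
        print(repr(candidate), in_mix(candidate))
```

Because membership depends only on letter counts, any permutation of a member string is also a member, which is what makes MIX a free word order language in the extreme.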
This paper presents the problem of fragmented and damaged cuneiform texts within Hittite and Ancient Near Eastern studies, and proposes to use well-known text classification metr...