We combine lexical, syntactic, and discourse features to produce a highly predictive model of human readers' judgments of text readability. This is the first study to take in...
The performance of machine translation systems varies greatly depending on the source and target languages involved. Determining the contribution of different characteristics of l...
This paper introduces a new method for identifying named-entity (NE) transliterations in bilingual corpora. Recent works have shown the advantage of discriminative approaches to t...
We present a discriminative, latent variable approach to syntactic parsing in which rules exist at multiple scales of refinement. The model is formally a latent variable CRF gramm...
We propose a new graph-based semisupervised learning (SSL) algorithm and demonstrate its application to document categorization. Each document is represented by a vertex within a ...