We present a simple and scalable algorithm for clustering tens of millions of phrases and use the resulting clusters as features in discriminative classifiers. To demonstrate the ...
Conventional sentence compression methods employ a syntactic parser to compress a sentence without changing its meaning. However, the reference compressions made by humans do not ...
In this paper, we propose a new Bayesian model for fully unsupervised word segmentation and an efficient blocked Gibbs sampler combined with dynamic programming for inference. Our...
A parsing makes 1-best search efficient by suppressing unlikely 1-best items. Existing kbest extraction methods can efficiently search for top derivations, but only after an exhau...
Empirical studies on corpora involve making measurements of several quantities for the purpose of comparing corpora, creating language models or to make generalizations about spec...