Sciweavers

» Modeling word burstiness using the Dirichlet distribution
ACL
2003
15 years 1 month ago
Unsupervised Segmentation of Words Using Prior Distributions of Morph Length and Frequency
We present a language-independent and unsupervised algorithm for the segmentation of words into morphs. The algorithm is based on a new generative probabilistic model, which makes...
Mathias Creutz
ECIR
2009
Springer
15 years 9 months ago
A Topic-Based Measure of Resource Description Quality for Distributed Information Retrieval
The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
Mark Baillie, Mark James Carman, Fabio Crestani
NLE
2010
14 years 10 months ago
Automatic discovery of word semantic relations using paraphrase alignment and distributional lexical semantics analysis
Thesauri that list the most salient semantic relations between words have mostly been compiled manually. Therefore, the inclusion of an entry depends on the subjective decision o...
Gaël Dias, Rumen Moraliyski, João Cord...
LREC
2008
15 years 1 month ago
Using a Probabilistic Model of Context to Detect Word Obfuscation
This paper proposes a distributional model of word use and word meaning which is derived purely from a body of text, and then applies this model to determine whether certain words...
Sanaz Jabbari, Ben Allison, Louise Guthrie
ACL
2008
15 years 1 month ago
Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation
In statistical language modeling, one technique to reduce the problematic effects of data sparsity is to partition the vocabulary into equivalence classes. In this paper we invest...
Jakob Uszkoreit, Thorsten Brants