Sciweavers

579 search results - page 19 / 116
» Modeling word burstiness using the Dirichlet distribution
TSD
2004
Springer
15 years 5 months ago
How Dominant Is the Commonest Sense of a Word?
We present a mathematical model of word sense frequency distributions, and use word distributions to set parameters. The model implies that the expected dominance of the commonest ...
Adam Kilgarriff
SIGIR
2004
ACM
15 years 5 months ago
A nonparametric hierarchical bayesian framework for information filtering
Information filtering has made considerable progress in recent years. The predominant approaches are content-based methods and collaborative methods. Researchers have largely conc...
Kai Yu, Volker Tresp, Shipeng Yu
AIRS
2004
Springer
15 years 5 months ago
Automatic Word Clustering for Text Categorization Using Global Information
This paper presents a cluster-based text categorization system which uses class distributional clustering of words. We propose a new clustering model which considers the global in...
Wenliang Chen, Xingzhi Chang, Huizhen Wang, Jingbo...
IDEAS
2006
IEEE
15 years 5 months ago
PBIRCH: A Scalable Parallel Clustering algorithm for Incremental Data
We present a parallel version of BIRCH with the objective of enhancing the scalability without compromising on the quality of clustering. The incoming data is distributed in a cyc...
Ashwani Garg, Ashish Mangla, Neelima Gupta, Vasudh...
DEXAW
2010
IEEE
15 years 25 days ago
Identifying Sentence-Level Semantic Content Units with Topic Models
Statistical approaches to document content modeling typically focus either on broad topics or on discourse-level subtopics of a text. We present an analysis of the perform...
Leonhard Hennig, Thomas Strecker, Sascha Narr, Ern...