Sciweavers

298 search results - page 19 / 60
» An information-theoretic measure for document similarity
ECIR
2008
Springer
15 years 3 months ago
A Wikipedia-Based Multilingual Retrieval Model
This paper introduces CL-ESA, a new multilingual retrieval model for the analysis of cross-language similarity. The retrieval model exploits the multilingual alignment of Wikipedia...
Martin Potthast, Benno Stein, Maik Anderka
ICTIR
2009
Springer
15 years 8 months ago
A New Measure of the Cluster Hypothesis
We have found that the nearest neighbor (NN) test is an insufficient measure of the cluster hypothesis. The NN test is a local measure of the cluster hypothesis. Designer...
Mark D. Smucker, James Allan
WWW
2008
ACM
15 years 1 month ago
A Novelty-based Clustering Method for On-line Documents
In this paper, we describe a document clustering method called novelty-based document clustering. This method clusters documents based on similarity and novelty. The method assigns...
Sophoin Khy, Yoshiharu Ishikawa, Hiroyuki Kitagawa
SIGIR
1999
ACM
15 years 6 months ago
Using a Belief Revision Operator for Document Ranking in Extended Boolean Models
This paper claims that Belief Revision can be seen as a theoretical framework for document ranking in Extended Boolean Models. For a model of Information Retrieval based on propos...
David E. Losada, Alvaro Barreiro
ECIR
2008
Springer
15 years 3 months ago
Clustering Template Based Web Documents
More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. G...
Thomas Gottron