Sciweavers

298 search results - page 19 / 60
» An information-theoretic measure for document similarity
ECIR
2008
Springer
15 years 1 month ago
A Wikipedia-Based Multilingual Retrieval Model
This paper introduces CL-ESA, a new multilingual retrieval model for the analysis of cross-language similarity. The retrieval model exploits the multilingual alignment of Wikipedia...
Martin Potthast, Benno Stein, Maik Anderka
ICTIR
2009
Springer
15 years 6 months ago
A New Measure of the Cluster Hypothesis
We have found that the nearest neighbor (NN) test is an insufficient measure of the cluster hypothesis. The NN test is a local measure of the cluster hypothesis. Designer...
Mark D. Smucker, James Allan
WWW
2008
ACM
14 years 11 months ago
A Novelty-based Clustering Method for On-line Documents
In this paper, we describe a document clustering method called novelty-based document clustering. This method clusters documents based on similarity and novelty. The method assigns...
Sophoin Khy, Yoshiharu Ishikawa, Hiroyuki Kitagawa
SIGIR
1999
ACM
15 years 4 months ago
Using a Belief Revision Operator for Document Ranking in Extended Boolean Models
This paper claims that Belief Revision can be seen as a theoretical framework for document ranking in Extended Boolean Models. For a model of Information Retrieval based on propos...
David E. Losada, Alvaro Barreiro
ECIR
2008
Springer
15 years 1 month ago
Clustering Template Based Web Documents
More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. G...
Thomas Gottron