Sciweavers

Search results for "An information-theoretic measure for document similarity" (298 results, page 5 of 60)
ACL
2010
14 years 7 months ago
Evaluating Machine Translations Using mNCD
This paper introduces mNCD, a method for automatic evaluation of machine translations. The measure is based on normalized compression distance (NCD), a general information theoret...
Marcus Dobrinkat, Tero Tapiovaara, Jaakko Väy...
ICDM
2002
IEEE
15 years 2 months ago
Phrase-based Document Similarity Based on an Index Graph Model
Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...
Khaled M. Hammouda, Mohamed S. Kamel
AUSDM
2008
Springer
14 years 11 months ago
Combining Structure and Content Similarities for XML Document Clustering
This paper proposes a clustering approach that explores both the content and the structure of XML documents for determining similarity among them. Assuming that the content and th...
Tien Tran, Richi Nayak, Peter Bruza
ICPR
2002
IEEE
15 years 10 months ago
The Performance Analysis of a Chi-square Similarity Measure for Topic Related Clustering of Noisy Transcripts
The goal of the paper is to present a novel Chi-square similarity measure and assess its performance through comparison with well-known similarity measures such as Cosine, Dice, a...
Oktay Ibrahimov, Ishwar K. Sethi, Nevenka Dimitrov...
AINA
2007
IEEE
15 years 3 months ago
Using Web Directories for Similarity Measurement in Personal Name Disambiguation
In this paper, we target the problem of personal name disambiguation in search results returned by personal name queries. Usually, a personal name refers to several people. The...
Quang Minh Vu, Tomonari Masada, Atsuhiro Takasu, J...