Sciweavers

» Measuring similarity of semi-structured documents with conte...
ICTIR
2009
Springer
15 years 4 months ago
Robust Word Similarity Estimation Using Perturbation Kernels
We introduce perturbation kernels, a new class of similarity measure for information retrieval that casts word similarity in terms of multi-task learning. Perturbation kernels mode...
Kevyn Collins-Thompson
WWW
2008
ACM
14 years 9 months ago
A Novelty-based Clustering Method for On-line Documents
In this paper, we describe a document clustering method called novelty-based document clustering. This method clusters documents based on similarity and novelty. The method assigns...
Sophoin Khy, Yoshiharu Ishikawa, Hiroyuki Kitagawa
IDEAS
2009
IEEE
15 years 4 months ago
A cluster-based approach to XML similarity joins
A natural consequence of the widespread adoption of XML as a standard for information representation and exchange is the redundant storage of large amounts of persistent XML documen...
Leonardo Ribeiro, Theo Härder, Fernanda S. Pi...
WSDM
2010
ACM
15 years 6 months ago
Learning Similarity Metrics for Event Identification in Social Media
Social media sites (e.g., Flickr, YouTube, and Facebook) are a popular distribution outlet for users looking to share their experiences and interests on the Web. These sites host ...
Hila Becker, Mor Naaman, Luis Gravano
ICDE
2008
IEEE
15 years 11 months ago
Fast Indexes and Algorithms for Set Similarity Selection Queries
Data collections often have inconsistencies that arise due to a variety of reasons, and it is desirable to be able to identify and resolve them efficiently. Set similarity queries ...
Marios Hadjieleftheriou, Amit Chandel, Nick Koudas...