Sciweavers

27 search results - page 3 / 6
» Probabilistic string similarity joins
CLEF
2010
Springer
13 years 6 months ago
Fuzzy Semantic-Based String Similarity for Extrinsic Plagiarism Detection - Lab Report for PAN at CLEF 2010
This report explains our plagiarism detection method, which uses a fuzzy semantic-based string similarity approach. The algorithm was developed through four main stages. First is pre-proce...
Salha Alzahrani, Naomie Salim
COLING
2010
13 years 9 days ago
Simple and Efficient Algorithm for Approximate Dictionary Matching
This paper presents a simple and efficient algorithm for approximate dictionary matching designed for similarity measures such as cosine, Dice, Jaccard, and overlap coefficients. ...
Naoaki Okazaki, Jun-ichi Tsujii
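For readers unfamiliar with the four similarity measures named in the abstract above, the following is a minimal sketch of how they are commonly defined over sets of character n-grams. It is not the paper's matching algorithm; the function names `ngrams` and `similarities`, the trigram size, and the `$` padding character are all assumptions made for illustration.

```python
import math


def ngrams(s, n=3):
    """Return the set of character n-grams of s, padded with '$' (an assumed marker)."""
    padded = "$" * (n - 1) + s + "$" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarities(a, b, n=3):
    """Compute cosine, Dice, Jaccard, and overlap coefficients on n-gram sets."""
    x, y = ngrams(a, n), ngrams(b, n)
    if not x or not y:
        # Avoid division by zero when a feature set is empty (e.g. n=1, empty string).
        return {"cosine": 0.0, "dice": 0.0, "jaccard": 0.0, "overlap": 0.0}
    common = len(x & y)  # number of shared n-grams
    return {
        "cosine": common / math.sqrt(len(x) * len(y)),
        "dice": 2 * common / (len(x) + len(y)),
        "jaccard": common / len(x | y),
        "overlap": common / min(len(x), len(y)),
    }


if __name__ == "__main__":
    for name, value in similarities("similarity", "similarly").items():
        print(f"{name}: {value:.3f}")
```

All four measures depend only on the sizes of the two feature sets and of their intersection, so a threshold on any of them can be restated as a minimum number of shared n-grams.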
WWW
2003
ACM
14 years 6 months ago
Text joins in an RDBMS for web data integration
The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identi...
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas...
SDM
2009
SIAM
14 years 2 months ago
Travel-Time Prediction Using Gaussian Process Regression: A Trajectory-Based Approach
This paper is concerned with the task of travel-time prediction for an arbitrary origin-destination pair on a map. Unlike most of the existing studies, which focus only on a parti...
Sei Kato, Tsuyoshi Idé
SIGMOD
2010
ACM
13 years 10 months ago
Sampling dirty data for matching attributes
We investigate the problem of creating and analyzing samples of relational databases to find relationships between string-valued attributes. Our focus is on identifying attribute...
Henning Köhler, Xiaofang Zhou, Shazia Wasim S...