Sciweavers

ICDE
2012
IEEE
11 years 6 months ago
Fuzzy Joins Using MapReduce
Fuzzy/similarity joins have been widely studied in the research community and extensively used in real-world applications. This paper proposes and evaluates several algorithms f...
Foto N. Afrati, Anish Das Sarma, David Menestrina,...
APWEB
2010
Springer
13 years 7 months ago
An Incremental Prefix Filtering Approach for the All Pairs Similarity Search Problem
Given a set of records, a threshold value t and a similarity function, we investigate the problem of finding all pairs of records such that similarity between each pair is above t....
Hoang Thanh Lam, Dinh Viet Dung, Raffaele Perego, ...
GCB
2004
Springer
13 years 7 months ago
PoSSuMsearch: Fast and Sensitive Matching of Position Specific Scoring Matrices using Enhanced Suffix Arrays
In biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs. In this paper, we present a new nonheuristic algorithm, ...
Michael Beckstette, Dirk Strothmann, Robert Homann...
WCRE
2008
IEEE
13 years 10 months ago
PREREQIR: Recovering Pre-Requirements via Cluster Analysis
High-level software artifacts, such as requirements, domain-specific requirements, and so on, are an important source of information that is often neglected during the reverse- an...
Jane Huffman Hayes, Giuliano Antoniol, Yann-Gaë...
GIS
2007
ACM
14 years 4 months ago
TS2-tree - an efficient similarity based organization for trajectory data
The increasingly popular GPS technology and the growing amount of trajectory data it generates create the need for developing applications that efficiently store and query traject...
Petko Bakalov, Eamonn J. Keogh, Vassilis J. Tsotra...
ICDE
2009
IEEE
14 years 5 months ago
Top-k Set Similarity Joins
Similarity join is a useful primitive operation underlying many applications, such as near duplicate Web page detection, data integration, and pattern recognition. Tradi...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Haichuan Sh...