In this paper we propose a novel late join algorithm for distributed applications with a fully replicated architecture (e.g. shared whiteboards). The term `late join algorithm'...
A join-index is a data structure used for processing join queries in databases. Join-indices use precomputation techniques to speed up online query processing and are useful for da...
Data cleaning based on similarities involves identification of "close" tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the do...
Similarity joins have been studied as key operations in multiple application domains, e.g., record linkage, data cleaning, multimedia and video applications, and phenomena detectio...
Web data integration is an important preprocessing step for web mining. It is highly likely that several records on the web whose textual representations differ may represent the ...