Most databases contain “name constants” like course numbers, personal names, and place names that correspond to entities in the real world. Previous work in integration of het...
We present new algorithms for computing approximate quantiles of large datasets in a single pass. The approximation guarantees are explicit, and apply without regard to the value ...
Gurmeet Singh Manku, Sridhar Rajagopalan, Bruce G....
Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end...
Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopul...
Two new spatial join operations, distance join and distance semijoin, are introduced where the join output is ordered by the distance between the spatial attribute values of the join...
The Web is based on a browsing paradigm that makes it difficult to retrieve and integrate data from multiple sites. Today, the only way to achieve this integration is by building sp...