Data quality is critical for many information-intensive applications. One of the best opportunities to improve data quality is during entry. USHER provides a theoretical, data-dri...
Kuang Chen, Joseph M. Hellerstein, Tapan S. Parikh
Most existing clustering algorithms cluster highly related data objects such as Web pages and Web users separately. The interrelation among different types of data objects is eith...
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Developing countries face significant challenges in network access, making even simple network tasks unpleasant. Many standard techniques—caching and predictive prefetching— ...
With the amount of available information on the Web growing rapidly with each day, the need to automatically filter the information in order to ensure greater user effici...
Miha Grcar, Dunja Mladenic, Blaz Fortuna, Marko Gr...