Accurately and efficiently estimating the number of distinct values for an attribute or set of attributes in a data set is critically important to many database operation...
An increasing amount of heterogeneous information about scientific research is becoming available on-line. This potentially allows users to explore the information from multiple p...
The Google search engine uses a method called PageRank, together with term-based and other ranking techniques, to order search results returned to the user. PageRank uses link ana...
The proliferation of XML as a standard for data representation and exchange in diverse, next-generation Web applications has created an emphatic need for effective XML data-integr...
Wenfei Fan, Minos N. Garofalakis, Ming Xiong, Xibe...
Data cleaning is an important process that has been at the center of research interest in recent years. An important end goal of effective data cleaning is to identify the relatio...
Sudipto Guha, Nick Koudas, Amit Marathe, Divesh Sr...