The use of any modern computer system leaves unintended traces of expired data and remnants of users' past activities. In this paper, we investigate the unintended persistenc...
Patrick Stahlberg, Gerome Miklau, Brian Neil Levin...
Effective data placement strategies can enhance the performance of data-intensive applications implemented on high-end computing clusters. Such strategies can have a significant i...
Over recent years, "Internet-able" applications have been used to support domains where distributed functionality is essential. This flexibility is also pertinent in situ...
Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
One significant effort towards combining the virtues of Web search, viz. being accessible to untrained users and able to cope with vastly heterogeneous data, with those of dat...