Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
Obtaining fast and good quality approximations to data distributions is a problem of central interest to database management. A variety of popular database applications including,...
In deletion propagation, tuples from the database are deleted in order to reflect the deletion of a tuple from the view. Such an operation may result in the (often necessary) del...
Central to a data cleaning system are record matching and data repairing. Matching aims to identify tuples that refer to the same real-world object, and repairing is to make a dat...
Wenfei Fan, Jianzhong Li, Shuai Ma, Nan Tang, Weny...
Software evolution research inherently has several resource-intensive logistical constraints. Archived project artifacts, such as those found in source code repositories and bug tr...
Jennifer Bevan, E. James Whitehead Jr., Sunghun Ki...