Abstract. Solving conflicts between overlapping databases requires an understanding of the reasons that lead to the inconsistencies. Provided that conflicts do not occur randomly b...
The main challenge in integrating two hierarchies is determining the correspondence between the nodes and edges of each hierarchy. Traditionally, the correspondence is determined ...
Duplicate elimination is an important stage in integrating data from multiple sources. The challenges involved are finding a robust deduplication function that can identify when t...
Recently, TRW fielded a prototype system for a government customer. It provides a wide range of capabilities including data collection, hierarchical storage, automated distribution...
Researchers in the data mining area frequently have to spend a significant portion of their time on preprocessing the data in order to apply their algorithms to real-world datasets...
Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotr...