A primary challenge to large-scale data integration is creating semantic equivalences between elements from different data sources that correspond to the same real-world entity or...
Shawn R. Jeffery, Michael J. Franklin, Alon Y. Hal...
Poor quality data is prevalent in databases due to a variety of reasons, including transcription errors, lack of standards for recording database fields, etc. To be able to query ...
Byung-Won On, Nick Koudas, Dongwon Lee, Divesh Sri...
Geographic Information Systems (GIS) are increasingly managing very large sets of data and hence a centralized data repository may not always provide the most scalable solution. H...
In this paper we introduce a novel architecture for data processing, based on a functional fusion between a data and a computation layer. We show how such an architecture can be le...
Radu Sion, Ramesh Natarajan, Inderpal Narang, Wen-...
Given its importance, the problem of predicting rare classes in large-scale multi-labeled data sets has attracted great attentions in the literature. However, the rare-class probl...