With large amounts of correlated probabilistic data being generated in a wide range of application domains including sensor networks, information extraction, event detection etc.,...
A primary challenge to large-scale data integration is creating semantic equivalences between elements from different data sources that correspond to the same real-world entity or...
Shawn R. Jeffery, Michael J. Franklin, Alon Y. Hal...
As peer-to-peer (P2P) networks become more familiar to the database community, intense interest has built up in using their scalability and resilience properties to scale database...
Approximate queries on a collection of strings are important in many applications such as record linkage, spell checking, and Web search, where inconsistencies and errors exist in...
How can we efficiently find a clustering, i.e. a concise description of the cluster structure, of a given data set which contains an unknown number of clusters of different shape ...