The detection of duplicate tuples, corresponding to the same real-world entity, is an important task in data integration and cleaning. While many techniques exist to identify such...
Nearest Neighbor search is an important and widely used problem in a number of application domains. In many of these domains, the dimensionality of the data representati...
Web search engines often federate many user queries to relevant structured databases. For example, a product related query might be federated to a product database containing thei...
Because of the high volume and unpredictable arrival rate, stream processing systems may not always be able to keep up with the input data streams, resulting in buffer overflow a...
Say you are looking for information about a particular person. A search engine returns many pages for that person's name, but which pages are about the person you care about, ...