Most research on nearest neighbor algorithms has focused on the Euclidean case. In many practical search problems, however, the underlying metric is non-Eucl...
Schema matching is a basic problem in many database application domains, such as data integration, E-business, data warehousing, and semantic query processing. In current implementa...
Our aim is to develop new database technologies for the approximate matching of unstructured string data using indexes. We explore the potential of the suffix tree data structure i...
Estimating the cardinality (i.e., the number of distinct elements) of an arbitrary set expression defined over multiple distributed streams is one of the most fundamental queries of in...
In this paper we argue that developing information extraction (IE) programs using Datalog with embedded procedural extraction predicates is a good way to proceed. First, compared ...
Warren Shen, AnHai Doan, Jeffrey F. Naughton, Ragh...