A Web repository is a large special-purpose collection of Web pages and associated indexes. Many useful queries and computations over such repositories involve traversal and navig...
Extractors and taggers turn unstructured text into entityrelation (ER) graphs where nodes are entities (email, paper, person, conference, company) and edges are relations (wrote, ...
XKeyword provides efficient keyword proximity queries on large XML graph databases. A query is simply a list of keywords and does not require any schema or query language knowledg...
The All Nearest Neighbor (ANN) operation is a commonly used primitive for analyzing large multi-dimensional datasets. Since computing ANN is very expensive, in previous works R*-t...
Inverted indexes using sequences of characters (n-grams) as terms provide an error-resilient and language-independent way to query for arbitrary substrings and perform approximate...