Indexes for large collections are often divided into shards that are distributed across multiple computers and searched in parallel to provide rapid interactive search. Typically,...
In developing High-Performance Computing (HPC) software, time to solution is an important metric. This metric is comprised of two main components: the human effort required develo...
With the proliferation of the RDF data format, engines for RDF query processing are faced with very large graphs that contain hundreds of millions of RDF triples. This paper addre...
Finding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. A number of algorithms have been proposed to process ...
In this paper we focus on the following problem in information management: given a large collection of recorded information and some knowledge of the process that is generating th...