Scalable join processing on very large RDF graphs

With the proliferation of the RDF data format, engines for RDF query processing face very large graphs that contain hundreds of millions of RDF triples. This paper addresses the resulting scalability problems. Recent prior work along these lines has focused on indexing and other physical-design issues. The current paper focuses on join processing, as the fine-grained and schema-relaxed use of RDF often entails star- and chain-shaped join queries with many input streams from index scans. We present two contributions for scalable join processing. First, we develop very lightweight methods for sideways information passing between separate joins at query runtime, which provide highly effective filters on the input streams of joins. Second, we improve previously proposed algorithms for join-order optimization through more accurate selectivity estimates for very large RDF graphs. Experimental studies with several RDF datasets, including the UniProt collection, demonstrate the performa...
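To make the idea of sideways information passing concrete, here is a toy sketch of a two-pattern star join in which the subjects seen by one scan are passed to the other scan as a filter. It is only an illustration of the general idea, not the authors' algorithm: the names `scan`, `star_join`, and `allowed_subjects`, the in-memory list of triples, and the use of a Python set as the filter are all assumptions made for this example.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def scan(
    triples: Iterable[Triple],
    predicate: str,
    allowed_subjects: Optional[Set[str]] = None,
) -> Iterator[Triple]:
    """Stand-in for an index scan (hypothetical helper).

    Yields triples with the given predicate. If a filter is supplied,
    triples whose subject cannot join are skipped at the scan itself.
    """
    for s, p, o in triples:
        if p != predicate:
            continue
        if allowed_subjects is not None and s not in allowed_subjects:
            continue
        yield (s, p, o)


def star_join(
    triples: List[Triple], pred_a: str, pred_b: str
) -> List[Tuple[str, str, str]]:
    """Join the patterns (?s pred_a ?x) and (?s pred_b ?y) on ?s."""
    left = list(scan(triples, pred_a))

    # Information passed "sideways": subjects observed by the first scan
    # become a filter on the input stream of the second scan.
    subject_filter = {s for s, _, _ in left}

    right_by_subject: Dict[str, List[str]] = {}
    for s, _, o in scan(triples, pred_b, allowed_subjects=subject_filter):
        right_by_subject.setdefault(s, []).append(o)

    return [
        (s, o_a, o_b)
        for s, _, o_a in left
        for o_b in right_by_subject.get(s, [])
    ]


if __name__ == "__main__":
    data = [
        ("alice", "knows", "bob"),
        ("alice", "age", "30"),
        ("carol", "age", "41"),  # pruned by the filter: carol has no "knows" triple
    ]
    print(star_join(data, "knows", "age"))
```

In this sketch the filter is an exact set built after the first scan finishes; a real engine would more plausibly exchange compact information between concurrently running scans, and that part is not modeled here.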
Thomas Neumann, Gerhard Weikum
Added: 05 Dec 2009
Updated: 05 Dec 2009
Type: Conference
Year: 2009
Authors: Thomas Neumann, Gerhard Weikum