Fast Answering of XPath Query Workloads on Web Collections

Several web applications (such as those that process RSS feeds or web service messages) rely on XPath-based data manipulation tools. Web developers need to use XPath queries effectively on increasingly large web collections containing hundreds of thousands of XML documents. Even when a task deals with only a single document at a time, developers benefit from understanding the behaviour of XPath expressions across multiple documents (e.g., what will a query return when run over the thousands of hourly feeds collected during the last few months?). The highly variable structure of such web collections poses additional challenges. This paper introduces DescribeX, a framework for describing arbitrarily complex XML summaries of web collections, which enables the efficient evaluation of XPath workloads (supporting all the axes and language constructs in XPath). Experiments validate that DescribeX enables existing document-at-a-time XPath tools to scale up to...
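To make the question in the abstract concrete ("what will a query return when run over the thousands of hourly feeds?"), here is a minimal sketch of naive document-at-a-time evaluation over a small collection. It is not DescribeX and does not use summaries. The `FEEDS` collection, the file names, and the `run_query` helper are hypothetical, and Python's `xml.etree.ElementTree` supports only a limited subset of XPath, not all axes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical in-memory collection standing in for collected feeds.
# Note the variable structure: one feed has no items, one is not RSS at all.
FEEDS = {
    "feed-a.xml": "<rss><channel><item><title>A1</title></item>"
                  "<item><title>A2</title></item></channel></rss>",
    "feed-b.xml": "<rss><channel><title>No items yet</title></channel></rss>",
    "feed-c.xml": "<feed><entry><title>C1</title></entry></feed>",
}


def run_query(collection, path):
    """Evaluate one path expression on each document separately.

    Returns a Counter mapping document name to the number of matching
    elements. Every document is parsed and scanned, even those whose
    structure cannot match the query.
    """
    counts = Counter()
    for name, text in collection.items():
        root = ET.fromstring(text)
        counts[name] = len(root.findall(path))
    return counts


counts = run_query(FEEDS, ".//item/title")
for name in sorted(counts):
    print(name, counts[name])
```

Every document is parsed and scanned here, including those that cannot match. That per-document cost is what grows with the size of the collection.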
Mariano P. Consens, Flavio Rizzolo
Added 09 Jun 2010
Updated 09 Jun 2010
Type Conference
Year 2007
Where XSYM