CASCON
2008

High performance XML parsing using parallel bit stream technology

Parabix (parallel bit streams for XML) is an open-source XML parser that employs the SIMD (single-instruction multiple-data) capabilities of modern-day commodity processors to deliver dramatic performance improvements over traditional byte-at-a-time parsing technology. Byte-oriented character data is first transformed to a set of 8 parallel bit streams, each stream comprising one bit per character code unit. Character validation, transcoding and lexical item stream formation are all then carried out in parallel using bitwise logic and shifting operations. Byte-at-a-time scanning loops in the parser are replaced by bit scan loops that can advance by as many as 64 positions with a single instruction. A performance study comparing Parabix with the open-source Expat and Xerces parsers is carried out using the PAPI toolkit. Total CPU cycle counts, level 2 data cache misses and branch mispredictions are measured and compared for each parser. The performance of Parabix is further studied wit...
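The core idea in the abstract (transposing bytes into 8 parallel bit streams, deriving lexical item streams with bitwise logic, then advancing with bit scans) can be illustrated with a small sketch. This is not the Parabix implementation: it uses Python integers as stand-ins for SIMD registers, and the function names `transpose`, `match_byte`, and `scan_positions` are hypothetical names chosen for this illustration. The transposition loop here is itself byte-at-a-time, purely for readability; a real parser would presumably do this step with SIMD instructions.

```python
def transpose(data: bytes) -> list[int]:
    """Return 8 integers; bit i of streams[k] is bit k of data[i]."""
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def match_byte(streams: list[int], value: int, length: int) -> int:
    """Bit stream with a 1 at every position whose byte equals `value`.

    Computed only with AND/NOT over the 8 bit streams, so all positions
    are tested at once rather than one character at a time.
    """
    mask = (1 << length) - 1          # limit results to the input length
    result = mask
    for k in range(8):
        if (value >> k) & 1:
            result &= streams[k]
        else:
            result &= ~streams[k] & mask
    return result


def scan_positions(marker: int) -> list[int]:
    """List the positions of set bits, jumping directly to each one."""
    positions = []
    while marker:
        lowest = marker & -marker     # isolate the lowest set bit
        positions.append(lowest.bit_length() - 1)
        marker ^= lowest              # clear it and continue scanning
    return positions


text = b"<a><b/></a>"
streams = transpose(text)
lt = match_byte(streams, ord("<"), len(text))
print(scan_positions(lt))  # [0, 3, 7]
```

Each step of `scan_positions` skips straight to the next marked position regardless of how many unmarked characters lie in between, which is the role the abstract assigns to bit scan instructions on hardware words of up to 64 bits.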
Robert D. Cameron, Kenneth S. Herdy, Dan Lin
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where CASCON
Authors Robert D. Cameron, Kenneth S. Herdy, Dan Lin