Checking determinism of XML Schema content models in optimal time

We consider the determinism checking of XML Schema content models, as required by the W3C Recommendation. We argue that currently applied solutions have flaws and make processors vulnerable to exponential resource needs by pathological schemas, and we help to eliminate this potential vulnerability of XML Schema based systems. XML Schema content models are essentially regular expressions extended with numeric occurrence indicators. A previously published polynomial-time solution to check the determinism of such expressions is improved to run in linear time, and the improved algorithm is implemented and evaluated experimentally. When compared to the corresponding method of a popular production-quality XML Schema processor, the new implementation runs orders of magnitude faster. Enhancing the solution to take further extensions of XML Schema into account without compromising its linear scalability is also discussed. Key words: Regular expression, numeric occurrence indicator, one-unambi...
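To illustrate what determinism of a content model means, here is a minimal sketch of the classical position-based (Glushkov-style) check for plain regular expressions. It is not the linear-time algorithm the abstract describes: it can take quadratic time and it does not handle numeric occurrence indicators. The tuple-based expression encoding and the name `is_deterministic` are assumptions made for this example only.

```python
from itertools import count


def is_deterministic(expr):
    """Return True if `expr` is deterministic (one-unambiguous).

    Hypothetical encoding, chosen for this sketch:
      ("sym", "a"), ("cat", r1, r2), ("alt", r1, r2), ("star", r), ("opt", r)
    """
    labels = {}   # position -> symbol at that position
    follow = {}   # position -> positions that may come directly after it
    fresh = count()

    def walk(node):
        # Returns (nullable, first positions, last positions) of `node`.
        kind = node[0]
        if kind == "sym":
            p = next(fresh)
            labels[p] = node[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind in ("star", "opt"):
            _, first, last = walk(node[1])
            if kind == "star":
                # Iteration: the end of the body can loop back to its start.
                for p in last:
                    follow[p] |= first
            return True, first, last
        if kind not in ("cat", "alt"):
            raise ValueError(f"unknown node kind: {kind!r}")
        n1, f1, l1 = walk(node[1])
        n2, f2, l2 = walk(node[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        # Concatenation: the end of the left part continues into the right.
        for p in l1:
            follow[p] |= f2
        first = (f1 | f2) if n1 else f1
        last = (l1 | l2) if n2 else l2
        return n1 and n2, first, last

    def clash(positions):
        # True if two distinct positions in the set carry the same symbol.
        seen = set()
        for p in positions:
            if labels[p] in seen:
                return True
            seen.add(labels[p])
        return False

    _, first, _ = walk(expr)
    return not clash(first) and not any(clash(f) for f in follow.values())


a, b = ("sym", "a"), ("sym", "b")

# (a|b)*a : after reading "a" it is unclear which "a" position was matched.
ambiguous = ("cat", ("star", ("alt", a, b)), a)

# b*a(b*a)* : same language, but each symbol has a unique next position.
unambiguous = ("cat", ("cat", ("star", b), a),
               ("star", ("cat", ("star", b), a)))

print(is_deterministic(ambiguous))    # False
print(is_deterministic(unambiguous))  # True
```

An expression is deterministic when, at every point in a match, the next input symbol identifies a single position in the expression. The sketch tests exactly that by looking for two same-labelled positions in the initial set or in any follow set.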
Pekka Kilpeläinen
Added 14 May 2011
Updated 14 May 2011
Type Journal
Year 2011
Where IS
Authors Pekka Kilpeläinen