XML is an established document standard for representing and exchanging data on the Internet, and is also used as a standard language for the search and reuse of scattered doc...
Eun-Young Kim, Jin-Ho Choi, Jhung-Soo Hong, Tae-Hu...
An approach to simultaneous document classification and word clustering is developed using a two-way mixture model of Poisson distributions. Each document is represented by a vect...
Keyword search is a proven, user-friendly way to query HTML documents in the World Wide Web. We propose keyword search in XML documents, modeled as labeled trees, and describe cor...
On an abstract level, XML Schema increases the limited expressive power of Document Type Definitions (DTDs) by extending them with a recursive typing mechanism. However, an invest...
Geert Jan Bex, Wim Martens, Frank Neven, Thomas Sc...
Large-volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...