The number of documents published via the WWW in the form of SGML/HTML has been growing rapidly for years. Efficient, declarative access mechanisms for this type of documents...
Text documents often embed data that is structured in nature. This structured data is increasingly exposed using information extraction systems, which generate structured relation...
Twig pattern matching (TPM) is the core operation of XML query processing. Existing approaches rely on either efficient data structures or novel labeling/indexing schemes to reduce...
Text search engines return a set of k documents ranked by similarity to a query. Typically, documents and queries are drawn from natural language text, which can readily be partiti...
J. Shane Culpepper, Gonzalo Navarro, Simon J. Pugl...
Text documents often embed data that is structured in nature, and we can expose this structured data using information extraction technology. By processing a text database with inf...