Many interrelated information sources on the Internet can be categorized as structured (databases) or semistructured (documents). A key challenge is to integrat...
As XML has evolved from a document markup language to a widely used format for exchange of structured and semistructured data, managing large amounts of XML data has become increa...
Michael Rys, Donald D. Chamberlin, Daniela Floresc...
In this paper we present a system for automatically integrating unstructured text into a multi-relational database using state-of-the-art statistical models for structure extracti...