This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
Much has been documented in the literature on sentiment analysis and document summarisation. Much of this applies to long structured text in the form of documents and blog posts. W...
William Simm, Maria Angela Ferrario, Scott Songlin...
This paper studies structured data extraction from Web pages, e.g., online product description pages. Existing approaches to data extraction include wrapper induction and automatic...
Background: Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the under...
The XML Wrapper is a new feature of the federated database capabilities of DB2/UDB v8. It enables users and applications to issue SQL queries against XML data from a variety of so...