We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table– a data structure that better lends itself to high-level...
Information Extraction (IE) from text /web documents has become an important application area of AI. As the number of web sites and documents has grown dramatically, the users need...
A representation of the World Wide Web as a directed graph, with vertices representing web pages and edges representing hypertext links, underpins the algorithms used by web search...
This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
This paper describes a system combining model-based and learning-based methods for automated reasoning in large theories, i.e. on a large number of problems that use many axioms, l...