We describe a open-domain information extraction method for extracting concept-instance pairs from an HTML corpus. Most earlier approaches to this problem rely on combining cluste...
Bhavana Bharat Dalvi, William W. Cohen, Jamie Call...
We propose a method of acquiring attribute words for a wide range of objects from Japanese Web documents. The method is a simple unsupervised method that utilizes the statistics of...
The Semantic Web seems to be evolving into a property-linked web of RDF data, conceptually divorced from (but physically housed in) the hyperlinked web of HTML documents. We discus...
The World Wide Web revolutionized the use of forms in everyday private and business life by allowing a move away from paper forms to easily accessible digital forms. Data captured...
Stijn Dekeyser, Jan Hidders, Richard Watson, Ron A...
Information extraction (IE) aims at extracting specific information from a collection of documents. A lot of previous work on 10 from semi-structured documents (in XML or HTML) us...
Raymond Kosala, Maurice Bruynooghe, Jan Van den Bu...