Sciweavers

244 search results - page 8 / 49
» From HTML documents to web tables and rules
WSDM
2012
ACM
252 views, Data Mining
13 years 7 months ago
WebSets: extracting sets of entities from the web using unsupervised information extraction
We describe an open-domain information extraction method for extracting concept-instance pairs from an HTML corpus. Most earlier approaches to this problem rely on combining cluste...
Bhavana Bharat Dalvi, William W. Cohen, Jamie Call...
IJCNLP
2005
Springer
15 years 5 months ago
Automatic Discovery of Attribute Words from Web Documents
We propose a method of acquiring attribute words for a wide range of objects from Japanese Web documents. The method is a simple unsupervised method that utilizes the statistics of...
Kosuke Tokunaga, Jun'ichi Kazama, Kentaro Torisawa
ESWS
2007
Springer
15 years 6 months ago
A Unified Approach to Retrieving Web Documents and Semantic Web Data
The Semantic Web seems to be evolving into a property-linked web of RDF data, conceptually divorced from (but physically housed in) the hyperlinked web of HTML documents. We discus...
Trivikram Immaneni, Krishnaprasad Thirunarayan
ADC
2006
Springer
139 views, Database
15 years 5 months ago
Peer-to-peer form based web information systems
The World Wide Web revolutionized the use of forms in everyday private and business life by allowing a move away from paper forms to easily accessible digital forms. Data captured...
Stijn Dekeyser, Jan Hidders, Richard Watson, Ron A...
IJCAI
2003
15 years 1 months ago
Information Extraction from Web Documents Based on Local Unranked Tree Automaton Inference
Information extraction (IE) aims at extracting specific information from a collection of documents. A lot of previous work on IE from semi-structured documents (in XML or HTML) us...
Raymond Kosala, Maurice Bruynooghe, Jan Van den Bu...