Sciweavers

» Information Space Based on HTML Structure
IJCAI
2003
Expressive Power of Tree and String Based Wrappers
There exist two types of wrappers: the string based wrapper such as the LR wrapper, and the tree based wrapper. A tree based wrapper designates extraction regions by nodes on the ...
Daisuke Ikeda, Yasuhiro Yamada, Sachio Hirokawa
SIGIR
2002
ACM
Predicting category accesses for a user in a structured information space
In a categorized information space, predicting users' information needs at the category level can facilitate personalization, caching and other topic-oriented services. This ...
Mao Chen, Andrea S. LaPaugh, Jaswinder Pal Singh
SIGMOD
2009
ACM
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha
DKE
2006
Information extraction from structured documents using k-testable tree automaton inference
Information extraction (IE) addresses the problem of extracting specific information from a collection of documents. Much of the previous work on IE from structured documents, suc...
Raymond Kosala, Hendrik Blockeel, Maurice Bruynoog...
IJCAI
1997
Toward Structured Retrieval in Semi-structured Information Spaces
A semi-structured information space consists of multiple collections of textual documents containing fielded or tagged sections. The space can be highly heterogeneous, because eac...
Scott B. Huffman, Catherine Baudin