Sciweavers

563 search results - page 68 / 113
» Crawling the web for structured documents
DASFAA
2007
IEEE
15 years 8 months ago
An Original Semantics to Keyword Queries for XML Using Structural Patterns
XML is by now the de facto standard for exporting and exchanging data on the web. The need for querying XML data sources whose structure is not fully known to the user and the need...
Dimitri Theodoratos, Xiaoying Wu
INFOCOM
2006
IEEE
15 years 8 months ago
Performance of Full Text Search in Structured and Unstructured Peer-to-Peer Systems
While structured P2P systems (such as DHTs) are often regarded as an improvement over unstructured P2P systems (such as super-peer networks) in terms of routing efficiency, it...
Yong Yang, Rocky Dunlap, Mike Rexroad, Brian F. Co...
DAS
2010
Springer
15 years 5 months ago
Analysis and taxonomy of column header categories for web tables
We describe a component of a document analysis system for constructing ontologies for domain-specific web tables imported into Excel. This component automates extraction of the Wa...
Sharad C. Seth, Ramana Chakradhar Jandhyala, Mukka...
CIKM
2005
Springer
15 years 7 months ago
Web-centric language models
We investigate language models for informational and navigational web search. Retrieval on the web is a task that differs substantially from ordinary ad hoc retrieval. We perfor...
Jaap Kamps
SIGMOD
2009
ACM
15 years 8 months ago
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha