Sciweavers

820 search results for "Deep web data extraction" (page 52 of 164)
BTW
2003
Springer
15 years 3 months ago
An Ontology for Domain-oriented Semantic Similarity Search on XML Data
Abstract: Query languages for XML such as XPath or XQuery support Boolean retrieval where a query result is a (possibly restructured) subset of XML elements or entire documents tha...
Anja Theobald
WWW
2010
ACM
15 years 4 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
AUSAI
2003
Springer
15 years 3 months ago
Information Extraction via Path Merging
Abstract. In this paper, we describe a new approach to information extraction that neatly integrates top-down hypothesis-driven information with bottom-up data-driven information. ...
Robert Dale, Cécile Paris, Marc Tilbrook
VLDB
2001
ACM
15 years 2 months ago
Mining Multi-Dimensional Constrained Gradients in Data Cubes
Constrained gradient analysis (similar to the “cubegrade” problem posed by Imielinski, et al. [9]) is to extract pairs of similar cell characteristics associated with big chan...
Guozhu Dong, Jiawei Han, Joyce M. W. Lam, Jian Pei...
PAKDD
2010
ACM
15 years 1 months ago
Resource-Bounded Information Extraction: Acquiring Missing Feature Values on Demand
We present a general framework for the task of extracting specific information “on demand” from a large corpus such as the Web under resource constraints. Given a database wit...
Pallika Kanani, Andrew McCallum, Shaohan Hu