Sciweavers

APWEB
2006
Springer
13 years 8 months ago
Image Description Mining and Hierarchical Clustering on Data Records Using HR-Tree
Since we can hardly get semantics from the low-level features of the image, it is much more difficult to analyze the image than textual information on the Web. Traditionally, textu...
Congle Zhang, Sheng Huang, Gui-Rong Xue, Yong Yu
AUSDM
2006
Springer
13 years 8 months ago
Extraction of Flat and Nested Data Records from Web Pages
This paper studies the problem of identifying and extracting flat and nested data records from a given web page. With the explosive growth of information sources ...
Siddu P. Algur, P. S. Hiremath
AH
2008
Springer
13 years 11 months ago
Collection Browsing through Automatic Hierarchical Tagging
In order to navigate huge document collections efficiently, tagged hierarchical structures can be used. For users, it is important to correctly interpret tag combinations. In this ...
Korinna Bade, Marcel Hermkes
WWW
2007
ACM
14 years 5 months ago
U-REST: an unsupervised record extraction system
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
Yuan Kui Shen, David R. Karger
DEXAW
2008
IEEE
13 years 11 months ago
Text Extraction from the Web via Text-to-Tag Ratio
We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...
Tim Weninger, William H. Hsu