Sciweavers

6 search results - page 1 / 2
» CETR: content extraction via tag ratios
WWW
2010
ACM
13 years 11 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
DEXAW
2008
IEEE
13 years 10 months ago
Text Extraction from the Web via Text-to-Tag Ratio
We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...
Tim Weninger, William H. Hsu
WWW
2006
ACM
14 years 4 months ago
Improved annotation of the blogosphere via autotagging and hierarchical clustering
Tags have recently become popular as a means of annotating and organizing Web pages and blog entries. Advocates of tagging argue that the use of tags produces a 'folksonomy...
Christopher H. Brooks, Nancy Montanez
GIS
2009
ACM
13 years 8 months ago
Conceptualization of place via spatial clustering and co-occurrence analysis
More and more users are contributing and sharing more and more contents on the Web via the use of content hosting sites and social media services. These user-generated contents ...
Dong-Po Deng, Tyng-Ruey Chuang, Rob Lemmens
GFKL
2007
Springer
13 years 10 months ago
Supporting Web-based Address Extraction with Unsupervised Tagging
Abstract. The manual acquisition and modeling of tourist information, e.g. addresses of points of interest, is time-intensive and therefore cost-intensive. Furthermore, the encoded inform...
Berenike Loos, Chris Biemann