Sciweavers

10 search results - page 1 / 2
» A densitometric analysis of web template content
Sort
View
WWW
2009
ACM
14 years 5 months ago
A densitometric analysis of web template content
What makes template content in the Web so special that we need to remove it? In this paper I present a large-scale aggregate analysis of textual Web content, corroborating statist...
Christian Kohlschütter
DOCENG
2010
ACM
13 years 3 months ago
From templates to schemas: bridging the gap between free editing and safe data processing
In this paper we present tools that provide an easy way to edit XML content directly on the web, with the usual benefit of valid XML content. These tools make it possible to crea...
Vincent Quint, Cécile Roisin, Stépha...
WWW
2005
ACM
14 years 5 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
SAC
2006
ACM
13 years 10 months ago
Template detection for large scale search engines
Templates in web sites hurt search engine retrieval performance, especially in content relevance and link analysis. Current template removal methods suffer from processing speed ...
Liang Chen, Shaozhi Ye, Xing Li
DOCENG
2009
ACM
13 years 11 months ago
Web document text and images extraction using DOM analysis and natural language processing
: © Web Document Text and Images Extraction using DOM Analysis and Natural Language Processing Parag Mulendra Joshi, Sam Liu HP Laboratories HPL-2009-187 Web page text extraction,...
Parag Mulendra Joshi, Sam Liu