Sciweavers

10 search results - page 1 / 2
» A densitometric analysis of web template content
Sort
View
65
Voted
WWW
2009
ACM
15 years 10 months ago
A densitometric analysis of web template content
What makes template content in the Web so special that we need to remove it? In this paper I present a large-scale aggregate analysis of textual Web content, corroborating statist...
Christian Kohlschütter
DOCENG
2010
ACM
14 years 8 months ago
From templates to schemas: bridging the gap between free editing and safe data processing
In this paper we present tools that provide an easy way to edit XML content directly on the web, with the usual benefit of valid XML content. These tools make it possible to crea...
Vincent Quint, Cécile Roisin, Stépha...
WWW
2005
ACM
15 years 10 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
SAC
2006
ACM
15 years 3 months ago
Template detection for large scale search engines
Templates in web sites hurt search engine retrieval performance, especially in content relevance and link analysis. Current template removal methods suffer from processing speed ...
Liang Chen, Shaozhi Ye, Xing Li
DOCENG
2009
ACM
15 years 3 months ago
Web document text and images extraction using DOM analysis and natural language processing
: © Web Document Text and Images Extraction using DOM Analysis and Natural Language Processing Parag Mulendra Joshi, Sam Liu HP Laboratories HPL-2009-187 Web page text extraction,...
Parag Mulendra Joshi, Sam Liu