Sciweavers

89 search results - page 3 / 18
» Using visual pages analysis for optimizing web archiving
IADIS
2003
13 years 6 months ago
Data Extraction from Web Database Query Result Pages via Tagsets and Integer Sequences
The World Wide Web is a collection of databases as well as web sites. Databases associated with web sites provide public access via query forms on web pages. They constitute an en...
Jerome Robinson
WWW
2007
ACM
14 years 5 months ago
Efficient search in large textual collections with redundancy
Current web search engines focus on searching only the most recent snapshot of the web. In some cases, however, it would be desirable to search over collections that include many ...
Jiangong Zhang, Torsten Suel
WWW
2003
ACM
14 years 5 months ago
Web-R: a Tool to Record & Replay Personal Web Navigation
This poster presents a useful tool to capture the content of browsing sessions. Web-R saves systematically all the components sufficient and necessary to visualize offline the pag...
Jean-Daniel Kant, Alain Lifchitz
JCDL
2006
ACM
13 years 11 months ago
Combining DOM tree and geometric layout analysis for online medical journal article segmentation
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Jie Zou, Daniel X. Le, George R. Thoma
ERCIMDL
2008
Springer
13 years 7 months ago
Revisiting Lexical Signatures to (Re-)Discover Web Pages
A lexical signature (LS) is a small set of terms derived from a document that capture the "aboutness" of that document. An LS generated from a web page can be used to disc...
Martin Klein, Michael L. Nelson