Sciweavers

» Webbed documents
ELPUB
2008
ACM
15 years 6 months ago
The state of metadata in open access journals: possibilities and restrictions
This paper reports on an inquiry into the use of metadata, publishing formats, and markup in editor-managed open access journals. It builds on findings from a study of the document...
Helena Francke
EDBTW
2010
Springer
15 years 2 months ago
Using visual pages analysis for optimizing web archiving
Due to the growing importance of the World Wide Web, archiving it has become crucial for preserving useful sources of information. To keep a web archive up to date, crawlers ha...
Myriam Ben Saad, Stéphane Gançarski
SIGMOD
2009
ACM
15 years 11 months ago
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha
FLAIRS
2007
15 years 6 months ago
Lexicon Development and POS Tagging Using a Tagged Bengali News Corpus
Lexicon development and Part of Speech (POS) tagging are very important for almost all Natural Language Processing (NLP) application areas. The rapid development of these resources...
Asif Ekbal, Sivaji Bandyopadhyay
WEBI
2005
Springer
15 years 9 months ago
A Method of Web Search Result Clustering Based on Rough Sets
Due to the enormous size of the web and low precision of user queries, finding the right information from the web can be difficult if not impossible. One approach that tries to ...
Chi Lang Ngo, Hung Son Nguyen