Sciweavers

» Crawling the web for structured documents
WIKIS
2006
ACM
15 years 7 months ago
SweetWiki: semantic web enabled technologies in Wiki
Wikis are social web sites enabling a potentially large number of participants to modify any page or create a new page using their web browser. As they grow, wikis may suffer from...
Michel Buffa, Fabien Gandon
ICWE
2005
Springer
15 years 7 months ago
Identifying Websites with Flow Simulation
We present in this paper a method to discover the set of webpages contained in a logical website, based on the link structure of the Web graph. Such a method is useful in the conte...
Pierre Senellart
WWW
2003
ACM
16 years 2 months ago
Dynamic maintenance of web indexes using landmarks
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
WWW
2002
ACM
16 years 2 months ago
Using web structure for classifying and describing web pages
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...
CLEF
2005
Springer
15 years 7 months ago
EuroGOV: Engineering a Multilingual Web Corpus
EuroGOV is a multilingual web corpus that was created to serve as the document collection for WebCLEF, the CLEF 2005 web retrieval task. EuroGOV is a collection of web pages crawl...
Börkur Sigurbjörnsson, Jaap Kamps, Maart...