Sciweavers

563 search results - page 11 / 113
» Crawling the web for structured documents
IV
2007
IEEE
105 views, Visualization
15 years 8 months ago
Mapping a Local Web Domain
In this study, we crawled a local Web domain, created its graph representation, and analyzed the network structure. The results of network analysis revealed local scale-free patter...
Weimao Ke, Tiago Simas
SIGMOD
2000
ACM
85 views, Database
15 years 6 months ago
Finding Replicated Web Collections
Many web documents (such as JAVA FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times....
Junghoo Cho, Narayanan Shivakumar, Hector Garcia-M...
WWW
2009
ACM
16 years 2 months ago
The web of nations
In this paper, we report on a large-scale study of structural differences among the national webs. The study is based on a web-scale crawl conducted in the summer of 2008. More specif...
Sukwon Chung, Dungjit Shiowattana, Pavel Dmitriev,...
WWW
2008
ACM
16 years 2 months ago
iRobot: an intelligent crawler for web forums
In this paper, we study the Web forum crawling problem, which is a fundamental step in many Web applications, such as search engines and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...
SPIRE
1999
Springer
15 years 6 months ago
CoBWeb - A Crawler for the Brazilian Web
One of the key components of current Web search engines is the document collector. This paper describes CoBWeb, an automatic document collector, whose architecture is distributed ...
Altigran Soares da Silva, Eveline A. Veloso, Paulo...