In this study, we crawled a local Web domain, created its graph representation, and analyzed the network structure. The results of network analysis revealed local scalefree patter...
Many web documents (such as JAVA FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times....
In this paper, we report on a large-scale study of structural differences among the national webs. The study is based on a webscale crawl conducted in the summer 2008. More specif...
Sukwon Chung, Dungjit Shiowattana, Pavel Dmitriev,...
We study in this paper the Web forum crawling problem, which is a very fundamental step in many Web applications, such as search engine and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...
One of the key components of current Web search engines is the document collector. This paper describes CoBWeb, an automatic document collector, whose architecture is distributed ...
Altigran Soares da Silva, Eveline A. Veloso, Paulo...