Given a huge online social network, how do we retrieve information from it through crawling? Even better, how do we improve the crawling performance by using parallel crawlers tha...
Duen Horng Chau, Shashank Pandit, Samuel Wang, Chr...
Available methodologies for developing Sematic Web applications do not fully exploit the whole potential deriving from interaction with ontological data sources. Here we introduce...
The web crawler space is often delimited into two general areas: full-web crawling and focused crawling. We present netSifter, a crawler system which integrates features from thes...
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
In bridging the digital divide, two important criteria are cost-effectiveness, and power optimization. While 802.11 is cost-effective and is being used in several installations in...