A website can regulate search engine crawler access to its content using the robots exclusion protocol, specified in its robots.txt file. The rules in the protocol enable the site...
We propose novel algorithms for organizing large image and video datasets using both the visual content and the associated sideinformation, such as time, location, authorship, and...
Web browser history detection using CSS visited styles has long been dismissed as an issue of marginal impact. However, due to recent changes in Web usage patterns, coupled with br...
Selecting and presenting content culled from multiple heterogeneous and physically distributed sources is a challenging task. The exponential growth of the web data in modern time...
As the Internet is a global network, there is a demand on accessing closely related data without browsing through di erent Web documents. A signi cant amount of these data are pre...