Typical transmission control protocol (TCP) based web agents in mobile wireless environment have several deficiencies, such as performance degradation, head-of-line blocking, and u...
Broder et al.’s [3] shingling algorithm and Charikar’s [4] random projection based approach are considered “state-of-theart” algorithms for finding near-duplicate web pag...
The Deep Web is the collection of information repositories that are not indexed by search engines. These repositories are typically accessible through web forms and contain dynami...
Several initiatives for establishing standards for metadata models are being carried out at the moment, but everyone focuses on their own requirements when defining metadata attri...
Missing web pages, URIs that return the 404 “Page Not Found” error or the HTTP response code 200 but dereference unexpected content, are ubiquitous in today’s browsing exper...
Martin Klein, Jeffery L. Shipman, Michael L. Nelso...