In this chapter, we characterize problems for web applications, examine existing testing techniques that are potentially applicable to the web environment, and introduce a strateg...
Search engines provide search results based on a large repository of pages downloaded by a web crawler from several servers. To provide best results, this repository must be kept ...
The World-Wide Web provides remote access to pages using its own naming scheme (URLs), transfer protocol (HTTP), and cache algorithms. Not only does using these special-purpose me...
Contextual advertising on web pages has become very popular recently and it poses its own set of unique text mining challenges. Often advertisers wish to either target (or avoid) ...
Yi Zhang, Arun C. Surendran, John C. Platt, Mukund...
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...