The common paradigm of searching and retrieving information on the Web is based on keyword-based search using one or more search engines, and then browsing through the large numbe...
Web text has been successfully used as training data for many NLP applications. While most previous work accesses web text through search engine hit counts, we created a Web Corpu...
The yellow pages service of GTE SuperPages enables Web users to flexibly search through listings of 11 million businesses in over 17000 categories. To achieve the flexibility desired...
Steven D. Whitehead, Himanshu Sinha, Michael Murph...
The Hypertext-based Webs such as Intranets contain a vast amount of information pertaining to an enormous number of subjects. It is, however, an organically grown and thus essentia...
Just as email spam has negatively impacted the user messaging experience, the rise of Web spam is threatening to severely degrade the quality of information on the World Wide Web....