Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous links) around the body of an article that distracts a user from actual content. Extraction o...
Suhit Gupta, Gail E. Kaiser, David Neistadt, Peter...
Motivated by structural properties of the Web graph that support efficient data structures for in-memory adjacency queries, we study the extent to which a large network can be com...
Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, ...
It is well known that anchor text plays a critical role in a variety of search tasks performed over hypertextual domains, including enterprise search, wiki search, and web search....
Donald Metzler, Jasmine Novak, Hang Cui, Srihari R...
Focused web crawlers have recently emerged as an alternative to the well-established web search engines. While the well-known focused crawlers retrieve relevant webpages, there ar...
Martin Ester, Hans-Peter Kriegel, Matthias Schuber...
The Internet and the World Wide Web provide a global virtual marketplace. However, there is little information about the behavior of e-commerce users worldwide. The goal of the pa...
Virgilio Almeida, Wagner Meira Jr., Victor F. Ribe...