In Web 2.0, users have generated and shared massive amounts of resources in various media formats, such as news, blogs, audios, photos and videos. The abundance and diversity of t...
Chen Liu, Beng Chin Ooi, Anthony K. H. Tung, Dongx...
Abstract. Analysis of aggregate and individual Web requests shows that PageRank is a poor predictor of traffic. We use empirical data to characterize properties of Web traffic not ...
A great deal of information on the Web is represented in both textual and structured form. The structured form is machinereadable and can be used to augment the textual data. We c...
Automatic categorization of user queries is an important component of general purpose (Web) search engines, particularly for triggering rich, query-specific content and sponsored ...
In order to artificially boost the rank of commercial pages in search engine results, search engine optimizers pay for links to these pages on other websites. Identifying paid lin...