The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource d...
Soumen Chakrabarti, Martin van den Berg, Byron Dom
The inherent lack of control over the Internet content resulted in proliferation of online material that can be potentially detrimental. For example, the infamous “Anarchist Coo...
It has become a promising direction to measure similarity of Web search queries by mining the increasing amount of clickthrough data logged by Web search engines, which record the...
Qiankun Zhao, Steven C. H. Hoi, Tie-Yan Liu, Soura...
Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic meas...
Ana Gabriela Maguitman, Filippo Menczer, Heather R...
XML is an emerging standard for data representation and exchange on the World-Wide Web. Due to the nature of information on the Web and the inherent flexibility of XML, we expect...