In this paper we present our technique for finding semantically similar clusters within web documents obtained from a set of queries retrieved from the Google search engine. This ...
In this paper we analyze the Web coverage of three search engines, Google, Yahoo and MSN. We conducted a 15 month study collecting 15,770 Web content or information pages linked f...
Yang Sok Kim, Byeong Ho Kang, Paul Compton, Hirosh...
Web spam is a widely-recognized threat to the quality and security of the Web. Web spam pages pollute search engine indexes, burden Web crawlers and Web mining services, and expos...
We study the problem of efficiently computing diverse query results in online shopping applications, where users specify queries through a form interface that allows a mix of stru...
Erik Vee, Utkarsh Srivastava, Jayavel Shanmugasund...
How to rank Web resources is critical to Web Resource Discovery (Search Engine). This paper not only points out the weakness of current approaches, but also presents in-depth anal...