Duplicate URLs have brought serious troubles to the whole pipeline of a search engine, from crawling, indexing, to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
Relational search is a novel paradigm of search which focuses on the similarity between semantic relations. Given three words (A, B, C) as the query, a relational search engine ret...
Similarity search in metric spaces is a general paradigm that can be used in several application fields. It can also be effectively exploited in content-based image retrieval syst...
As a good complement to page content, anchor texts have been extensively used, and proven to be useful, in commercial search engines. However, anchor texts have been assumed to be...
Zhicheng Dou, Ruihua Song, Jian-Yun Nie, Ji-Rong W...
Information needs are rarely satisfied directly on search engine result pages. Searchers usually need to click through to search results (landing pages) and follow search trails b...