With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near-duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang, Xuemin Lin, Jeffrey Xu ...
Data on the Web is increasingly being used for discovery and exploratory tasks. Unlike traditional fact-finding tasks that require only the typical single-query and response parad...
Electronic mail poses a number of unusual challenges for the design of information retrieval systems and test collections, including informal expression, conversational structure,...
The Internet is increasingly used as a medium for providing medical information. Nevertheless, whether the World Wide Web is favoured over other information sources depends to a l...
We present a novel language modeling approach to capturing the query reformulation behavior of Web search users. Based on a framework that categorizes eight different types of “...