As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
Forming test collection relevance judgments from the pooled output of multiple retrieval systems has become the standard process for creating resources such as the TREC, CLEF, and...
When applied to real-world problems, the powerful optimization tool of Evolutionary Algorithms frequently turns out to be too time-consuming due to elaborate fitness calculations t...
Finding relationships between entities on the Web, e.g., the connections between different places or the commonalities of people, is a novel and challenging problem. Existing Web ...
This paper presents a hybrid concept hierarchy development technique for web returned documents retrieved by a meta-search engine. The aim of the technique is to separate the init...
Razvan Stefan Bot, Yi-fang Brook Wu, Xin Chen, Qua...