Search engines are the primary gateways of information access on the Web today. Behind the scenes, search engines crawl the Web to populate a local indexed repository of Web pages...
RANK AGGREGATION is important in many areas, ranging from web search and databases to bioinformatics. The underlying decision problem KEMENY SCORE is NP-complete even in case of fo...
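To make the KEMENY SCORE notion concrete, here is a minimal sketch, assuming the usual definition: the score of a candidate ranking is the sum of its Kendall tau distances to the input votes, and a Kemeny consensus is a ranking that minimizes that sum. The function names and the sample votes are hypothetical, and the exhaustive search is only an illustration for tiny inputs, not the method of the abstract above.

```python
from itertools import combinations, permutations


def kendall_tau_distance(r1, r2):
    """Count candidate pairs that the two rankings order differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def kemeny_score(candidate, votes):
    """Sum of Kendall tau distances from the candidate ranking to every vote."""
    return sum(kendall_tau_distance(candidate, v) for v in votes)


def brute_force_kemeny(votes):
    """Try every permutation; feasible only for a handful of candidates."""
    candidates = votes[0]
    return min(
        (kemeny_score(list(p), votes), list(p))
        for p in permutations(candidates)
    )


# Hypothetical example: three votes over three candidates.
votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
best_score, best_ranking = brute_force_kemeny(votes)
```

The search visits all m! permutations of m candidates, so it is practical only for very small m.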
A web search with double checking model is proposed to explore the web as a live corpus. Five association measures including variants of Dice, Overlap Ratio, Jaccard, and Cosine, ...
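For orientation, the four named measures can be sketched in their standard count-based forms, assuming f(X) and f(Y) are the numbers of pages matching each term and f(X, Y) the number matching both. These are the textbook definitions, not the double-checking variants the abstract refers to, and the function name and counts below are hypothetical.

```python
import math


def association_measures(fx, fy, fxy):
    """Standard Dice, Overlap, Jaccard, and Cosine scores from page counts.

    fx, fy: counts for each term alone; fxy: count for both terms together.
    Returns 0.0 for a measure whose denominator would be zero.
    """
    def safe_div(num, den):
        return num / den if den else 0.0

    return {
        "dice": safe_div(2 * fxy, fx + fy),
        "overlap": safe_div(fxy, min(fx, fy)),
        "jaccard": safe_div(fxy, fx + fy - fxy),
        "cosine": safe_div(fxy, math.sqrt(fx * fy)),
    }


# Hypothetical page counts, for illustration only.
scores = association_measures(fx=100, fy=50, fxy=25)
```

All four scores fall between 0 and 1 when f(X, Y) does not exceed either individual count, which makes them easy to compare across term pairs.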
We study methods to initialize or bias different clustering methods using prior information about the "importance" of a keyword w.r.t. the whole document collection or a...
We describe our first participation in TREC. We competed only in the Question Answering (QA) category and limited our runs to factoids. Our approach was to use our open domain QA ...
Dmitri Roussinov, Yin Ding, Jose Antonio Robles-Fl...