Large corpora are essential to modern methods of computational linguistics and natural language processing. In this paper, we describe an ongoing project whose aim is to build a l...
This paper compares the efficacy and efficiency of different clustering approaches for selecting a set of exemplar images to present in the context of a semantic concept. We eval...
Despite the success of web search engines, search over large enterprise intranets still suffers from poor result quality. Earlier work [6] that compared intranets and the Internet...
Digitizing ancient books, especially those related to the humanities, is practiced in many countries. The number of full-text databases in the humanities is increasing. Studies hav...
This paper proposes a design for our entry into the 2006 AAAI Scavenger Hunt Competition and Robot Exhibition. We will be entering a scalable two-agent system consisting of off-th...