Using an open-source Java toolkit of name-matching methods, we experimentally compare string distance metrics on the task of matching entity names. We investigate a number of dif...
William W. Cohen, Pradeep D. Ravikumar, Stephen E....
In this paper we deal with the problem of adding new documents to a collection when documents are represented in a lower-dimensional space by concept indexing. Concept indexing i...
This paper describes a method for asking statistical questions about a large text corpus. We exemplify the method by addressing the question, "What percentage of Federal Regi...
There have been recent applications of genetic algorithms to information retrieval, mostly with respect to relevance feedback. Nevertheless, they are yet to be evaluated in ...
Often scientists seek to search for articles on the Web related to a particular chemical. When a scientist searches for a chemical formula using a search engine today, she gets ar...
Bingjun Sun, Qingzhao Tan, Prasenjit Mitra, C. Lee...