In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
Nowadays, personalized education is a very hot topic in technology-enhanced learning (TEL) research. To support students during their learning process, the first step consists in...
Valentin Butoianu, Philippe Vidal, Katrien Verbert...
While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
As more information becomes available electronically, tools for finding information of interest to users become increasingly important. The goal of the research described here is...
Eric Bloedorn, Inderjeet Mani, T. Richard MacMilla...
Because of name variations, an author may have multiple names and multiple authors may share the same name. Such name ambiguity affects the performance of document retrieval, web ...