Document indexing is a critical step in the information retrieval process and is often a manual task that depends heavily on the indexer’s knowledge. We propose to imp...
Encouraged by the significant improvement of the DLSI (differential latent semantic indexing) approach over the LSI (latent semantic indexing) approach in textual information retrieval ...
In this paper, we show how we can learn to select good words for a document title. We view the problem of selecting good title words for a document as a variant of an Information ...
The techniques of information retrieval and information extraction are complementary, but to date there has been little concrete work aimed at integrating the two. We describe how...
This paper suggests a novel representation for documents that is intended to improve precision. This representation is generated by combining two central techniques: Rand...