
ECIR
2004
Springer

Identification of Relevant and Novel Sentences Using Reference Corpus

In the sentence-level novelty task, the amount of information available for similarity computation is the major challenge. A shallow NLP approach extracts noun and verb features from a topic description and the given sentences, and expands these features with WordNet. Similarity is then measured by the number of matching words between the topic and sentence representations. Alternatively, a topic and a given sentence can each be treated as a query to a reference corpus and represented as weighting vectors over the document lists returned by IR systems. A rigid procedure is proposed to determine dynamic relevance thresholds. Besides the reference corpus, relevant sentences are also retrieved directly from the given sentences. The corpus-based approach with dynamic thresholds outperforms both the shallow NLP approach and the direct retrieval approach. In novelty detection, two sentences are regarded as similar if they are related to similar document lists returned by IR systems. One of similar sentences will be sel...
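The corpus-based idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy term-overlap retriever, the use of retrieval scores as vector weights, the cosine measure, and all names (`retrieve`, `cosine`, `corpus_similarity`, the three-document `corpus`) are assumptions made for illustration only. The abstract does not specify the IR system, the weighting scheme, or how the dynamic thresholds are computed, so none of those are reproduced here.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase, split on whitespace, strip simple punctuation (illustrative only).
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def retrieve(query, corpus, top_k=3):
    # Toy stand-in for an IR system (assumption): score each reference
    # document by term overlap with the query and keep the top_k hits.
    q = Counter(tokenize(query))
    scores = {}
    for doc_id, text in corpus.items():
        d = Counter(tokenize(text))
        score = sum(min(q[t], d[t]) for t in q)
        if score > 0:
            scores[doc_id] = float(score)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_k])


def cosine(u, v):
    # Cosine similarity between two sparse vectors keyed by document id.
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def corpus_similarity(text_a, text_b, corpus, top_k=3):
    # Both texts are issued as queries; their returned document lists,
    # weighted by retrieval score, are compared instead of their words.
    return cosine(retrieve(text_a, corpus, top_k), retrieve(text_b, corpus, top_k))


# Hypothetical reference corpus and inputs.
corpus = {
    "d1": "oil prices rise in global markets",
    "d2": "football team wins the final match",
    "d3": "crude oil exports and market prices",
}
topic = "oil market prices"
sentence = "crude oil prices climbed"
off_topic = "the team wins the match"

print(corpus_similarity(topic, sentence, corpus))
print(corpus_similarity(topic, off_topic, corpus))
```

In this sketch a sentence would count as relevant when its similarity to the topic exceeds some threshold, and the same comparison between two sentences could serve as the redundancy check in novelty detection. How the threshold is chosen is left open here.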
Hsin-Hsi Chen, Ming-Feng Tsai, Ming-Hung Hsu
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2004
Where ECIR