Sciweavers

SPIRE
2004
Springer

Indexing Text Documents Based on Topic Identification

This work provides algorithms and heuristics to index text documents by determining important topics in the documents. To index text documents, the work provides algorithms to generate topic candidates, determine their importance, detect similar and synonym topics, and to eliminate incoherent topics. The indexing algorithm uses topic frequency to determine the importance and the existence of the topics. Repeated phrases are topic candidates. For example, since the phrase ‘index text documents’ occurs three times in this abstract, the phrase is one of the topics of this abstract. It is shown that this method is more effective than either a simple word count model or approaches based on term weighting.
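The idea that repeated phrases serve as topic candidates can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `repeated_phrases`, the fixed phrase length `n`, the `min_count` threshold, and the sample text are all assumptions made for illustration, and the steps for synonym detection and for eliminating incoherent topics are not modeled.

```python
import re
from collections import Counter


def repeated_phrases(text, n=3, min_count=2):
    """Return word n-grams that occur at least min_count times in text.

    Hypothetical helper: a plain frequency count over fixed-length
    phrases, used here only to illustrate "repeated phrases are
    topic candidates". Phrases may span sentence boundaries.
    """
    # Lowercase and keep alphanumeric tokens only.
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Slide a window of n words across the token list.
    ngrams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    counts = Counter(ngrams)
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


# Made-up sample text, not taken from the paper.
sample = (
    "We index text documents by topic. "
    "To index text documents we count phrases. "
    "The phrase index text documents repeats."
)
candidates = repeated_phrases(sample)
print(candidates)  # {'index text documents': 3}
```

A frequency threshold alone would also keep uninformative repeats such as "of the" when `n` is small, which is one reason the abstract mentions a separate step for eliminating incoherent topics.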
Manonton Butarbutar, Susan McRoy
Added 02 Jul 2010
Updated 02 Jul 2010
Type Conference
Year 2004
Where SPIRE
Authors Manonton Butarbutar, Susan McRoy