Using Document Dimensions for Enhanced Information Retrieval

Conventional document search techniques are constrained by attempting to match individual keywords or phrases to source documents. As a result, they miss documents that contain semantically similar terms and achieve relatively low recall. At the same time, processing capabilities and tools for syntactic and semantic analysis of language have advanced to the point where index-time linguistic analysis of source documents is both feasible and realistic. In this paper, we introduce document dimensions, a means of classifying or grouping terms discovered in documents. Using an enhanced version of Jakarta Lucene [1], we demonstrate that supplementing keyword analysis with some syntactic and semantic information can enhance the quality of information retrieval results.
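As a rough illustration of the recall problem the abstract describes, the following minimal sketch indexes each document under its raw terms and also under a group label for those terms. It is not the authors' implementation and does not use Lucene. The `DIMENSIONS` lexicon, its labels, and the function names are hypothetical, and a real system would presumably derive the groupings from index-time linguistic analysis rather than a hand-written table.

```python
from collections import defaultdict

# Hypothetical hand-written lexicon mapping terms to a dimension label.
# Assumption for illustration only: the paper derives groupings by
# linguistic analysis, which this table merely stands in for.
DIMENSIONS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "doctor": "medical_person",
    "physician": "medical_person",
}


def build_index(docs):
    """Index each document under its raw terms and under their dimensions."""
    term_index = defaultdict(set)
    dimension_index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            term_index[term].add(doc_id)
            dimension = DIMENSIONS.get(term)
            if dimension is not None:
                dimension_index[dimension].add(doc_id)
    return term_index, dimension_index


def search(query, term_index, dimension_index, use_dimensions=True):
    """Return ids of documents matching a keyword, optionally widened by dimension."""
    term = query.lower()
    hits = set(term_index.get(term, set()))
    if use_dimensions:
        dimension = DIMENSIONS.get(term)
        if dimension is not None:
            # Add every document containing any term from the same dimension.
            hits |= dimension_index.get(dimension, set())
    return hits


docs = {
    1: "the car was parked outside",
    2: "an automobile needs regular service",
    3: "the doctor arrived late",
}
term_index, dimension_index = build_index(docs)
keyword_only = search("car", term_index, dimension_index, use_dimensions=False)
with_dimensions = search("car", term_index, dimension_index)
```

In this toy corpus, the keyword-only search for "car" cannot reach the document that says "automobile", while the dimension-widened search can, which is the kind of recall gain the abstract argues for.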
Thimal Jayasooriya, Suresh Manandhar
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where AACC
Authors Thimal Jayasooriya, Suresh Manandhar