Sciweavers

TMM
2002

Narrowing the semantic gap - improved text-based web document retrieval using visual features

In this paper, we present the results of our work, which seeks to narrow the gap between low-level features and high-level concepts in the domain of web document retrieval. This work concerns latent semantic indexing (LSI), a technique that has been used for textual information retrieval for many years. In this environment, LSI determines clusters of co-occurring keywords, sometimes called concepts, so that a query that uses a particular keyword can retrieve documents that do not contain this keyword but do contain other keywords from the same cluster. In this paper, we examine the use of this technique for content-based web document retrieval, using both keywords and image features to represent the documents. Two different approaches to image feature representation, namely color histograms and color anglograms, are adopted and evaluated. Experimental results show that LSI, together with both textual and visual features, is able to extract the underlying semantic structure ...
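The retrieval behaviour described in the abstract can be illustrated with a minimal sketch of LSI built on a truncated singular value decomposition. This is not the authors' implementation: the feature names, the tiny binary matrix, the two stand-in "visual" rows (`gray_bin`, `green_bin`), the choice of k = 2, and the helper names `lsi_fit`, `lsi_fold_in`, and `cosine_scores` are all assumptions made for illustration. Real color histograms or anglograms would contribute many real-valued rows rather than two binary ones.

```python
import numpy as np

def lsi_fit(A, k):
    """Truncated SVD of a feature-by-document matrix A, keeping k concepts."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of Vt[:k].T are the documents expressed in the k-dimensional concept space.
    return U[:, :k], s[:k], Vt[:k, :].T

def lsi_fold_in(q, U_k, s_k):
    """Project a query vector (same feature order as A's rows) into concept space."""
    return (q @ U_k) / s_k

def cosine_scores(q_k, docs_k, eps=1e-12):
    """Cosine similarity between the folded-in query and every document."""
    num = docs_k @ q_k
    den = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_k) + eps
    return num / den

# Hypothetical toy data: three keyword rows and two made-up visual-feature rows
# per topic. Columns are documents d0..d3.
features = ["car", "automobile", "engine", "gray_bin",
            "flower", "petal", "green_bin"]
A = np.array([
    # d0 d1 d2 d3
    [1, 0, 0, 1],   # car
    [0, 1, 0, 1],   # automobile
    [1, 1, 0, 1],   # engine
    [1, 1, 0, 1],   # gray_bin  (stand-in visual feature)
    [0, 0, 1, 0],   # flower
    [0, 0, 1, 0],   # petal
    [0, 0, 1, 0],   # green_bin (stand-in visual feature)
], dtype=float)

U_k, s_k, docs_k = lsi_fit(A, k=2)

# Query that uses only the keyword "car".
q = np.zeros(len(features))
q[features.index("car")] = 1.0

scores = cosine_scores(lsi_fold_in(q, U_k, s_k), docs_k)
print(np.round(scores, 3))
```

In this toy setup, document d1 never contains "car" but shares "engine" and the gray visual feature with documents that do, so it lands in the same concept and scores close to 1, while the flower document d2 scores close to 0. Stacking textual and visual features as rows of one matrix is one simple way to combine the two modalities; the paper's exact weighting and normalization are not given in the abstract.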
Rong Zhao, William I. Grosky
Added 23 Dec 2010
Updated 23 Dec 2010
Type Journal
Year 2002
Where TMM