In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them in...
This paper suggests an alternative solution for the task of spoken document retrieval (SDR). The proposed system runs retrieval on multi-level transcriptions (word and phone) prod...
Shan Jin, Hemant Misra, Thomas Sikora, Joemon M. J...
Algorithms that enable the process of automatically mining distinct topics in document collections have become increasingly important due to their applications in many fields and ...
We propose a new method to select relevant images to the given keywords from images gathered from the Web based on the Probabilistic Latent Semantic Analysis (PLSA) model which is...
Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult....