Many blog posts deal with current issues, so much attention has been paid to identifying topic trends in blogs. This paper suggests a new metric of selecting topic words. We empiri...
Il-Chul Moon, Young-Min Kim, Hyun-Jong Lee, Alice ...
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the databas...
Documents, especially long ones, may contain very diverse passages related to different topics. Passages Retrieval approaches have shown that, in most cases, there is a great pote...
Sylvain Lamprier, Tassadit Amghar, Bernard Levrat,...
A major challenge in document clustering is the extremely high dimensionality. For example, the vocabulary for a document set can easily be thousands of words. On the other hand, ...
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail we are given a set ...