
SIGIR
2008
ACM

Knowledge transformation from word space to document space

In most IR clustering problems, we directly cluster the documents, working in the document space, using cosine similarity between documents as the similarity measure. In many real-world applications, however, we have knowledge on the word side and wish to transform this knowledge to the document (concept) side. In this paper, we provide a mechanism for this knowledge transformation. To the best of our knowledge, this is the first model for this type of knowledge transformation. This model uses a nonnegative matrix factorization model X = FSG^T, where X is the word-document semantic matrix, F is the posterior probability of a word belonging to a word cluster and represents knowledge in the word space, G is the posterior probability of a document belonging to a document cluster and represents knowledge in the document space, and S is a scaled matrix factor which provides a condensed view of X. We show how knowledge on words can improve document clustering, i.e., knowledge in the w...
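To make the factorization X = FSG^T concrete, the following is a minimal sketch of a generic nonnegative matrix tri-factorization fitted with standard multiplicative updates for the squared Frobenius error. It is not the paper's algorithm: the abstract is truncated before the method is described, so the way prior word knowledge enters the model is not shown here. The function names (`tri_factorize`, `reconstruction_error`), the iteration count, and the random initialization are all assumptions made for illustration, and the rows of F and G are left unnormalized rather than scaled to posterior probabilities.

```python
import numpy as np


def tri_factorize(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative X (words x documents) as F @ S @ G.T.

    F: words x k      (word-cluster memberships)
    S: k x l          (condensed view of X)
    G: documents x l  (document-cluster memberships)

    Generic multiplicative updates; a sketch, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    n_words, n_docs = X.shape
    # Strictly positive random start so the updates never divide 0 by 0.
    F = rng.random((n_words, k)) + eps
    S = rng.random((k, l)) + eps
    G = rng.random((n_docs, l)) + eps

    for _ in range(n_iter):
        # Each factor is scaled by (negative gradient part) / (positive part),
        # which keeps every entry nonnegative.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G


def reconstruction_error(X, F, S, G):
    """Frobenius norm of the residual X - F S G^T."""
    return float(np.linalg.norm(X - F @ S @ G.T))
```

As a usage note, after calling `F, S, G = tri_factorize(X, k, l)` on a word-document count matrix, `G.argmax(axis=1)` gives a hard document-cluster assignment and `F.argmax(axis=1)` a hard word-cluster assignment. One hypothetical way to inject word-side knowledge would be to initialize or constrain F from known word categories, but whether the paper does this is not stated in the visible part of the abstract.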
Tao Li, Chris H. Q. Ding, Yi Zhang 0005, Bo Shao
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where SIGIR
Authors Tao Li, Chris H. Q. Ding, Yi Zhang 0005, Bo Shao