Transductive LSI for Short Text Classification Problems

This paper applies transductive Latent Semantic Indexing (LSI) to text classification. In addition to relying on labeled training data, we improve classification accuracy by incorporating the set of test examples into the classification process. Rather than performing LSI's singular value decomposition (SVD) solely on the training data, we use an expanded term-by-document matrix that includes both the labeled data and any available test examples. We report the performance of LSI on data sets both with and without the test examples, and we show that tailoring the SVD to the test examples can be even more useful than adding more training data. The test set can also help counteract unrelated data that may have been included in the original corpus.
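The idea of running the SVD on the expanded matrix can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the raw term-count weighting, the number of retained dimensions `k`, and the 1-nearest-neighbor decision by cosine similarity are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def term_document_matrix(docs, vocab):
    """Raw term counts: one row per term, one column per document."""
    row = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for col, doc in enumerate(docs):
        for token in doc.lower().split():
            matrix[row[token], col] += 1.0
    return matrix


def transductive_lsi_classify(train_docs, train_labels, test_docs, k=2):
    """Label each test document using an SVD computed over train + test.

    Hypothetical helper: the weighting, k, and the 1-nearest-neighbor
    rule are illustrative assumptions, not taken from the paper.
    """
    n_train = len(train_docs)
    # Transductive step: the test documents are columns of the matrix too.
    all_docs = list(train_docs) + list(test_docs)
    vocab = sorted({tok for doc in all_docs for tok in doc.lower().split()})
    a = term_document_matrix(all_docs, vocab)

    # Thin SVD of the expanded matrix; keep the k largest singular values.
    _, s, vt = np.linalg.svd(a, full_matrices=False)
    k = min(k, len(s))
    doc_vecs = vt[:k].T * s[:k]  # one row per document in the latent space

    # Normalize rows so that dot products are cosine similarities.
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    doc_vecs = doc_vecs / np.where(norms == 0.0, 1.0, norms)

    # Each test document takes the label of its most similar training document.
    sims = doc_vecs[n_train:] @ doc_vecs[:n_train].T
    return [train_labels[i] for i in sims.argmax(axis=1)]


train_docs = ["cat dog pet", "stock market trade"]
train_labels = ["animals", "finance"]
test_docs = ["dog puppy", "market shares"]
predicted = transductive_lsi_classify(train_docs, train_labels, test_docs, k=2)
print(predicted)
```

Dropping `test_docs` from `all_docs` and instead projecting the test documents into a space built from the training columns alone would give the non-transductive baseline that the abstract compares against.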
Sarah Zelikovitz
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2004
Authors Sarah Zelikovitz