In this paper, we propose a text representation model, Tensor Space Model (TSM), which models the text by multilinear algebraic high-order tensor instead of the traditional vector...
Ning Liu, Benyu Zhang, Jun Yan, Zheng Chen, Wenyin...
Abstract. This paper proposes the use of Latent Semantic Indexing (LSI) techniques, decomposed with semi-discrete matrix decomposition (SDD) method, for text categorization. The SD...
This paper presents a method for measuring the semantic similarity of texts, using corpus-based and knowledge-based measures of similarity. Previous work on this problem has focus...
Knowledge of relationships among categories is of the interest in different domains such as text classification, content analysis, and text mining. We propose and evaluate approac...
In multi-label text categorization, determining the final set of classes that will label a given document is not trivial. It implies first to determine whether a class is suitable ...