The problem of document classification considers categorizing or grouping of various document types. Each document can be represented as a bag of words, which has no straightforw...
Abstract. Previous researches on advanced representations for document retrieval have shown that statistical state-of-the-art models are not improved by a variety of different ling...
Abstract. A major characteristic of text document categorization problems is the extremely high dimensionality of text data. In this paper we explore the usability of the Oscillati...
We address the problem of integrating web taxonomies from different real Internet applications. Integrating web taxonomies is to transfer instances from a source to target taxonom...
Chia-Wei Wu, Richard Tzong-Han Tsai, Cheng-Wei Lee...
This paper presents a statistical model for discovering topical clusters of words in unstructured text. The model uses a hierarchical Bayesian structure and it is also able to iden...