Many applications dealing with textual information require classification of words into semantic classes (or concepts). However, manually constructing semantic classes is a tediou...
A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...
Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...
Background: Within the emerging field of text mining and statistical natural language processing (NLP) applied to biomedical articles, a broad variety of techniques have been deve...
In this paper, we propose a text representation model, Tensor Space Model (TSM), which models the text by multilinear algebraic high-order tensor instead of the traditional vector...
Ning Liu, Benyu Zhang, Jun Yan, Zheng Chen, Wenyin...
Associative classification, which originates from numerical data mining, has been applied to deal with text data recently. Text data is firstly digitalized to database of transact...
Baoli Li, Neha Sugandh, Ernest V. Garcia, Ashwin R...