When humans approach the task of text categorization, they interpret the specific wording of the document in the much larger context of their background knowledge and experience. ...
This paper describes how to automatically cross-reference documents with Wikipedia: the largest knowledge base ever known. It explains how machine learning can be used to identify...
Representing documents by vectors that are independent of language enhances machine translation and multilingual text categorization. We use discriminative training to create a pr...
In recent years several models have been proposed for text categorization. Within this, one of the widely applied models is the vector space model (VSM), where independence betwee...
Wikipedia has been applied as a background knowledge base to various text mining problems, but very few attempts have been made to utilize it for document clustering. In this pape...
Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...