In recent years several models have been proposed for text categorization. Within this, one of the widely applied models is the vector space model (VSM), where independence betwee...
Topics in prior-art patent search are typically full patent applications and relevant items are patents often taken from sources in different languages. Cross language patent retr...
In this paper we present a novel strategy, DragPushing, for improving the performance of text classifiers. The strategy is generic and takes advantage of training errors to succes...
Songbo Tan, Xueqi Cheng, Moustafa Ghanem, Bin Wang...
Abstract. This paper reports our comparative evaluation of three machine learning methods on Chinese text categorization. Whereas a wide range of methods have been applied to Engli...
We investigate the problem of learning document classifiers in a multilingual setting, from collections where labels are only partially available. We address this problem in the ...