Automatic classification of documents is an important area of research with many applications in document search, forensics, and other fields. Methods to perform classif...
Genre or style analysis can be used to improve results achieved using standard IR techniques. A genre class is a group of documents that are written in a similar style. Genre clas...
Many applications in text and speech processing require the analysis of distributions of variable-length sequences. We recently introduced a general kernel framework, rational ker...
In this paper, we study the problem of learning block classification models to estimate block functions. We distinguish general models, which are learned across multiple sites, an...
This paper introduces the four tracks in which WIM-Lab, Fudan University took part at TREC 2007. For the spam track, a multi-centre model was proposed considering the characteristi...
Jun Xu, Jing Yao, Jiaqian Zheng, Qi Sun, Junyu Niu