Many real-world datasets can be clustered along multiple dimensions. For example, text documents can be clustered not only by topic, but also by the author's gender or sentim...
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for...
Frequent disjunctive pattern mining is known to be a sophisticated method for text mining in a single document that satisfies anti-monotonicity, by which we can discuss efficient algorith...
In this paper, we focus on classifying documents according to the opinions and value judgments they contain. The main originality of our approach is to combine linguistic pre-processing,...
Web applications increasingly use search techniques that rely heavily on content-based text and image analysis. For example, for parental site filtering, it is necessary to id...