For categorical data there does not exist any similarity measure which is as straight forward and general as the numerical distance between numerical items. Due to this it is ofte...
We explore in this paper the efficient clustering of item data. Different from those of the traditional data, the features of item data are known to be of high dimensionality and...
Across a wide variety of fields, huge datasets are being collected and accumulated at a dramatical pace. The datasets addressed by individual applications are very often heteroge...
Free tree, as a special graph which is connected, undirected and acyclic, is extensively used in domains such as computational biology, pattern recognition, computer networks, XML...
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for...