The detection of repeated subsequences, time series motifs, is a problem which has been shown to have great utility for several higher-level data mining algorithms, including clas...
Abstract. To make accurate recommendations, recommendation systems currently require more data about a customer than is usually available. We conjecture that the weaknesses are due...
Internet users regularly have the need to find biographies and facts of people of interest. Wikipedia has become the first stop for celebrity biographies and facts. However, Wik...
Xiaojiang Liu, Zaiqing Nie, Nenghai Yu, Ji-Rong We...
We introduce a new EM framework in which it is possible not only to optimize the model parameters but also the number of model components. A key feature of our approach is that we...
In many data mining applications, online labeling feedback is only available for examples which were predicted to belong to the positive class. Such applications include spam filt...