Most existing algorithms for clinical risk stratification rely on labeled training data. Collecting this data is challenging for clinical conditions where only a small percentage ...
Microarray datasets typically contain a large number of columns but a small number of rows. Association rules have proved useful in analyzing such datasets. However, most e...
Gao Cong, Anthony K. H. Tung, Xin Xu, Feng Pan, Ji...
Although text categorization is a burgeoning area of IR research, readily available test collections in this field are surprisingly scarce. We describe a methodology and system (...
We propose a new method to partition an unlabeled dataset, called Discriminative Context Partitioning (DCP). It is motivated by the idea of splitting the dataset based only on how...
Previous research has shown that faceted browsing is effective and enjoyable in searching and browsing large collections of data. In this work, we explore the efficacy of interact...
Bongshin Lee, Greg Smith, George G. Robertson, Mar...