Data warehouses have been widely used in various capacities such as large corporations or public institutions. These systems contain large and rich datasets that are often used by ...
We propose a new decision tree algorithm, Class Confidence Proportion Decision Tree (CCPDT), which is robust and insensitive to class distribution and generates rules which are st...
Wei Liu, Sanjay Chawla, David A. Cieslak, Nitesh V...
Data mining applications analyze large collections of set data and high dimensional categorical data. Search on these data types is not restricted to the classic problems of minin...
— The genetic causes of many monogenic diseases have already been discovered. However, most common diseases are actually the result of complex nonlinear interactions between mult...
Many scalable data mining tasks rely on active learning to provide the most useful accurately labeled instances. However, what if there are multiple labeling sources (`oracles...