Overall performance of the data mining process depends not just on the value of the induced knowledge but also on various costs of the process itself such as the cost of acquiring...
Background: Proteomic data obtained from mass spectrometry have attracted great interest for the detection of early-stage cancer. However, as mass spectrometry data are high-dimen...
Data quality is critical for many information-intensive applications. One of the best opportunities to improve data quality is during entry. USHER provides a theoretical, data-dri...
Kuang Chen, Joseph M. Hellerstein, Tapan S. Parikh
Background: Bioinformatics data analysis toolbox needs general-purpose, fast and easily interpretable preprocessing tools that perform data integration during exploratory data ana...
This paper is concerned with the problem of Imbalanced Classification (IC) in web mining, which often arises on the web due to the "Matthew Effect". As web IC applicatio...