When is it safe to use synthetic data in supervised classification? Trainable classifier technologies require large representative training sets consisting of samples labeled with...
Abstract. This paper deals with the characterization of data complexity and its relationship to classification accuracy. We study three dimensions of data complexity: the len...
Management of power in data centers is driven by the need to not exceed circuit capacity. The methods employed in the oversight of these power circuits are typically static and ad...
We consider a kernel-based approach to nonlinear classification that coordinates the generation of “synthetic” points (to be used in the kernel) with “chunking” (working wi...
Abstract. Imbalanced data sets present a particular challenge to the data mining community. Often, it is the rare event that is of interest and the cost of misclassifying the rare ...