In a trend that reflects the increasing demand for intelligent applications driven by business data, IBM today is building out a significant number of applications that leverage m...
This paper deals with an unusual phenomenon where most machine learning algorithms yield good performance on the training set but systematically worse than random performance on th...
Redescription mining is a newly introduced data mining problem that seeks to find subsets of data that afford multiple definitions. It can be viewed as a generalization of associa...
A collection consisting of the images of 774 live moth individuals, each moth belonging to one of 35 different UK species, was analysed to determine if data mining techniques could...
Statistical machine learning techniques for data classification usually assume that all entities are i.i.d. (independent and identically distributed). However, real-world entities...