When performing concept description, models need to be evaluated on both accuracy and comprehensibility. A comprehensible concept description model should present the most importan...
Corruption of data by class-label noise is an important practical concern impacting many classification problems. Studies of data cleaning techniques often assume a uniform label ...
There has been an information explosion in fields of science such as high-energy physics, astronomy, environmental sciences and biology. There is a critical need for automated sys...
Srinath Shankar, Ameet Kini, David J. DeWitt, Jeff...
Duplicate detection is the problem of detecting different entries in a data source representing the same real-world entity. While research abounds in the realm of duplicate detect...
In this paper, block diagonal linear discriminant analysis (BDLDA) is improved and applied to gene expression data. BDLDA is a classification tool with embedded feature selection...
Lingyan Sheng, Roger Pique-Regi, Shahab Asgharzade...