Real-world data is never perfect and can often suffer from corruptions (noise) that may impact interpretations of the data, models created from the data and decisions made based on...
Classification trees are widely used in the machine learning and data mining communities for modeling propositional data. Recent work has extended this basic paradigm to probabili...
Jennifer Neville, David Jensen, Lisa Friedland, Mi...
In relevance feedback, active learning is often used to alleviate the burden of labeling by selecting only the most informative data. Traditional data selection strategies often c...
Clustering is an essential data mining task with various types of applications. Traditional clustering algorithms are based on a vector space model representation. A relational dat...
—All sciences, including astronomy, are now entering the era of information abundance. The exponentially increasing volume and complexity of modern data sets promises to transfor...