In this paper, we investigate the effect of a biomedical dataset on the classification accuracy of an algorithm. We quantify the complexity of a biomedical dataset using five complex...
This paper discusses improving the methodology introduced in Kushmerick’s paper about learning to remove internet advertisements. The aim is to reduce the model build time as we...
In this paper we describe work relating to classification of web documents using a graph-based model instead of the traditional vector-based model for document representation. We ...
Adam Schenker, Mark Last, Horst Bunke, Abraham Kan...
Current research on association-rule-based text classification has neglected several key problems. First, the weights of elements in profile vectors may have a large impact on generating ...
Jiangtao Qiu, Changjie Tang, Tao Zeng, Shaojie Qia...
Identifying faulty classes in object-oriented software is one of the important software quality assurance activities. This paper empirically investigates the application of t...
This paper discusses two sets of automatic musical genre classification experiments. Promising research directions are then proposed based on the results of these experiments. The...
Combining multiple classifiers is of particular interest in multimedia applications. Each modality in multimedia data can be analyzed individually, and combining multiple pieces of...
When automatically extracting information from the World Wide Web, most established methods focus on spotting single HTML documents. However, the problem of spotting complete web s...
Martin Ester, Hans-Peter Kriegel, Matthias Schuber...
Recently, mining data streams with concept drifts for actionable insights has become an important and challenging task for a wide range of applications including credit card fraud...
We address the problem of integrating documents from different sources into a master catalog. This problem is pervasive in web marketplaces and portals. Current technology for aut...