Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an informationtheoretic ...
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho...
In this paper we investigate the general problem of discovering recurrent patterns that are embedded in categorical sequences. An important real-world problem of this nature is mo...
Entity resolution is the problem of determining which records in a database refer to the same entities, and is a crucial and expensive step in the data mining process. Interest in...
Organizing textual documents into a hierarchical taxonomy is a common practice in knowledge management. Beside textual features, the hierarchical structure of directories reflect...
Yi Huang, Kai Yu, Matthias Schubert, Shipeng Yu, V...
The diagnosis of cancer type based on microarray data offers hope that cancer classification can be highly accurate for clinicians to choose the most appropriate forms of treatmen...
Myungsook Klassen, Matt Cummings, Griselda Saldana