With the advent of open source software repositories the data available for defect prediction in source files increased tremendously. Although traditional statistics turned out t...
Constrained pattern mining extracts patterns based on their individual merit. Usually this results in far more patterns than a human expert or a machine learning technique could m...
Measuring distance or some other form of proximity between objects is a standard data mining tool. Connection subgraphs were recently proposed as a way to demonstrate proximity be...
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
We study the mining of interesting patterns in the presence of numerical attributes. Instead of the usual discretization methods, we propose the use of rank based measures to scor...