Sciweavers

1577 search results - page 149 / 316
» Data Mining: Machine Learning, Statistics, and Databases
MSR
2006
ACM
15 years 7 months ago
Predicting defect densities in source code files with decision tree learners
With the advent of open source software repositories the data available for defect prediction in source files increased tremendously. Although traditional statistics turned out t...
Patrick Knab, Martin Pinzger, Abraham Bernstein
ICDM
2007
IEEE
15 years 8 months ago
The Chosen Few: On Identifying Valuable Patterns
Constrained pattern mining extracts patterns based on their individual merit. Usually this results in far more patterns than a human expert or a machine learning technique could m...
Björn Bringmann, Albrecht Zimmermann
KDD
2006
ACM
16 years 2 months ago
Measuring and extracting proximity in networks
Measuring distance or some other form of proximity between objects is a standard data mining tool. Connection subgraphs were recently proposed as a way to demonstrate proximity be...
Yehuda Koren, Stephen C. North, Chris Volinsky
KDD
2004
ACM
16 years 2 months ago
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
KDD
2006
ACM
16 years 2 months ago
Mining rank-correlated sets of numerical attributes
We study the mining of interesting patterns in the presence of numerical attributes. Instead of the usual discretization methods, we propose the use of rank based measures to scor...
Toon Calders, Bart Goethals, Szymon Jaroszewicz