Sciweavers

RAID
2010
Springer

On Challenges in Evaluating Malware Clustering

13 years 2 months ago
On Challenges in Evaluating Malware Clustering
Malware clustering and classification are important tools that enable analysts to prioritize their malware analysis efforts. The recent emergence of fully automated methods for malware clustering and classification that report high accuracy suggests that this problem may largely be solved. In this paper, we report the results of our attempt to confirm our conjecture that the method of selecting ground-truth data in prior evaluations biases their results toward high accuracy. To examine this conjecture, we apply clustering algorithms from a different domain (plagiarism detection), first to the dataset used in a prior work’s evaluation and then to a wholly new malware dataset, to see if clustering algorithms developed without attention to subtleties of malware obfuscation are nevertheless successful. While these studies provide conflicting signals as to the correctness of our conjecture, our investigation of possible reasons uncovers, we believe, a cautionary note regarding the si...
Peng Li, Limin Liu, Debin Gao, Michael K. Reiter
Added 30 Jan 2011
Updated 30 Jan 2011
Type Journal
Year 2010
Where RAID
Authors Peng Li, Limin Liu, Debin Gao, Michael K. Reiter
Comments (0)