Sciweavers

CCS
2011
ACM

BitShred: feature hashing malware for scalable triage and semantic analysis

12 years 4 months ago
BitShred: feature hashing malware for scalable triage and semantic analysis
The sheer volume of new malware found each day is growing at an exponential pace. This growth has created a need for automatic malware triage techniques that determine what malware is similar, what malware is unique, and why. In this paper, we present BitShred, a system for large-scale malware similarity analysis and clustering, and for automatically uncovering semantic inter- and intra-family relationships within clusters. The key idea behind BitShred is using feature hashing to dramatically reduce the highdimensional feature spaces that are common in malware analysis. Feature hashing also allows us to mine correlated features between malware families and samples using co-clustering techniques. Our evaluation shows that BitShred speeds up typical malware triage tasks by up to 2,365x and uses up to 82x less memory on a single CPU, all with comparable accuracy to previous approaches. We also develop a parallelized version of BitShred, and demonstrate scalability within the Hadoop frame...
Jiyong Jang, David Brumley, Shobha Venkataraman
Added 13 Dec 2011
Updated 13 Dec 2011
Type Journal
Year 2011
Where CCS
Authors Jiyong Jang, David Brumley, Shobha Venkataraman
Comments (0)