Sciweavers

SP 2008, IEEE

Casting out Demons: Sanitizing Training Data for Anomaly Sensors

The efficacy of Anomaly Detection (AD) sensors depends heavily on the quality of the data used to train them. Artificial or contrived training data may not provide a realistic view of the deployment environment. Most realistic data sets are dirty; that is, they contain a number of attacks or anomalous events. The size of these high-quality training data sets makes manual removal or labeling of attack data infeasible. As a result, sensors trained on this data can miss attacks and their variations. We propose extending the training phase of AD sensors (in a manner agnostic to the underlying AD algorithm) to include a sanitization phase. This phase generates multiple models, each conditioned on a small slice of the training data. We use these "micro-models" to produce provisional labels for each training input, and we combine the micro-models in a voting scheme to determine which parts of the training data may represent attacks. Our results suggest that this phase automatically and signi...
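
The sanitization phase described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy micro-model (the set of items seen in one slice), the function names `build_micromodels` and `sanitize`, the unweighted vote, and the `threshold` parameter are all assumptions made for illustration. Any AD algorithm could stand in for the set-membership test.

```python
from typing import List, Sequence, Set, Tuple


def build_micromodels(data: Sequence[str], slice_size: int) -> List[Set[str]]:
    """Train one toy micro-model per contiguous slice of the training data.

    Assumption: a micro-model is simply the set of items seen in its slice.
    """
    return [set(data[i:i + slice_size]) for i in range(0, len(data), slice_size)]


def sanitize(
    data: Sequence[str], slice_size: int, threshold: float
) -> Tuple[List[str], List[str]]:
    """Split data into (clean, suspect) using an unweighted vote of micro-models."""
    models = build_micromodels(data, slice_size)
    clean: List[str] = []
    suspect: List[str] = []
    for item in data:
        # Provisional label from each micro-model: True if the item is
        # abnormal for that model (here: never seen in that slice).
        votes = [item not in model for model in models]
        score = sum(votes) / len(votes)
        # Items that most micro-models consider abnormal are set aside.
        if score > threshold:
            suspect.append(item)
        else:
            clean.append(item)
    return clean, suspect


# Hypothetical training stream: "X" appears in only one of four slices.
stream = ["a", "b", "a", "b", "a", "X", "a", "b", "b", "a", "b", "a"]
clean, suspect = sanitize(stream, slice_size=3, threshold=0.5)
# suspect == ["X"]; the remaining 11 items stay in the sanitized training set
```

The intuition behind the vote is that an attack tends to be confined to a few slices, so most micro-models never saw it and label it abnormal, while regular traffic recurs across slices and is accepted by most of them.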
Added: 01 Jun 2010
Updated: 01 Jun 2010
Type: Conference
Year: 2008
Where: SP
Authors: Gabriela F. Cretu, Angelos Stavrou, Michael E. Locasto, Salvatore J. Stolfo, Angelos D. Keromytis