
PADL
2009
Springer

Using Bloom Filters for Large Scale Gene Sequence Analysis in Haskell

Analysis of biological data often involves large data sets and computationally expensive algorithms. Databases of biological data continue to grow, leading to an increasing demand for improved algorithms and data structures. Despite having many advantages over more traditional indexing structures, the Bloom filter is almost unused in bioinformatics. Here we present a robust and efficient Bloom filter implementation in Haskell, and implement a simple bioinformatics application for indexing and matching sequence data. We use this to index the chromosomes that make up the human genome, and map all available gene sequences to it. Our experiences with developing and tuning our application suggest that for bioinformatics applications, Haskell offers a compelling combination of rapid development, quality assurance, and high performance.
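To make the indexing-and-matching idea in the abstract concrete, here is a minimal sketch of a Bloom filter over fixed-length subsequences (k-mers). It is not the authors' implementation: the names `Bloom`, `emptyBloom`, `insert`, `member`, `kmers`, and `indexSequence` are hypothetical, the bit vector is held in a plain `Integer` for simplicity rather than speed, and the two hash functions (FNV-1a and djb2, combined by double hashing) are assumptions chosen only for illustration.

```haskell
import Data.Bits (setBit, testBit, xor)
import Data.Char (ord)
import Data.List (foldl', tails)
import Data.Word (Word32)

-- | Toy Bloom filter (hypothetical type, not the paper's).
-- The bit vector lives in an Integer: simple, but not efficient.
data Bloom = Bloom
  { bloomSize   :: Int      -- ^ number of bits in the filter
  , bloomHashes :: Int      -- ^ number of hash positions per element
  , bloomBits   :: Integer  -- ^ the bit vector itself
  }

-- | An empty filter with the given number of bits and hash positions.
emptyBloom :: Int -> Int -> Bloom
emptyBloom bits hashes = Bloom bits hashes 0

-- | 32-bit FNV-1a hash (assumed choice; arithmetic wraps in Word32).
fnv1a :: String -> Word32
fnv1a = foldl' step 2166136261
  where step h c = (h `xor` fromIntegral (ord c)) * 16777619

-- | djb2 hash (assumed choice), also wrapping in Word32.
djb2 :: String -> Word32
djb2 = foldl' step 5381
  where step h c = h * 33 + fromIntegral (ord c)

-- | Bit positions for an element, derived by double hashing.
indices :: Bloom -> String -> [Int]
indices b s =
  [ (h1 + i * h2) `mod` bloomSize b | i <- [0 .. bloomHashes b - 1] ]
  where
    h1 = fromIntegral (fnv1a s)
    h2 = fromIntegral (djb2 s)

-- | Set every bit position of the element.
insert :: String -> Bloom -> Bloom
insert s b = b { bloomBits = foldl' setBit (bloomBits b) (indices b s) }

-- | True if the element may be present; False means definitely absent.
member :: String -> Bloom -> Bool
member s b = all (testBit (bloomBits b)) (indices b s)

-- | All overlapping substrings of length n.
kmers :: Int -> String -> [String]
kmers n s = [ w | t <- tails s, let w = take n t, length w == n ]

-- | Index every k-mer of a sequence into the filter.
indexSequence :: Int -> String -> Bloom -> Bloom
indexSequence n s b = foldl' (flip insert) b (kmers n s)
```

Under these assumptions, a query sequence could be matched by counting how many of its k-mers satisfy `member` against a filter built with `indexSequence`. A Bloom filter never reports an inserted element as absent, but it can report false positives, so a hit only means "possibly present".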
Ketil Malde, Bryan O'Sullivan
Added 22 Nov 2009
Updated 22 Nov 2009
Type Conference
Year 2009
Where PADL
Authors Ketil Malde, Bryan O'Sullivan