
ICASSP 2011, IEEE

Sparse coding of auditory features for machine hearing in interference

A key problem in using the output of an auditory model as the input to a machine-learning system in a machine-hearing application is to find a good feature-extraction layer. For systems such as PAMIR (passive–aggressive model for image retrieval) that work well with a large sparse feature vector, a conversion from auditory images to sparse features is needed. For audio-file ranking and retrieval from text queries, based on stabilized auditory images, we took a multi-scale approach, using vector quantization to choose one sparse feature in each of many overlapping regions of different scales, with the hope that in some regions the features for a sound would be stable even when other interfering sounds were present and affecting other regions. We recently extended our testing of this approach using sound mixtures, and found that the sparse-coded auditory-image features degrade less in interference than vector-quantized MFCC sparse features do. This initial success suggests that our ...
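The feature-extraction step described in the abstract, choosing one vector-quantized code per overlapping region of the auditory image and concatenating the codes into one large sparse vector, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the region layout, codebook sizes, and the plain nearest-neighbour Euclidean quantization in NumPy are assumptions, and real codebooks would be trained (for example with k-means) on auditory-image patches.

import numpy as np

def extract_sparse_features(auditory_image, regions, codebooks):
    # For each overlapping region, run vector quantization against that
    # region's codebook and emit a one-hot code; concatenate all codes.
    parts = []
    for (r, c, h, w), codebook in zip(regions, codebooks):
        patch = auditory_image[r:r + h, c:c + w].reshape(-1)
        dists = np.sum((codebook - patch) ** 2, axis=1)  # distance to each codeword
        one_hot = np.zeros(codebook.shape[0])
        one_hot[np.argmin(dists)] = 1.0                  # one sparse feature per region
        parts.append(one_hot)
    return np.concatenate(parts)

# Toy usage with random data standing in for a stabilized auditory image
rng = np.random.default_rng(0)
image = rng.random((64, 64))
regions = [(0, 0, 16, 16), (8, 8, 16, 16), (0, 0, 32, 32)]   # overlapping boxes at two scales
codebooks = [rng.random((256, h * w)) for (_, _, h, w) in regions]
sparse_vec = extract_sparse_features(image, regions, codebooks)
print(sparse_vec.shape, int(sparse_vec.sum()))               # (768,) with 3 active entries

The resulting vector has exactly one active entry per region, which is the kind of large sparse input that a ranker such as PAMIR is designed to score against text queries; a sound corrupting only some regions leaves the codes in the other regions unchanged.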
Richard F. Lyon, Jay Ponte, Gal Chechik
Type Conference
Year 2011
Where ICASSP