ISMIR
2000
Springer

Mel Frequency Cepstral Coefficients for Music Modeling

Download ciir.cs.umass.edu

We examine in some detail Mel Frequency Cepstral Coefficients (MFCCs) - the dominant features used for speech recognition - and investigate their applicability to modeling music. In particular, we examine two of the main assumptions of the process of forming MFCCs: the use of the Mel frequency scale to model the spectra; and the use of the Discrete Cosine Transform (DCT) to decorrelate the Mel-spectral vectors. We examine the first assumption in the context of speech/music discrimination. Our results show that the use of the Mel scale for modeling music is at least not harmful for this problem, although further experimentation is needed to verify that this is the optimal scale in the general case. We investigate the second assumption by examining the basis vectors of the theoretically optimal transform to decorrelate music and speech spectral vectors. Our results demonstrate that the use of the DCT to decorrelate vectors is appropriate for both speech and music spectra. MFCCs for Musi...
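The two steps the abstract singles out, warping the spectrum onto the Mel scale and decorrelating the result with the DCT, can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the Mel formula (the common 2595 log10(1 + f/700) variant), the triangular filterbank, the Hamming window, the filter and coefficient counts, and all function names are assumptions chosen for illustration. The hypothetical `klt_basis` helper computes the eigenvectors of the sample covariance, one way to obtain the data-dependent decorrelating transform against which DCT basis vectors can be compared.

```python
import numpy as np

def hz_to_mel(f_hz):
    # One common Mel formula (assumed; the abstract does not give one).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centres evenly spaced on the Mel scale
    # between 0 Hz and the Nyquist frequency.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_ii_matrix(n):
    # Orthonormal DCT-II matrix; each row is one basis vector.
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * t + 1) / (2 * n))
    basis[0] *= np.sqrt(1.0 / n)
    basis[1:] *= np.sqrt(2.0 / n)
    return basis

def mfcc_frame(frame, sample_rate, n_filters=40, n_coeffs=13):
    # Window -> power spectrum -> Mel filterbank -> log -> DCT.
    windowed = frame * np.hamming(frame.size)
    power = np.abs(np.fft.rfft(windowed)) ** 2
    bank = mel_filterbank(n_filters, frame.size, sample_rate)
    log_mel = np.log(bank @ power + 1e-10)  # small offset avoids log(0)
    return dct_ii_matrix(n_filters)[:n_coeffs] @ log_mel

def klt_basis(vectors):
    # Rows of the result are eigenvectors of the sample covariance of
    # `vectors` (one observation per row), sorted by decreasing eigenvalue.
    centered = vectors - vectors.mean(axis=0)
    cov = centered.T @ centered / (vectors.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order].T
```

As a usage sketch, calling `mfcc_frame` on a 512-sample frame at an assumed 16 kHz sample rate returns 13 coefficients. Stacking the log Mel spectra of many frames and passing them to `klt_basis` yields basis vectors that can be plotted next to the rows of `dct_ii_matrix(40)` for a visual comparison.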

Beth Logan

ISMIR 2000 | Mel Frequency Cepstral Coefficients | Music | Speech Recognition

Related Content

» The Mel-Frequency Cepstral Coefficients in the Context of Singer Identification

» Minimum Mean-Squared Error Estimation of Mel-Frequency Cepstral Coefficients Using a Novel D...

» Identifying Perceptually Similar Languages Using Teager Energy Based Cepstrum

» Accommodating sample size effect on similarity measures in speaker clustering

» An Investigation of Feature Models for Music Genre Classification Using the Support Vector...

» Temporal Events in All Dimensions and Scales

» Maximum Likelihood and Maximum Mutual Information Training in Gender and Age Recognition S...

» Robust Analysis and Weighting on MFCC Components for Speech Recognition and Speaker Identi...

» The AIT Multimodal Person Identification System for CLEAR 2007

Post Info

Added	25 Aug 2010
Updated	25 Aug 2010
Type	Conference
Year	2000
Where	ISMIR
Authors	Beth Logan