Generating Expressive Summaries for Speech and Musical Audio using Self-Similarity Clues



We present a novel algorithm for the structural analysis of audio that detects repetitive patterns suitable for content-based audio information retrieval systems. Repetitive patterns can provide valuable information about the content of audio, such as a chorus or a concept. The Audio Spectrum Flatness (ASF) feature of the MPEG-7 standard, although it has received less attention than other feature types, is used and evaluated as the underlying feature set. Expressive summaries are chosen as the longest patterns by the k-means clustering algorithm. The proposed approach is evaluated on a test bed consisting of popular song and speech clips, using the ASF feature. The well-known Mel Frequency Cepstral Coefficients (MFCCs) are also considered in the experiments for feature evaluation. Experiments show that all the repetitive patterns and their locations are obtained with an accuracy of 93% for music and 78% for speech.
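As a rough illustration of the self-similarity idea mentioned in the abstract, the sketch below computes a simplified per-band spectral flatness (geometric mean over arithmetic mean of power values) for each frame and compares frames by cosine similarity. It is not the authors' implementation: the function names, the toy band layout, and the choice of cosine similarity are all assumptions made for illustration, and the MPEG-7 band definitions and the k-means summary selection are omitted.

```python
import math

def spectral_flatness(power, eps=1e-12):
    """Flatness of one band: geometric mean / arithmetic mean of power values.
    Values near 1 indicate a noise-like band, values near 0 a tonal one."""
    n = len(power)
    log_mean = sum(math.log(p + eps) for p in power) / n
    arith_mean = sum(power) / n + eps
    return math.exp(log_mean) / arith_mean

def frame_features(bands):
    """Hypothetical helper: one flatness value per band of a single frame."""
    return [spectral_flatness(band) for band in bands]

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def self_similarity_matrix(features):
    """Pairwise similarity of all frames; repeated sections show up as
    high values away from the main diagonal."""
    return [[cosine_similarity(a, b) for b in features] for a in features]

# Toy input (made up): two kinds of frames, each with two bands of power values.
flat_band = [1.0, 1.0, 1.0, 1.0]
peaky_band = [4.0, 0.01, 0.01, 0.01]
frame_a = [flat_band, peaky_band]
frame_b = [peaky_band, flat_band]

# The sequence A, B, A, B repeats with a period of two frames.
features = [frame_features(f) for f in (frame_a, frame_b, frame_a, frame_b)]
ssm = self_similarity_matrix(features)
```

In this toy input, frames 0 and 2 are identical, so `ssm[0][2]` is close to 1 while `ssm[0][1]` is much lower. A structural analysis step would look for such off-diagonal runs of high similarity as candidate repetitive patterns.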

Mustafa Sert, Buyurman Baykal, Adnan Yazici


Audio Spectrum Flatness | Content-based Audio Information | ICMCS 2006 | Repetitive Patterns


Post Info

Added	11 Jun 2010
Updated	11 Jun 2010
Type	Conference
Year	2006
Where	ICMCS
Authors	Mustafa Sert, Buyurman Baykal, Adnan Yazici

