Unsupervised content discovery in composite audio

Automatically extracting semantic content from audio streams can be helpful in many multimedia applications. Motivated by the known limitations of traditional supervised approaches to content extraction, which are hard to generalize and require suitable training data, we propose in this paper an unsupervised approach to discover and categorize semantic content in a composite audio stream. In our approach, we first employ spectral clustering to discover natural semantic sound clusters in the analyzed data stream (e.g. speech, music, noise, applause, speech mixed with music, etc.). These clusters are referred to as audio elements. Based on the obtained set of audio elements, the key audio elements, which are most prominent in characterizing the content of input audio data, are selected and used to detect potential boundaries of semantic audio segments denoted as auditory scenes. Finally, the auditory scenes are categorized in terms of the audio elements appearing therein. Categorization...
Rui Cai, Lie Lu, Alan Hanjalic
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where MM