Audio-based classification of speaker characteristics


Download www.aquaphoenix.com

The human voice is primarily a carrier of speech, but it also contains non-linguistic features unique to a speaker and indicative of various speaker demographics, e.g., gender, nativity, and ethnicity. Such characteristics are helpful cues for audio/video search and retrieval. In this paper, we evaluate the effects of various low-, mid-, and high-level features for effective classification of speaker characteristics. Low-level signal-based features include MFCCs, LPCs, and six spectral features; mid-level statistical features model low-level features; and high-level semantic features are based on selected phonemes in addition to mid-level features. Our data set consists of approximately 76.4 hours of annotated audio with 2786 unique speaker segments used for classification. Quantitative evaluation of our method yields accuracy rates of up to 98.6% on our test data for male/female classification using mid-level features and a linear kernel support vector machine. We determine that mid- and ...
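The abstract says that mid-level statistical features model the low-level features, but it does not say which statistics are used. As a rough illustration only, the sketch below assumes per-dimension mean and standard deviation over the frames of one speaker segment; the function name `mid_level_features` and the sample values are hypothetical and not taken from the paper.

```python
from statistics import fmean, pstdev


def mid_level_features(frames):
    """Summarize frame-level vectors into one fixed-length segment vector.

    frames: list of equal-length lists of floats, one list per analysis
    frame (for example, one MFCC vector per frame).
    Returns [mean_0, ..., mean_k, std_0, ..., std_k].
    The choice of mean and population standard deviation is an assumption.
    """
    if not frames:
        raise ValueError("segment has no frames")
    columns = list(zip(*frames))  # transpose: one tuple per feature dimension
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means + stds


# Hypothetical segment: 3 frames, 2 low-level features per frame.
segment = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
features = mid_level_features(segment)
```

A vector like `features` has the same length for every segment regardless of its duration, which is what makes it usable as input to a classifier such as the linear kernel support vector machine mentioned in the abstract.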

Promiti Dutta, Alexander Haubold


ICMCS 2009 | Mid-level Features | Mid-level Statistical Features | Multimedia | Speaker Characteristics |


» Genetically optimised feedforward neural networks for speaker identification

» Probabilistic SVM/GMM Classifier for Speaker-Independent Vowel Recognition in Continues Spee...

» EmotionSense: a mobile phones based adaptive platform for experimental social psychology re...

» Design of a Multimodal Database for Research on Automatic Detection of Severe Apnoea Cases

Post Info

Added	19 Feb 2011
Updated	19 Feb 2011
Type	Journal
Year	2009
Where	ICMCS
Authors	Promiti Dutta, Alexander Haubold
