
TMM
2011

Audiovisual Discrimination Between Speech and Laughter: Why and When Visual Information Might Help

Past research on automatic laughter classification/detection has focused mainly on audio-based approaches. Here we present an audiovisual approach to distinguishing laughter from speech, and we show that integrating information from the audio and video channels may lead to improved performance over single-modal approaches. Each channel consists of two streams (cues): facial expressions and head pose for video, and cepstral and prosodic features for audio. Two types of experiments were performed: 1) subject-independent cross-validation on the AMI dataset, and 2) cross-database experiments on the AMI and SAL datasets. We experimented with different combinations of cues, the most informative being the combination of facial expressions, cepstral, and prosodic features. Our results suggest that the performance of the audiovisual approach is better on average than that of single-modal approaches. The addition of visual information produces better results when it comes to femal...
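The abstract describes combining several per-cue classifiers (facial expressions, head pose, cepstral and prosodic features) into one audiovisual decision, but does not spell out the fusion scheme here. As a minimal sketch, assuming decision-level fusion by a weighted average of per-cue laughter posteriors (one common choice; the function and cue names below are illustrative, not the paper's actual method):

```python
def fuse_posteriors(cue_posteriors, weights=None):
    """Combine per-cue laughter posteriors into one audiovisual score.

    cue_posteriors: dict mapping a cue name (e.g. 'facial', 'cepstral',
    'prosodic') to P(laughter | cue) in [0, 1].
    weights: optional dict of per-cue weights; defaults to uniform.
    """
    cues = list(cue_posteriors)
    if weights is None:
        weights = {c: 1.0 / len(cues) for c in cues}
    return sum(weights[c] * cue_posteriors[c] for c in cues)

def classify(cue_posteriors, threshold=0.5):
    """Label a segment as laughter or speech from the fused score."""
    return "laughter" if fuse_posteriors(cue_posteriors) >= threshold else "speech"
```

With uniform weights this reduces to averaging the modality scores, so a cue that strongly signals laughter (e.g. a high facial-expression posterior) can compensate for an ambiguous audio cue, which is the intuition behind the reported audiovisual gain.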
Stavros Petridis, Maja Pantic
Added 15 May 2011
Updated 15 May 2011
Type Journal
Year 2011
Where TMM
Authors Stavros Petridis, Maja Pantic