Multimodal Speaker Segmentation in Presence of Overlapped Speech Segments

We propose a multimodal speaker segmentation algorithm with two main contributions. First, we suggest a hidden Markov model architecture that fuses three modalities: a multi-camera system for participant localization, a microphone array for speaker localization, and a speaker identification system. Second, we present a novel method for dealing with overlapped speech segments through a likelihood model of the microphone array observations that uses multiple local maxima of the Steered Power Response Generalized Cross Correlation Phase Transform (SPR-GCC-PHAT) function in the Joint Probabilistic Data Association (JPDA) framework. Results show that the proposed method outperforms standard speaker segmentation systems based on (a) speaker identification and (b) microphone array processing on datasets with a significant portion (27.4%) of overlapped speech, and scores as high as 94.4% on the F-measure scale.
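As a rough illustration of the HMM-based fusion idea in the abstract, the sketch below decodes a speaker sequence with the Viterbi algorithm after summing per-frame log-likelihoods from several modalities. It is not the authors' architecture: the function name `viterbi_fused`, the conditional-independence assumption used to combine modalities, the single-active-speaker state space (no overlap states), and all toy probabilities are assumptions made for the example.

```python
import math

def viterbi_fused(log_prior, log_trans, modality_loglik):
    """Most likely state sequence under an HMM whose per-frame evidence is
    the sum of log-likelihoods from several modalities.

    ASSUMPTION: modalities are conditionally independent given the state,
    so their log-likelihoods can simply be added.

    log_prior[s]             : log P(state s at frame 0)
    log_trans[r][s]          : log P(state s at frame t | state r at t-1)
    modality_loglik[m][t][s] : log-likelihood of modality m at frame t given s
    """
    n_states = len(log_prior)
    n_frames = len(modality_loglik[0])

    # Fuse modalities: add log-likelihoods per frame and state.
    fused = [[sum(m[t][s] for m in modality_loglik) for s in range(n_states)]
             for t in range(n_frames)]

    # Forward pass: best score of any path ending in each state.
    score = [log_prior[s] + fused[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, n_frames):
        new_score, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + fused[t][s])
            ptr.append(best_prev)
        score = new_score
        backptr.append(ptr)

    # Backward pass: follow the stored pointers from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy example (hypothetical numbers): two speakers, four frames, three modalities.
log = math.log
log_prior = [log(0.5), log(0.5)]
log_trans = [[log(0.9), log(0.1)],
             [log(0.1), log(0.9)]]
video = [[log(0.5), log(0.5)]] * 4            # uninformative camera evidence
mic_array = [[log(0.8), log(0.2)], [log(0.7), log(0.3)],
             [log(0.2), log(0.8)], [log(0.1), log(0.9)]]
speaker_id = [[log(0.6), log(0.4)], [log(0.6), log(0.4)],
              [log(0.3), log(0.7)], [log(0.2), log(0.8)]]

path = viterbi_fused(log_prior, log_trans, [video, mic_array, speaker_id])
print(path)  # → [0, 0, 1, 1]
```

In this toy run the decoded sequence switches from speaker 0 to speaker 1 at the third frame, where the microphone-array and speaker-identification evidence agree. Handling overlapped speech would require a richer state space or observation model, which this sketch does not attempt.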

Viktor Rozgic, Kyu Jeong Han, Panayiotis G. Georgi

ISM 2008 | Microphone Array | Multimedia | Speaker Segmentation | Speaker Segmentation Algorithm |

» Annotation and analysis of overlapping speech in political interviews

» From Searching to Browsing through Multimodal Documents Linking

» Analysis User Interface and their Evaluation for Student Presentation Videos

» Speaker change detection using joint audiovisual statistics

» Towards a Multimodal Meeting Record

» A joint particle filter for audiovisual speaker tracking

» Probabilistic Models and Informative Subspaces for Audiovisual Correspondence

» Multimodal Meeting Tracker

Post Info

Added	31 May 2010
Updated	31 May 2010
Type	Conference
Year	2008
Where	ISM
Authors	Viktor Rozgic, Kyu Jeong Han, Panayiotis G. Georgiou, Shrikanth S. Narayanan
