Audio-visual speaker diarization using Fisher linear semi-discriminant analysis



Speaker diarization aims to automatically answer the question “who spoke when” given a speech signal. In this work, we focus on applying the FLsD approach, a semi-supervised version of Fisher Linear Discriminant analysis, to both the audio and the video signals to form a complete multimodal speaker diarization system. Extensive experiments show that the FLsD method boosts the performance of the face diarization task (i.e., the task of discovering faces over time given only the visual signal). In addition, the experiments show that applying the FLsD method to discriminate between faces is independent of the initial feature space and remains relatively unaffected as the number of faces increases. Finally, a fusion method is proposed that improves performance over the best individual modality, which is the audio signal.

Keywords: Speaker diarization · FLsD · FLD · Audio-visual fusion
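For readers unfamiliar with the Fisher Linear Discriminant (FLD) that the abstract builds on, the following is a minimal sketch of plain two-class FLD in two dimensions. It is not the semi-supervised FLsD method of the paper: class labels are assumed to be given, and the toy points, the names `speaker_a` and `speaker_b`, and all helper functions are hypothetical, introduced only for illustration.

```python
# Two-class Fisher Linear Discriminant in 2-D, standard library only.
# Illustrative sketch of plain FLD with known labels; NOT the paper's FLsD method.

def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, m):
    # Sum of outer products of centered points, returned as the entries
    # (a, b, d) of the symmetric 2x2 matrix [[a, b], [b, d]].
    a = sum((p[0] - m[0]) ** 2 for p in points)
    b = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    d = sum((p[1] - m[1]) ** 2 for p in points)
    return a, b, d

def fisher_direction(class0, class1):
    m0, m1 = mean(class0), mean(class1)
    a0, b0, d0 = scatter(class0, m0)
    a1, b1, d1 = scatter(class1, m1)
    # Within-class scatter S_w is the sum of the per-class scatter matrices.
    a, b, d = a0 + a1, b0 + b1, d0 + d1
    det = a * d - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # w = S_w^{-1} (m1 - m0), using the closed-form inverse of a 2x2 matrix.
    wx = (d * dx - b * dy) / det
    wy = (-b * dx + a * dy) / det
    norm = (wx * wx + wy * wy) ** 0.5
    return wx / norm, wy / norm

def project(points, w):
    # Project each 2-D point onto the discriminant direction w.
    return [p[0] * w[0] + p[1] * w[1] for p in points]

# Hypothetical toy feature vectors for two speakers (made-up values).
speaker_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (2.0, 2.5)]
speaker_b = [(3.0, 1.0), (4.0, 2.0), (5.0, 3.0), (4.0, 2.5)]

w = fisher_direction(speaker_a, speaker_b)
proj_a = project(speaker_a, w)
proj_b = project(speaker_b, w)
```

On this toy data the two classes overlap along each individual axis, but their projections onto `w` fall into two disjoint intervals, which is the property FLD optimizes for. A semi-supervised variant would have to obtain the class assignments without ground-truth labels; how FLsD does so is described in the paper itself, not here.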

Nikolaos Sarafianos, Theodoros Giannakopoulos, Sergios Petridis


Hardware | MTA 2016 |


Post Info

Added	08 Apr 2016
Updated	08 Apr 2016
Type	Journal
Year	2016
Where	MTA
Authors	Nikolaos Sarafianos, Theodoros Giannakopoulos, Sergios Petridis
