ICASSP
2009
IEEE

Fusing short term and long term features for improved speaker diarization

This article shows how a state-of-the-art speaker diarization system can be improved by combining traditional short-term features (MFCCs) with prosodic and other long-term features. First, we present a framework to study the speaker discriminability of 70 different long-term features. Then, we show how the top-ranked long-term features can be combined with short-term features to increase the accuracy of speaker diarization. The results were measured on standardized data sets (NIST RT) and show a consistent relative improvement of about 30% in diarization error rate compared to the best system presented at the NIST evaluation in 2007. This result was also verified on a wider set of meetings, which we call CombDev, that contains 21 meetings from previous evaluations. Since the prosodic and long-term features were selected using a diarization-independent speaker-discriminability study, we are confident that the same features are able to improve other systems that perform similar...
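The abstract reports its gain as a relative change in diarization error rate (DER), which is easy to confuse with an absolute change in percentage points. The minimal sketch below shows the arithmetic; the function name and the DER values are hypothetical and chosen only for illustration, not taken from the paper.

```python
def relative_der_reduction(baseline_der: float, new_der: float) -> float:
    """Return the relative reduction in DER as a fraction of the baseline."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - new_der) / baseline_der


# Hypothetical DER values in percent (illustration only, not paper results).
baseline = 20.0  # assumed baseline system
fused = 14.0     # assumed system with fused short- and long-term features

reduction = relative_der_reduction(baseline, fused)
print(f"{reduction:.0%} relative reduction")
```

With these assumed values, a drop of 6 percentage points in absolute terms corresponds to a 30% relative reduction.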
Added 21 May 2010
Updated 21 May 2010
Type Conference
Year 2009
Where ICASSP
Authors Gerald Friedland, Oriol Vinyals, C. Yan Huang, Christian Müller