This paper presents a bottom-up approach that combines audio and video to simultaneously locate individual speakers in the video (2-D source localization) and segment their speech ...
The automatic transcription of broadcast news and meetings involves the segmentation, identification and tracking of speaker turns during each session, which is known as speaker di...
This paper discusses a set of modifications regarding the use of the Bayesian Information Criterion (BIC) for the speaker diarization task. We focus on the specific variant of the...
We describe the latest version of the SRI-ICSI meeting and lecture recognition system, as was used in the NIST RT-07 evaluations, highlighting improvements made over the last year....
Andreas Stolcke, Xavier Anguera, Kofi Boakye, &Oum...