Blind Speech Separation in a Meeting Situation with Maximum SNR Beamformers

We propose a speech separation method for a meeting situation, where each speaker speaks only intermittently and the number of active speakers changes from moment to moment. Many source separation methods have already been proposed; however, they assume that all the speakers keep speaking, which is not always true in a real meeting. In such cases, in addition to separation, speech detection and the classification of the detected speech according to speaker become important issues. To address them, we propose a method that employs a maximum signal-to-noise ratio (MaxSNR) beamformer combined with a voice activity detector and online clustering. We also discuss the scaling ambiguity problem of the MaxSNR beamformer and provide solutions. We report some encouraging results for a real meeting in a room with a reverberation time of about 350 ms.
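
As an illustration of the beamformer named in the abstract, the sketch below computes MaxSNR weights for a single frequency bin as the principal generalized eigenvector of a target-period covariance and a noise-period covariance. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the synthetic 3-microphone covariances, and the reference-microphone rescaling (one common way to remove the scaling ambiguity, not necessarily the solution given in the paper) are all hypothetical. In the setting the abstract describes, the two covariances would be estimated from periods labeled by the voice activity detector and the clustering stage.

```python
import numpy as np


def max_snr_weights(R_target, R_noise):
    """Weights w maximizing (w^H R_target w) / (w^H R_noise w).

    Solves the generalized eigenproblem R_target w = lam * R_noise w by
    whitening with the Cholesky factor of R_noise (assumed positive definite).
    Returns the weights and the maximal power ratio lam.
    """
    L = np.linalg.cholesky(R_noise)           # R_noise = L L^H
    L_inv = np.linalg.inv(L)
    A = L_inv @ R_target @ L_inv.conj().T     # whitened target covariance
    A = (A + A.conj().T) / 2                  # enforce Hermitian symmetry
    eigvals, eigvecs = np.linalg.eigh(A)      # eigenvalues in ascending order
    v = eigvecs[:, -1]                        # principal eigenvector
    w = L_inv.conj().T @ v                    # map back: w = L^{-H} v
    return w, eigvals[-1]


def rescale_to_reference(w, R_x, ref=0):
    """Fix the arbitrary complex scale of w (assumed rescaling rule).

    Chooses the scalar c that minimizes E|x_ref - c * w^H x|^2, so the
    beamformer output approximates the signal observed at microphone `ref`.
    """
    c = (R_x @ w)[ref] / (w.conj() @ R_x @ w)
    return np.conj(c) * w                     # new weights give output c * w^H x


# Synthetic single-bin example (hypothetical geometry and powers).
M = 3                                          # number of microphones
mics = np.arange(M)
a = np.exp(-1j * np.pi * mics * np.sin(0.3))   # target steering vector
b = np.exp(-1j * np.pi * mics * np.sin(-0.9))  # interferer steering vector

# Noise-period covariance: interferer plus weak sensor noise.
R_noise = np.outer(b, b.conj()) + 0.01 * np.eye(M)
# Target-period covariance: target plus the same noise field.
R_target = np.outer(a, a.conj()) + R_noise

w, lam = max_snr_weights(R_target, R_noise)
w_scaled = rescale_to_reference(w, R_target)
print("maximal target/noise power ratio:", lam)
print("rescaled weights:", w_scaled)
```

Because any complex multiple of `w` yields the same power ratio, the eigenvector alone leaves the output scale undefined in each frequency bin; the rescaling step returns the same weights regardless of that arbitrary factor.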

Shoko Araki, Hiroshi Sawada, Shoji Makino

ICASSP 2007 | Real Meeting | Separation Methods | Signal Processing | Speech Separation Method |

Added	02 Jun 2010
Updated	02 Jun 2010
Type	Conference
Year	2007
Where	ICASSP
Authors	Shoko Araki, Hiroshi Sawada, Shoji Makino
