Missing feature speech recognition in a meeting situation with maximum SNR beamforming


Download www.tara.tsukuba.ac.jp

Abstract: For tasks such as automatic meeting transcription, it would be useful to recognize speech automatically even while multiple speakers are talking simultaneously. For this purpose, speech separation can be performed, for example by maximum SNR beamforming. However, even when good interferer suppression is attained, the interfering speech remains recognizable during those intervals in which the target speaker is silent. To avoid the resulting insertion errors, a new soft masking scheme is proposed, which works in the time domain by strongly damping those periods in which the observed direction of arrival does not correspond to that of the target speaker. Even though the masking scheme is aggressive, missing feature recognition allows the recognition accuracy to be improved significantly, with relative error reductions on the order of 60% compared to maximum SNR beamforming alone, and it is successful also for three simult...
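The time-domain soft masking idea can be illustrated with a minimal sketch. This is not the authors' formulation: the logistic mask shape, the per-frame direction-of-arrival input, and every name and parameter here (`soft_time_mask`, `apply_mask`, `tol_deg`, `slope`, `floor`) are assumptions made for illustration only. The sketch assigns each frame a gain close to 1 when the observed direction of arrival matches the target direction and a small residual gain otherwise, then scales the beamformer output frame by frame.

```python
import math


def soft_time_mask(doa_deg, target_deg, tol_deg=10.0, slope=1.0, floor=0.01):
    """Return one gain per frame in [floor, 1].

    Hypothetical mask shape: a logistic roll-off in the angular deviation
    between the observed DOA and the target DOA (an assumption, not the
    scheme from the paper).
    """
    gains = []
    for doa in doa_deg:
        # Angular deviation wrapped to the range [0, 180] degrees.
        dev = abs((doa - target_deg + 180.0) % 360.0 - 180.0)
        # Logistic roll-off that equals 0.5 when dev == tol_deg.
        soft = 1.0 / (1.0 + math.exp(slope * (dev - tol_deg)))
        # Keep a small residual gain instead of muting completely.
        gains.append(floor + (1.0 - floor) * soft)
    return gains


def apply_mask(signal, gains, frame_len):
    """Scale each sample by the gain of the frame it falls in."""
    out = []
    for n, x in enumerate(signal):
        frame = min(n // frame_len, len(gains) - 1)
        out.append(gains[frame] * x)
    return out


# Three frames: two from the target direction (30 degrees), one interferer.
gains = soft_time_mask([30.0, 31.0, 120.0], target_deg=30.0)
masked = apply_mask([1.0] * 6, gains, frame_len=2)
```

In a missing feature recognizer, frames that receive a low gain would typically also be marked as unreliable, so that the decoder does not trust the damped features. How that reliability information is derived in the paper is not stated in the abstract.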



Hardware | ISCAS 2008 | Masking Scheme | Maximum SNR | Target Speaker


Added	31 May 2010
Updated	31 May 2010
Type	Conference
Year	2008
Where	ISCAS
Authors	Dorothea Kolossa, Shoko Araki, Marc Delcroix, Tomohiro Nakatani, Reinhold Orglmeister, Shoji Makino
