Incorporating spectral subtraction and noise type for unvoiced speech segregation



Unvoiced speech poses a major challenge to current monaural speech segregation systems: it lacks harmonic structure, and its relatively weak energy makes it highly susceptible to interference. This paper describes a new approach to segregating unvoiced speech from nonspeech interference. The system first estimates a voiced binary mask and then performs unvoiced speech segregation in two stages: segmentation and grouping. In segmentation, time-frequency units labeled 0 in the voiced binary mask are used to estimate the noise energy, and spectral subtraction is then performed to generate time-frequency segments in unvoiced intervals. In grouping, depending on the type of noise, unvoiced segments are grouped either by selecting segments consistent with those generated by onset/offset analysis or by Bayesian classification of acoustic-phonetic features. Systematic evaluation and comparison show that the proposed approach improves the performance of unvoiced speech segregation considerably.
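The segmentation stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-channel averaging, the restriction of the noise estimate to voiced frames, the zero floor, and the rule that keeps a unit when its residual exceeds the noise estimate are all assumptions made for illustration, as are the function names and the toy energy values.

```python
# Hypothetical sketch of noise estimation and spectral subtraction on a
# time-frequency (T-F) energy grid. energy[c][t] is the mixture energy in
# channel c and frame t; voiced_mask[c][t] is 1 where voiced speech dominates.

def estimate_noise_energy(energy, voiced_mask, voiced_frames):
    """Per-channel mean energy of mask-0 units inside voiced frames (assumed rule)."""
    noise = []
    for chan_energy, chan_mask in zip(energy, voiced_mask):
        vals = [chan_energy[t] for t in voiced_frames if chan_mask[t] == 0]
        noise.append(sum(vals) / len(vals) if vals else 0.0)
    return noise


def spectral_subtraction(energy, noise, unvoiced_frames, floor=0.0):
    """Subtract the noise estimate from mixture energy in unvoiced frames."""
    return [
        {t: max(chan_energy[t] - n, floor) for t in unvoiced_frames}
        for chan_energy, n in zip(energy, noise)
    ]


def candidate_units(residual, noise):
    """Keep (channel, frame) units whose residual exceeds the noise estimate.

    This thresholding rule is an assumption for the sketch, not taken from the paper.
    """
    return {
        (c, t)
        for c, row in enumerate(residual)
        for t, r in row.items()
        if r > noise[c]
    }


# Toy data: 2 channels, 6 frames. Frames 0, 1, 4, 5 are voiced; 2, 3 are unvoiced.
energy = [
    [1.0, 9.0, 1.2, 1.1, 8.0, 1.0],  # low channel: voiced speech in frames 1 and 4
    [2.0, 2.0, 7.0, 6.0, 2.0, 2.0],  # high channel: unvoiced burst in frames 2 and 3
]
voiced_mask = [
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
voiced_frames = [0, 1, 4, 5]
unvoiced_frames = [2, 3]

noise = estimate_noise_energy(energy, voiced_mask, voiced_frames)
residual = spectral_subtraction(energy, noise, unvoiced_frames)
units = candidate_units(residual, noise)
print(noise, sorted(units))
```

In this toy case only the high-channel units in the unvoiced interval survive the subtraction; adjacent surviving units would then be merged into segments and passed to the grouping stage.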

Ke Hu, DeLiang Wang


ICASSP 2009 | Signal Processing | Speech Segregation | Unvoiced Speech | Unvoiced Speech Segregation


Added	21 May 2010
Updated	21 May 2010
Type	Conference
Year	2009
Where	ICASSP
Authors	Ke Hu, DeLiang Wang
