Speaker characterization using spectral subband energy ratio based on Harmonic plus Noise Model

This paper proposes a feature extraction method for speaker characterization that explores the relationship between two distinct components of the speech signal: the harmonics, which account for the periodicity of the signal, and the modulated noise, which accounts for turbulence in the glottal airflow. The speech signal is decomposed into its harmonic and noise parts using the Harmonic plus Noise Model (HNM) approach. We estimate spectral subband energy ratios (SSERs) as speaker-characteristic features, which are expected to reflect the interaction between the vocal tract and the glottal airflow of individual speakers, for use in speaker verification. Speaker verification experiments on a GMM-UBM system show the effectiveness of the SSER features: combining them with conventional MFCC features reduces the equal error rate by 27.2%.
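To make the idea of a subband energy ratio concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not give the exact SSER definition, so this assumes the ratio is the harmonic-to-noise energy in dB over uniformly spaced subbands of a single frame, and that the harmonic and noise components have already been separated by some HNM analysis. The function name `subband_energy_ratios` and the parameters `n_fft`, `n_bands`, and `eps` are hypothetical.

```python
import numpy as np

def subband_energy_ratios(harmonic, noise, n_fft=512, n_bands=8, eps=1e-12):
    """Harmonic-to-noise energy ratio (dB) per uniform subband for one frame.

    Assumption: `harmonic` and `noise` are equal-length frames (no longer than
    n_fft samples) produced by a separate HNM decomposition step.
    """
    harmonic = np.asarray(harmonic, dtype=float)
    noise = np.asarray(noise, dtype=float)
    window = np.hanning(len(harmonic))

    # Power spectra of the two components.
    h_pow = np.abs(np.fft.rfft(harmonic * window, n_fft)) ** 2
    n_pow = np.abs(np.fft.rfft(noise * window, n_fft)) ** 2

    # Uniform subband edges over the FFT bins (an assumed band layout).
    edges = np.linspace(0, len(h_pow), n_bands + 1).astype(int)

    ratios = np.empty(n_bands)
    for b in range(n_bands):
        lo, hi = edges[b], edges[b + 1]
        h_energy = h_pow[lo:hi].sum() + eps  # eps avoids division by zero
        n_energy = n_pow[lo:hi].sum() + eps
        ratios[b] = 10.0 * np.log10(h_energy / n_energy)
    return ratios

# Toy frame: a 200 Hz sinusoid as the "harmonic" part plus weak white noise.
rng = np.random.default_rng(0)
t = np.arange(400) / 16000.0
toy_harmonic = np.sin(2 * np.pi * 200.0 * t)
toy_noise = 0.01 * rng.standard_normal(400)
toy_ratios = subband_energy_ratios(toy_harmonic, toy_noise)
```

In this toy frame the lowest subband is dominated by the sinusoid and the highest by the noise, so the ratio falls from low to high frequency. For real speech, one such vector per frame would presumably be used alongside the MFCC features, but the abstract does not say how the two feature sets are combined.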

Yanhua Long, Zhi-Jie Yan, Frank K. Soong, Li-Rong Dai, Wu Guo

Glottal Airflow | ICASSP 2011 | Signal Processing | Speaker Verification | Speech Signal

Post Info

Added	20 Aug 2011
Updated	20 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Yanhua Long, Zhi-Jie Yan, Frank K. Soong, Li-Rong Dai, Wu Guo
