Sciweavers

SPL
2016

Voice Activity Detection: Merging Source and Filter-based Information

Voice Activity Detection (VAD) refers to the problem of distinguishing speech segments from background noise. Numerous approaches have been proposed for this purpose: some rely on features derived from the power spectral density, while others exploit the periodicity of the signal. The goal of this paper is to investigate the joint use of source and filter-based features. To this end, we consider features already used in the VAD literature, as well as some new source-related features. Interestingly, a mutual information-based assessment shows superior discrimination power for the source-related features, especially the proposed ones. The features are then fed to an artificial neural network-based classifier trained on a multi-condition database. Two strategies are proposed to merge source and filter information: feature fusion and decision fusion. Our experiments indicate a 3% absolute reduction in the equal error rate when using decision fusion. The final proposed ...
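The abstract names two ways of merging source and filter information but the listing does not show how they differ. The sketch below is a generic illustration only and is not taken from the paper. It assumes that feature fusion means concatenating the two feature vectors before a single classifier, and that decision fusion means combining the outputs of two separate classifiers. The averaging rule, the 0.5 threshold, and the toy `mean_score` classifier are hypothetical stand-ins for the paper's neural network and its actual combination rule.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
# A classifier maps a feature vector to an estimate of P(speech | features).
Classifier = Callable[[Vector], float]


def mean_score(features: Vector) -> float:
    """Hypothetical stand-in classifier: clipped mean of the features."""
    score = sum(features) / len(features)
    return min(1.0, max(0.0, score))


def feature_fusion(source: Vector, filt: Vector, clf: Classifier) -> float:
    """Concatenate both feature sets and score them with one classifier."""
    return clf(list(source) + list(filt))


def decision_fusion(source: Vector, filt: Vector,
                    source_clf: Classifier, filter_clf: Classifier) -> float:
    """Score each feature set separately, then average the two posteriors.

    Averaging is an assumption made for this sketch; other rules
    (product, weighted sum) are equally plausible.
    """
    return 0.5 * (source_clf(source) + filter_clf(filt))


def is_speech(posterior: float, threshold: float = 0.5) -> bool:
    """Turn a posterior into a speech / non-speech decision for one frame."""
    return posterior >= threshold


if __name__ == "__main__":
    source_features = [0.8, 0.6]   # made-up values for one frame
    filter_features = [0.2]
    p_feat = feature_fusion(source_features, filter_features, mean_score)
    p_dec = decision_fusion(source_features, filter_features,
                            mean_score, mean_score)
    print(is_speech(p_feat), is_speech(p_dec))
```

With these made-up inputs the two strategies give different posteriors for the same frame, which is the point of comparing them: the fusion stage itself changes the decision, independently of the features.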
Thomas Drugman, Yannis Stylianou, Yusuke Kida, Mas
Added 10 Apr 2016
Updated 10 Apr 2016
Type Journal
Year 2016
Where SPL
Authors Thomas Drugman, Yannis Stylianou, Yusuke Kida, Masami Akamine