Sciweavers

ICASSP
2009
IEEE
14 years 1 day ago
Extended VTS for noise-robust speech recognition
Model compensation is a standard way of improving the robustness of speech recognition systems to noise. A number of popular schemes are based on vector Taylor series (VTS) compen...
Rogier C. van Dalen, Mark J. F. Gales
ICASSP
2009
IEEE
14 years 1 day ago
Benchmarking flexible adaptive time-frequency transforms for underdetermined audio source separation
We have implemented several fast and flexible adaptive lapped orthogonal transform (LOT) schemes for underdetermined audio source separation. This is generally addressed by time-...
Andrew Nesbit, Emmanuel Vincent, Mark D. Plumbley
ICASSP
2009
IEEE
14 years 1 day ago
Generalized Baum-Welch algorithm for discriminative training on large vocabulary continuous speech recognition system
We propose a new optimization algorithm, the Generalized Baum-Welch (GBW) algorithm, for discriminative training of hidden Markov models (HMMs). GBW is based on Lagrange relaxation...
Roger Hsiao, Yik-Cheung Tam, Tanja Schultz
ICASSP
2009
IEEE
14 years 1 day ago
Cheat-proof cooperation strategies for wireless live streaming social networks
Multimedia social network analysis is an emerging research area, which analyzes the behavior of users who share multimedia content and investigates the impact of human dynamics on...
W. Sabrina Lin, H. Vicky Zhao, K. J. Ray Liu
ICASSP
2009
IEEE
14 years 1 day ago
Voice conversion for various types of body transmitted speech
In this paper, we review our proposed statistical voice conversion approaches to enhancing various types of body transmitted speech captured with non-audible murmur (NAM) micropho...
Tomoki Toda, Keigo Nakamura, Hidehiko Sekimoto, Ki...
ICASSP
2009
IEEE
14 years 1 day ago
Multi-level non-rigid image registration using graph-cuts
Non-rigid image registration is widely used in medical image analysis and image processing. It remains a challenging research problem due to its smoothness requirement and high de...
Ronald W. K. So, Albert C. S. Chung
ICASSP
2009
IEEE
14 years 1 day ago
Optimizing segment label boundaries for statistical speech synthesis
This paper introduces a new optimization technique for moving segment labels (phone and subphonetic) to optimize statistical parametric speech synthesis models. The choice of obje...
Alan W. Black, John Kominek
ICASSP
2009
IEEE
14 years 1 day ago
Instantaneous frequency rate estimation for high-order polynomial-phase signal
Instantaneous frequency rate (IFR) estimation for high-order polynomial phase signals (PPSs) is considered. Specifically, an IFR estimator with only a second-order nonlinearity ...
Pu Wang, Hongbin Li, Igor Djurovic, Jianyu Yang