Sciweavers

ICASSP
2008
IEEE
13 years 11 months ago
Unsupervised optimal phoneme segmentation: Objectives, algorithm and comparisons
Phoneme segmentation is a fundamental problem in many speech recognition and synthesis studies. Unsupervised phoneme segmentation assumes no knowledge on linguistic contents and a...
Yu Qiao, Naoya Shimomura, Nobuaki Minematsu
ICASSP
2008
IEEE
13 years 11 months ago
Extracting question/answer pairs in multi-party meetings
Understanding multi-party meetings involves tasks such as dialog act segmentation and tagging, action item extraction, and summarization. In this paper we introduce a new task for...
Andreas Kathol, Gökhan Tür
ICASSP
2008
IEEE
13 years 11 months ago
Statistical approach to vocal tract transfer function estimation based on factor analyzed trajectory HMM
In this paper, we describe a novel statistical approach to the vocal tract transfer function (VTTF) estimation of a speech signal based on a factor analyzed trajectory hidden Mark...
Tomoki Toda, Keiichi Tokuda
ICASSP
2008
IEEE
13 years 11 months ago
Modulation decompositions for the interpolation of long gaps in acoustic signals
This paper presents a modulation-based reconstruction method for audio signals across long gaps of missing samples. We use LTI filterbanks followed by a multiplicative model that...
Pascal Clark, Les E. Atlas
ICASSP
2008
IEEE
13 years 11 months ago
Speech denoising using nonnegative matrix factorization with priors
We present a technique for denoising speech using nonnegative matrix factorization (NMF) in combination with statistical speech and noise models. We compare our new technique to s...
Kevin W. Wilson, Bhiksha Raj, Paris Smaragdis, Aja...
ICASSP
2008
IEEE
13 years 11 months ago
Accurate statistical spoken language understanding from limited development resources
Robust Spoken Language Understanding (SLU) is a key component of spoken dialogue systems. Recent statistical approaches to this problem require additional resources (e.g. gazettee...
I. V. Meza-Ruiz, Sebastian Riedel, Oliver Lemon
ICASSP
2008
IEEE
13 years 11 months ago
Modeling the intonation of discourse segments for improved online dialog act tagging
Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy mod...
Vivek Kumar Rangarajan Sridhar, Shrikanth Narayana...
ICASSP
2008
IEEE
13 years 11 months ago
Is voice transformation a threat to speaker identification?
With the development of voice transformation and speech synthesis technologies, speaker identification systems are likely to face attacks from imposters who use voice transformed ...
Qin Jin, Arthur R. Toth, Alan W. Black, Tanja Schu...
ICASSP
2008
IEEE
13 years 11 months ago
Maximum conditional likelihood linear regression and maximum a posteriori for hidden conditional random fields speaker adaptation
This paper shows how to improve Hidden Conditional Random Fields (HCRFs) for phone classification by applying various speaker adaptation techniques. These include Maximum A Poste...
Yun-Hsuan Sung, Constantinos Boulis, Daniel Jurafs...
ICASSP
2008
IEEE
13 years 11 months ago
A turbo-style algorithm for lexical baseforms estimation
In this research, an iterative and unsupervised Turbo-style algorithm is presented and implemented for the task of automatic lexical acquisition. The algorithm makes use of spoken...
Ghinwa F. Choueiter, Mesrob I. Ohannessian, Stepha...