Sciweavers

INTERSPEECH
2010
Binary coding of speech spectrograms using a deep auto-encoder
This paper reports our recent exploration of the layer-by-layer learning strategy for training a multi-layer generative model of patches of speech spectrograms. The top layer of t...
Li Deng, Michael L. Seltzer, Dong Yu, Alex Acero, ...
INTERSPEECH
2010
Efficient combined approach for named entity recognition in spoken language
We focus in this paper on the named entity recognition task in spoken data. The proposed approach investigates the use of various contexts of the words to improve recognition. Exp...
Azeddine Zidouni, Sophie Rosset, Hervé Glot...
INTERSPEECH
2010
Data pruning for template-based automatic speech recognition
In this paper we describe and analyze a data pruning method in combination with template-based automatic speech recognition. We demonstrate the positive effects of polishing the t...
Dino Seppi, Dirk Van Compernolle
INTERSPEECH
2010
Efficient HMM-based estimation of missing features, with applications to packet loss concealment
In this paper, we present efficient HMM-based techniques for estimating missing features. By assuming speech features to be observations of hidden Markov processes, we derive a mi...
Bengt J. Borgström, Per Henrik Borgström...
INTERSPEECH
2010
Say what? Why users choose to speak their web queries
The context in which a speech-driven application is used (or conversely not used) can be an important signal for recognition engines, and for spoken interface design. Using large-...
Maryam Kamvar, Doug Beeferman
INTERSPEECH
2010
Wiktionary as a source for automatic pronunciation extraction
In this paper, we analyze whether dictionaries from the World Wide Web that contain phonetic notations may support the rapid creation of pronunciation dictionaries within the sp...
Tim Schlippe, Sebastian Ochs, Tanja Schultz
INTERSPEECH
2010
Strategies for statistical spoken language understanding with small amount of data - an empirical study
The semantic frame based spoken language understanding involves two decisions
Ye-Yi Wang
INTERSPEECH
2010
Active appearance models for photorealistic visual speech synthesis
The perceived quality of a synthetic visual speech signal greatly depends on the smoothness of the presented visual articulators. This paper explains how concatenative visual spee...
Wesley Mattheyses, Lukas Latacz, Werner Verhelst
INTERSPEECH
2010
Coping imbalanced prosodic unit boundary detection with linguistically-motivated prosodic features
Continuous speech input for ASR processing is usually presegmented into speech stretches by pauses. In this paper, we propose that smaller, prosodically defined units can be ident...
Yi-Fen Liu, Shu-Chuan Tseng, Jyh-Shing Roger Jang,...
INTERSPEECH
2010
The impact of ASR on abstractive vs. extractive meeting summaries
Gabriel Murray, Giuseppe Carenini, Raymond T. Ng