Sciweavers

ISNN
2011
Springer

Robust Multi-stream Keyword and Non-linguistic Vocalization Detection for Computationally Intelligent Virtual Agents

Abstract. Systems for keyword and non-linguistic vocalization detection in conversational agent applications need to be robust with respect to background noise and different speaking styles. Focussing on the Sensitive Artificial Listener (SAL) scenario which involves spontaneous, emotionally colored speech, this paper proposes a multi-stream model that applies the principle of Long Short-Term Memory to generate context-sensitive phoneme predictions which can be used for keyword detection. Further, we investigate the incorporation of noisy training material in order to create noise-robust acoustic models. We show that both strategies can improve recognition performance when evaluated on spontaneous human-machine conversations as contained in the SEMAINE database.
Added 15 Sep 2011
Updated 15 Sep 2011
Type Journal
Year 2011
Where ISNN
Authors Martin Wöllmer, Erik Marchi, Stefano Squartini, Björn Schuller