In this paper, we describe a new multi-purpose audio-visual database on the context of speech interfaces for controlling household electronic devices. The database comprises speec...
— The focus of this paper is mental tension detection in speech to assist control the tension in day-to-day business such as conferences and operations in a call center. It is di...
In this paper, we introduce a system that synthesizes the emotional audio-visual speech for a 3-D talking agent by adopting the PAD (Pleasure-Arousal-Dominance) emotional model. A ...
Update of acoustic and language models is vital to maintain performance of automatic speech recognition (ASR) systems. To alleviate efforts for updating models, we propose a "...
Yuya Akita, Masato Mimura, Graham Neubig, Tatsuya ...
Abstract. Systems for keyword and non-linguistic vocalization detection in conversational agent applications need to be robust with respect to background noise and different speak...