Sciweavers

INTERSPEECH
2010
12 years 11 months ago
Multimodal speaker diarization using oriented optical flow histograms
Speaker diarization is the task of partitioning an input stream into speaker-homogeneous regions, or, in other words, determining "who spoke when." While approaches to t...
Mary Tai Knox, Gerald Friedland
INTERSPEECH
2010
12 years 11 months ago
Lexical entrainment of real users in the Let's Go spoken dialog system
This paper examines the lexical entrainment of real users in the Let's Go spoken dialog system. First it presents a study of the presence of entrainment in a year of human-tr...
Gabriel Parent, Maxine Eskenazi
INTERSPEECH
2010
12 years 11 months ago
Emotion recognition using imperfect speech recognition
This paper investigates the use of speech-to-text methods for assigning an emotion class to a given speech utterance. Previous work shows that an emotion extracted from text can c...
Florian Metze, Anton Batliner, Florian Eyben, Tim ...
INTERSPEECH
2010
12 years 11 months ago
Detection of hot spots in poster conversations based on reactive tokens of audience
We present a novel scheme for indexing "hot spots" in conversations, such as poster sessions, based on the reaction of the audience. Specifically, we focus on laughter ...
Tatsuya Kawahara, Kouhei Sumi, Zhi-Qiang Chang, Ka...
INTERSPEECH
2010
12 years 11 months ago
Evaluation of speaker mimic technology for personalizing SGD voices
In this paper, we demonstrate the use of state-of-the-art speech technology to transform speech from a source speaker to mimic a particular target speaker with the intention of pr...
Esther Klabbers, Alexander Kain, Jan P. H. van San...
INTERSPEECH
2010
12 years 11 months ago
Investigation of full-sequence training of deep belief networks for speech recognition
Recently, Deep Belief Networks (DBNs) have been proposed for phone recognition and were found to achieve highly competitive performance. In the original DBNs, only frame-level info...
Abdel-rahman Mohamed, Dong Yu, L. Deng
INTERSPEECH
2010
12 years 11 months ago
Improving monaural speaker identification by double-talk detection
This paper describes a novel approach to improve monaural speaker identification where two speakers are present in a single-microphone recording. The goal is to identify both of ...
Rahim Saeidi, Pejman Mowlaee, Tomi Kinnunen, Zheng...
INTERSPEECH
2010
12 years 11 months ago
Combination of probabilistic and possibilistic language models
In a previous paper we proposed Web-based language models relying on the possibility theory. These models explicitly represent the possibility of word sequences. In this paper we ...
Stanislas Oger, Vladimir Popescu, Georges Linar...