Speech communication consists of three steps: production, transmission, and hearing. Every step inevitably involves acoustic distortions due to gender differences, age, microphone...
Speech recognition transcripts are far from perfect; they are not of sufficient quality to be useful on their own for spoken document retrieval. This is especially the case for c...
Updating acoustic and language models is vital to maintaining the performance of automatic speech recognition (ASR) systems. To reduce the effort of updating models, we propose a "...
Yuya Akita, Masato Mimura, Graham Neubig, Tatsuya ...
The production of closed captions is an important but expensive process in video broadcasting. We propose a method to generate highly accurate off-line captions efficiently. Our s...
Prosodic information has been successfully used for speaker recognition for more than a decade. The best-performing prosodic system to date has been one based on features extracte...
Luciana Ferrer, Nicolas Scheffer, Elizabeth Shribe...