Despite its widespread use, voicemail presents numerous usability challenges: People must listen to messages in their entirety, they cannot search by keywords, and audio files do ...
This paper presents a gesture based driven approach for video editing. Given a lecture video, we adopt novel approaches to automatically detect and synchronize its content with el...
Editing speech data is currently time-consuming and errorprone. Speech editors rely on acoustic waveform representations, which force users to repeatedly sample the underlying spe...
We present an overview of the data collection and transcription efforts for the COnversational Speech In Noisy Environments (COSINE) corpus. The corpus is a set of multi-party con...
Alex Stupakov, Evan Hanusa, Jeff A. Bilmes, Dieter...
This paper presents an interactive visualisation system that assists users of semi-automatic speech transcription systems to assess alternative recognition results in real time an...
Saturnino Luz, Masood Masoodian, Bill Rogers, Bo Z...