A speech separation system is described in which sources are represented in a joint interaural time difference-fundamental frequency (ITD-F0) cue space. Traditionally, recurrent t...
We present an approach to detecting and recognizing spoken isolated phrases based solely on visual input. We adopt an architecture that first employs discriminative detection of ...
Kate Saenko, Karen Livescu, Michael Siracusa, Kevi...
Large vocabulary automatic speech recognition (ASR) technologies perform well in known and controlled contexts. In less controlled conditions, however, human review is often neces...
The standard approach to speaker verification is to extract cepstral features from the speech spectrum and model them by generative or discriminative techniques. We propose a nov...
We analyze and compare two different methods for unsupervised extractive spontaneous speech summarization in the meeting domain. Based on utterance comparison, we introdu...