Collecting supervised training data for automatic speech recognition (ASR) systems is both time consuming and expensive. In this paper we use the notion of virtual evidence in a g...
Automatic sentence segmentation of spoken language is an important precursor to downstream natural language processing. Previous studies combine lexical and prosodic features, but...
Recent research has shown that speech can be sparsely represented using a dictionary of speech segments spanning multiple frames, exemplars, and that such a sparse representation ...
The human voice is primarily a carrier of speech, but it also contains non-linguistic features unique to a speaker and indicative of various speaker demographics, e.g. gender, nat...
Automatic image annotation techniques that try to identify the objects in images usually need the images to be segmented first, especially when specifically annotating image reg...