We investigate methods of segmenting, visualizing, and indexing presentation videos by both audio and visual data. The audio track is segmented by speaker, and augmented with key ...
Speech, gesture, and visual feature recognition are important interface capabilities for embedded mobile systems. Perception algorithms have many traits in common with more conv...
Enabling machines to understand the emotions and feelings that human users express in natural-language text input during interaction is a challenging issue in Human Computing. Our w...
Li Zhang, Marco Gillies, John A. Barnden, Robert J...
We consider the problem of PAC-learning distributions over strings, represented by probabilistic deterministic finite automata (PDFAs). PDFAs are a probabilistic model for the gen...
In the context of deployed spoken dialogue telecom services, we introduce a preprocessor called Fiction into the Spoken Language Understanding (SLU) component. It acts as an inter...