This paper proposes a method to automatically extract highlight scenes from sports (baseball) live video in real time and to allow users to retrieve them. For this purpose, sophis...
We introduce a direct model for speech recognition that assumes an unstructured, i.e., flat text output. The flat model allows us to model arbitrary attributes and dependences o...
Georg Heigold, Geoffrey Zweig, Xiao Li, Patrick Ng...
Speech recognition applications are known to require a significant amount of resources (memory, computing power). However, embedded speech recognition systems, such as in mobile p...
Mohamed Bouallegue, Driss Matrouf, Georges Linares
The paper describes an interface between generator and synthesizer of the German language concept-to-speech system VieCtoS. It discusses phenomena in German intonation that depe...
Hannes Pirker, Georg Niklfeld, Johannes Matiasek, ...
With the purpose of improving Spoken Language Understanding (SLU) performance, a combination of different acoustic speech recognition (ASR) systems is proposed. State a-posteriori...