This paper takes phonetic information into account for data alignment in text-independent voice conversion. Hidden Markov Models are used for representing the phonetic structure o...
Meng Zhang, Jiaohua Tao, Jani Nurminen, Jilei Tian...
A key problem in video content analysis using dynamic graphical models is to learn a suitable model structure given some observed visual data. We propose a Completed Likelihood AI...
Motion trajectories provide rich spatio-temporal information about an object's activity. The trajectory information can be obtained using a tracking algorithm on data streams ...
Faisal I. Bashir, Ashfaq A. Khokhar, Dan Schonfeld
We propose a robust scene recognition framework using scene context information for multimedia contents. Multimedia contents consist of scene sequences that are more likely to hap...
Abstract. Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) are local in space and time and closely related to a biological model of memory in the prefrontal cortex. N...