Visual tracking with moving cameras is a challenging task. The global motion induced by the moving camera moves the target object outside the expected search area, according to th...
In discriminative training, such as Maximum Mutual Information Estimation (MMIE) training, a word lattice is usually used as a compact representation of many different sentence hy...
The standard approach to speaker verification is to extract cepstral features from the speech spectrum and model them by generative or discriminative techniques. We propose a nov...
A distinction is usually made between wavelet bases and wavelet frames. The former are associated with a one-to-one representation of signals, which is somewhat constrained but mos...
The Bag-of-Visual-Words (BoW) image representation is becoming popular in the computer vision and multimedia communities. However, experiments show that the traditional BoW representation ...