Automatic Language Identification (LID) in music has received significantly less attention than LID in speech. Here, we study the problem of LID in music videos uploaded on YouT...
Vijay Chandrasekhar, Mehmet Emre Sargin, David A. ...
Recently, great interest has been shown in the visual surveillance of public transportation systems. The challenge is the automated analysis of passengers’ behaviors with a set o...
Annotation of multimedia documents has typically been pursued in two different directions. Previous approaches have either focused on low-level descriptors, such as dominant colo...
Many successful models for predicting attention in a scene involve three main steps: convolution with a set of filters, a center-surround mechanism and spatial pooling to constru...
Naila Murray, Maria Vanrell, Xavier Otazu, C. Alej...
The conventional Difference of Gaussian (DOG) filter is usually used to model the early stage of visual processing. However, the convolution operation used with DOG does not explicitly a...