Segmenting movies into semantically correlated units is a tedious task due to the “semantic gap”. Low-level features do not provide useful information about the semantical...
In this paper, we propose a novel scene categorization method based on contextual visual words. In this method, we extend the traditional ‘bag of visual words’ model by introd...
System-level computer architecture simulations create large volumes of simulation data to explore alternative architectural solutions. Interpreting and drawing conclusions from thi...
Audio-visual speaker diarisation is the task of estimating “who spoke when” using audio and visual cues. In this paper we propose the combination of an audio diarisation syste...
This paper presents a novel audio-visual approach for unsupervised speaker locationing. Using recordings from a single, low-resolution room overview camera and a single f...