We present an algorithm for video retrieval that fuses the decisions of multiple retrieval agents in both text and image modalities. While the normalization and combination of evi...
As the consequence of semantic gap, visual similarity does not guarantee semantic similarity, which in general is conflicting with the inherent assumption of many generativebased ...
Videotext refers to text superimposed on video frames. A videotext based Multimedia Description Scheme has recently been adopted into the MPEG-7 standard. A study of published wor...
The multimodal nature of speech is often ignored in human-computer interaction, but lip deformations and other body motion, such as those of the head, convey additional information...
Iain Matthews, Timothy F. Cootes, J. Andrew Bangha...
Image classification or annotation is proved difficult for the computer algorithms. The Naive-Bayes Nearest Neighbor method is proposed to tackle the problem, and achieved the stat...