The Broadcast News Editor (BNE) and Broadcast News Navigator (BNN) are fully implemented systems that exploit integrated image, speech, and language processing to support intellig...
Images and videos can be indexed by multiple features at different levels, such as color, texture, motion, and text annotation. Organizing this information into a system so that u...
In realizing a video retrieval system, the crucial point is how to provide effective access to video content. This paper focuses on Japanese cooking instruction utterance...
Grounded language models represent the relationship between words and the non-linguistic context in which they are said. This paper describes how they are learned from large corpo...
With fast-growing speech technologies, the world is entering a new speech era. Speech recognition has now become a practical technology for real-world applications. While so...