This paper describes a text-to-audiovisual speech synthesizer that incorporates head and eye movements. The face is modeled using a set of images of a human subject. Visemes...
Many visual search and matching systems represent images using sparse sets of "visual words": descriptors that have been quantized by assignment to the best-matching symb...
With rapid advances in graphics hardware and volume rendering techniques, many volumetric datasets can now be rendered in real time on a standard PC equipped with a commodi...
Many contemporary language technology systems are characterized by long pipelines of tools with complex dependencies. Too often, these workflows are implemented by ad hoc scripts;...
In this paper we present a prototype system that enriches audiovisual content with annotations by exploiting existing technologies for automatic metadata extraction (such as OCR...
Giuseppe Amato, Paolo Bolettieri, Franca Debole, F...