Multiple data sources containing different types of features may be available for a given task. For instance, users’ profiles can be used to build recommendation systems. In a...
Abstract. The recognition of events in videos is a relevant and challenging task of automatic semantic video analysis. At present one of the most successful frameworks, used for ob...
Lamberto Ballan, Marco Bertini, Alberto Del Bimbo,...
Many dramatizations have depicted a fully automated home living environment, where actions and events are understood or even anticipated. While the realization of such environment...
We present a novel dataset and novel algorithms for the problem of detecting activities of daily living (ADL) in firstperson camera views. We have collected a dataset of 1 millio...
Spoken language is one of the most intuitive forms of interaction between humans and agents. Unfortunately, agents that interact with people using natural language often experienc...