Abstract. It is difficult to track, parse and model human-computer interactions during editing and revising of documents, but it is necessary if we are to develop automated technol...
Stochastic grammar has been used in many video analysis and event recognition applications as an efficient model to represent large-scale video activity. However, in previous works...
In this paper, we explore the idea of using only pose, without utilizing any temporal information, for human action recognition. In contrast to the other studies using complex acti...
In this paper we explore the idea of using high-level semantic concepts, also called attributes, to represent human actions from videos and argue that attributes enable the constr...
— In this work we introduce a novel approach for detecting spatiotemporal object-action relations, leading to both, action recognition and object categorization. Semantic scene g...
Eren Erdal Aksoy, Alexey Abramov, Florentin Wö...