We present Image2Emoji, a multi-modal approach for generating emoji labels for an image in a zero-shot manner. Different from existing zero-shot image-to-text approaches, we expl...
Spencer Cappallo, Thomas Mensink, Cees G. M. Snoek
This paper considers the problem of providing users playing one streaming video the option of instantaneous and seamless playback of alternative videos. Recommendation systems can...
This paper aims at robust and efficient pose tracking for augmented reality on modern smartphones. Existing methods, relying on either vision analysis or motion sensing, are eithe...
This paper presents OmniViewer, a multi-modal 3D video streaming system based on Dynamic Adaptive Streaming over HTTP (DASH) standard. OmniViewer allows users to view arbitrary si...
The tradeoff relationship between resource requirement, content complexity, and user satisfaction is magnified when more and more modern 3D Tele-immersive (3DTI) applications with...