We propose a semi-supervised model that segments and annotates images using very few labeled images and a large unaligned text corpus to relate image regions to text labels. Give...
In real-world applications, “what you saw” during training is often not “what you get” during deployment: the distribution and even the type and dimensionality of features...
Integration of goal-driven, top-down attention and image-driven, bottom-up attention is crucial for visual search. Yet, previous research has mostly focused on models that are pur...
Document decomposition is a basic but crucial step for many document-related applications. This paper proposes a novel approach to decompose document images into zones. I...
One of the assumptions of current software for visualizing architecture is that the underlying geometry is a correct, objective, and complete representation of the objects in ques...