The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
Ambiguity of entity mentions and concept references is a challenge to mining text beyond surface-level keywords. We describe an effective method of disambiguating surface forms an...
Yiping Zhou, Lan Nie, Omid Rouhani-Kalleh, Flavian...
Statistical topic models such as the Latent Dirichlet Allocation (LDA) have emerged as an attractive framework to model, visualize and summarize large document collections in a co...
Ramesh Nallapati, Amr Ahmed, William W. Cohen, Eri...
In this paper, a Random Field Topic (RFT) model is proposed for semantic region analysis from motions of objects in crowded scenes. Different from existing approaches of learning ...
Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult....