In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
We propose a novel hashing scheme for image retrieval,
clustering and automatic object discovery. Unlike commonly
used bag-of-words approaches, the spatial extent of
image featu...
Ondrej Chum (Czech Technical University), Michal P...
We present a new statistical compression method, which we call Phrase Based Dense Code (PBDC), aimed at compressing large digital libraries. PBDC compresses the text collection to ...
Abstract. Automated Text Categorization has reached the levels of accuracy of human experts. Provided that enough training data is available, it is possible to learn accurate autom...
Community Question Answering (CQA) has emerged as a popular forum for users to pose questions for other users to answer. Over the last few years, CQA portals such as Naver and Yah...