This paper establishes the theoretical framework of b-bit minwise hashing. The original minwise hashing method has become a standard technique for estimating set similarity (e.g.,...
We describe improvements to the use of semantic lexicons by a state-of-the-art query interpretation system powering a major search engine. We successfully compute concept label im...
We present the conceptual framework of the Social Honeypot Project for uncovering social spammers who target online communities, along with initial empirical results from Twitter and MySp...
We propose a novel method to automatically detect cultural differences around the world by using a large number of geotagged images on photo-sharing websites such as Flickr. W...
Recent progress in research fields such as Information Extraction and Information Retrieval enables the creation of systems providing better search experiences to web u...
Gianluca Demartini, Claudiu S. Firan, Mihai George...