With massive book digitization efforts underway, there is a need for developing effective book retrieval strategies. This paper explores the relative contribution of different par...
Anchor text has been considered as a useful resource to complement the representation of target pages and is broadly used in web search. However, previous research only uses anchor...
In this paper, we study search bot traffic from search engine query logs at a large scale. Although bots that generate search traffic aggressively can be easily detected, a large ...
Most classification algorithms are best at categorizing the Web documents into a few categories, such as the top two levels in the Open Directory Project. Such a classification me...
Though both quantity and quality of semantic concept detection in video are continuously improving, it still remains unclear how to exploit these detected concepts as semantic ind...