Word Sense Induction (WSI) is the task of identifying the different senses (uses) of a target word in a given text. Traditional graph-based approaches create and then cluster a gra...
We present a large-scale analysis of the content of weblogs dating back to the release of the Blogger program in 1999. Over one million blogs were analyzed from their conception t...
This paper describes a system for efficient indexing and retrieval of words in collections of document images. The proposed method is based on two main principles: unsupervised pr...
Traditional clustering is a descriptive task that seeks to identify homogeneous groups of objects based on the values of their attributes. While domain knowledge is always the bes...
Topic description is as important as topic detection. In this paper, we propose a novel method to describe Web topics with topic words. Under the assumption that representative wo...