Large corpora are essential to modern methods of computational linguistics and natural language processing. In this paper, we describe an ongoing project whose aim is to build a l...
We introduce a method for image retrieval that leverages the implicit information about object importance conveyed by the list of keyword tags a person supplies for an image. We p...
This paper presents a framework for user-oriented text mining. It is then illustrated with an example of discovering knowledge from competitors’ websites. The knowledge to be di...
In this paper, we present a series of semantic analyses of words in political blogs in the setting of categorization of two opposite political orientations: liberal vs. conservati...
How to assign appropriate weights to terms is one of the critical issues in information retrieval. Many term weighting schemes are unsupervised. They are either based on the empir...