A bipartite query-URL graph, where an edge indicates that a document was clicked for a query, is a useful construct for finding groups of related queries and URLs. Here we use thi...
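As a rough illustration of the construct described above, the following minimal sketch builds a bipartite query-URL graph from click pairs and groups queries and URLs that are linked through shared clicks. The grouping by connected components is an assumption made for illustration only, not the method of the abstract, and the function names `build_click_graph` and `connected_groups`, as well as the sample click log, are hypothetical.

```python
from collections import defaultdict, deque


def build_click_graph(clicks):
    """Build adjacency sets for a bipartite graph from (query, url) click pairs."""
    graph = defaultdict(set)
    for query, url in clicks:
        # Tag each node with its side so a query and a URL never collide.
        q_node = ("query", query)
        u_node = ("url", url)
        graph[q_node].add(u_node)
        graph[u_node].add(q_node)
    return graph


def connected_groups(graph):
    """Return (queries, urls) pairs, one per connected component (assumed grouping)."""
    seen = set()
    groups = []
    for start in list(graph):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        queries, urls = set(), set()
        while queue:
            node = queue.popleft()
            kind, value = node
            (queries if kind == "query" else urls).add(value)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append((queries, urls))
    return groups


# Hypothetical click log: each pair means the URL was clicked for the query.
clicks = [
    ("cheap flights", "example.com/flights"),
    ("airline tickets", "example.com/flights"),
    ("python tutorial", "example.org/python"),
]
groups = connected_groups(build_click_graph(clicks))
```

Here the two flight-related queries end up in one group because they share a clicked URL, while the unrelated query forms its own group. Real click graphs are usually dominated by one giant component, so a practical system would likely need edge weights or a finer clustering step.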
Sentiment classification is the task of automatically identifying whether a given piece of text expresses a positive or negative opinion towards the subject at hand. The prolif...
Topic models provide a powerful tool for analyzing large text collections by representing high-dimensional data in a low-dimensional subspace. Fitting a topic model given a set of...
Authorship attribution is the task of identifying the author of a given text. The main concern of this task is to define an appropriate characterization of documents that captures ...
In this study, we describe our system for the Intellectual Property track of the 2009 Cross-Language Evaluation Forum campaign (CLEF-IP). The CLEF-IP track addressed prior art searc...