Existing data-stream clustering algorithms such as CluStream are based on k-means. These clustering algorithms are incompetent to find clusters of arbitrary shapes and cannot hand...
Clustering with constraints is an emerging area of data mining research. However, most work assumes that the constraints are given as one large batch. In this paper we explore the...
An essential part of an expert-finding task, such as matching reviewers to submitted papers, is the ability to model the expertise of a person based on documents. We evaluate seve...
Multinomial distributions over words are frequently used to model topics in text collections. A common, major challenge in applying all such topic models to any text mining proble...
We address the task of learning rankings of documents from search engine logs of user behavior. Previous work on this problem has relied on passively collected clickthrough data. ...