This paper presents the Topic-Aspect Model (TAM), a Bayesian mixture model which jointly discovers topics and aspects. We broadly define an aspect of a document as a characteristi...
Abstract Clustering text data streams is an important issue in data mining community and has a number of applications such as news group filtering, text crawling, document organiza...
Uncertainty in categorical data is commonplace in many applications, including data cleaning, database integration, and biological annotation. In such domains, the correct value o...
Sarvjeet Singh, Chris Mayfield, Sunil Prabhakar, R...
In this paper we present tools that provide an easy way to edit XML content directly on the web, with the usual benefit of valid XML content. These tools make it possible to crea...
Entity linkage is central to almost every data integration and data cleaning scenario. Traditional techniques use some computed similarity among data structure to perform merges a...
Ekaterini Ioannou, Wolfgang Nejdl, Claudia Nieder&...