Data cleaning is the process of correcting anomalies in a data source, that may for instance be due to typographical errors, or duplicate representations of an entity. It is a cruc...
The aim of this short paper is to present a general method of using background knowledge to impose constraints in conceptual clustering of object-attribute relational data. The pr...
Increasingly large text datasets and the high dimensionality associated with natural language create a great challenge in text mining. In this research, a systematic study is cond...
M. Mahdi Shafiei, Singer Wang, Roger Zhang, Evange...
This paper addresses the issue of Web document summarization. As textual content of Web documents is often scarce or irrelevant and existing summarization techniques are based on ...