As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. Few users wish to retri...
As the size and dimensionality of data sets increase, the task of feature selection has become increasingly important. In this paper we demonstrate how association rules can be us...
Abstract. Traditionally, distinguishing between high quality professional photos and low quality amateurish photos is a human task. To automatically assess the quality of a photo t...
Large databases with uncertain information are becoming more common in many applications including data integration, location tracking, and Web search. In these applications, ranki...
A wealth of information is available only in web pages, patents, publications etc. Extracting information from such sources is challenging, both due to the typically complex langu...