We consider the problem of improving named entity recognition (NER) systems by using external dictionaries--more specifically, the problem of extending state-of-the-art NER system...
Overall performance of the data mining process depends not just on the value of the induced knowledge but also on various costs of the process itself such as the cost of acquiring...
Background: The large gap between the number of protein sequences in databases and the number of functionally characterized proteins calls for the development of a fast computatio...
Existing hierarchical summarization techniques fail to provide synopses good in terms of relative-error metrics. This paper introduces multiplicative synopses: a summarization par...
The problem of privacy-preserving data mining has been studied extensively in recent years because of the increased amount of personal information which is available to corporation...