Abstract--Imbalanced data sets present a particular challenge to the data mining community. Often, it is the rare event that is of interest and the cost of misclassifying the rare ...
Given a database with missing or uncertain content, our goal is to correct and fill the database by extracting specific information from a large corpus such as the Web, and to d...
The firehose of data generated by users on social networking and microblogging sites such as Facebook and Twitter is enormous. Real-time analytics on such data is challenging wit...
When attacking the German Enigma cipher machine during the 1930s, the Polish mathematician Marian Rejewski developed a catalog of disjoint cycles of permutations generated by Enigm...
The graph classification problem is learning to classify separate, individual graphs in a graph database into two or more categories. A number of algorithms have been introduced fo...
Nikhil S. Ketkar, Lawrence B. Holder, Diane J. Coo...