Document classification presents difficult challenges due to the sparsity and the high dimensionality of text data, and to the complex semantics of the natural language. The tradi...
The question of how to publish an anonymized search log was brought to the forefront by a well-intentioned, but privacy-unaware AOL search log release. Since then a series of ad-h...
Aleksandra Korolova, Krishnaram Kenthapadi, Nina M...
The output of a data mining algorithm is only as good as its inputs, and individuals are often unwilling to provide accurate data about sensitive topics such as medical history an...
Discriminative subgraphs are widely used to define the feature space for graph classification in large graph databases. Several scalable approaches have been proposed to mine disc...
The BioJournalMonitor is a decision support system for the analysis of trends and topics in the biomedical literature. Its main goal is to identify potential diagnostic and therap...