The Web is increasingly becoming an important channel for conducting businesses, disseminating information, and communicating with people on a global scale. More and more companie...
This work introduces a new family of link-based dissimilarity measures between nodes of a weighted directed graph. This measure, called the randomized shortest-path (RSP) dissimil...
Luh Yen, Marco Saerens, Amin Mantrach, Masashi Shi...
Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need take emails as input. Email data is usually noisy and thus i...
We consider the problem of selecting an optimal set of sensors, as determined, for example, by the predictive accuracy of the resulting sensor network. Given an underlying metric ...
Roman Garnett, Michael A. Osborne, Stephen J. Robe...
Traditional relation extraction methods require pre-specified relations and relation-specific human-tagged examples. Bootstrapping systems significantly reduce the number of tr...
Jun Zhu, Zaiqing Nie, Xiaojiang Liu, Bo Zhang, Ji-...