In the standard formalization of supervised learning problems, a datum is represented as a vector of features without prior knowledge about relationships among features. However, ...
Without the proliferation of formal semantic annotations, the Semantic Web is certainly doomed to failure. In earlier work we presented a new paradigm to avoid this: the 'Sel...
Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
XML database systems have emerged as a result of the acceptance of the XML data model. Recent work has followed the promising approach of building XML database management systems on un...
We propose a programming paradigm called compress-and-conquer (CC) that leads to optimal performance on multicore platforms. Given a multicore system of p cores and a problem of s...