WHIRL is an extensionof relational databasesthat canperform "soft joins" basedon the similarity of textual identifiers;thesesoftjoins extendthe traditional operationof j...
We illustrate that Web searches can often be utilized to generate background text for use with text classification. This is the case because there are frequently many pages on the...
In this paper we generalize the LARS feature selection method to the linear SVM model, derive an efficient algorithm for it, and empirically demonstrate its usefulness as a featur...
The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identi...
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas...
Automatic text classification is an important operational problem in digital library practice. Most text classification efforts so far concentrated on developing centralized solut...