During the last years, the use of string kernels that compare documents has been shown to achieve good results on text classification problems. In this paper we introduce the appl...
Text classification in the medical domain is a real world problem with wide applicability. This paper investigates extensively the effect of text representation approaches on the p...
Fathi H. Saad, Beatriz de la Iglesia, Duncan G. Be...
We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences...
Huma Lodhi, John Shawe-Taylor, Nello Cristianini, ...
WHIRL is an extensionof relational databasesthat canperform "soft joins" basedon the similarity of textual identifiers;thesesoftjoins extendthe traditional operationof j...
In this paper, we introduce a method for categorizing digital items according to their topic, only relying on the document's metadata, such as author name and title informati...