Many time series data mining problems require subsequence similarity search as a subroutine. While this can be performed with any distance measure, and dozens of distance measures ...
Doruk Sart, Abdullah Mueen, Walid A. Najjar, Eamon...
In the last years there has been a huge growth and consolidation of the Data Mining field. Some efforts are being done that seek the establishment of standards in the area. Includ...
In this paper, we address the problem of mining transliterations of Named Entities (NEs) from large comparable corpora. We leverage the empirical fact that multilingual news artic...
Raghavendra Udupa, K. Saravanan, A. Kumaran, Jagad...
In semi-supervised classification boosting, a similarity measure is demanded in order to measure the distance between samples (both labeled and unlabeled). However, most of the e...
Roget’s Thesaurus has not been sufficiently appreciated in Natural Language Processing. We show that Roget's and WordNet are birds of a feather. In a few typical tests, we ...