Random perturbation is a promising technique for privacy-preserving data mining. It retains an original sensitive value with a certain probability and replaces it with a random va...
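
The retain-or-replace step described in the abstract above can be illustrated with a minimal sketch. The abstract is truncated, so the replacement rule used here (a uniform draw from a known value domain) is an assumption, and the names `perturb`, `domain`, and `retain_prob` are hypothetical rather than taken from the paper.

```python
import random


def perturb(values, domain, retain_prob, rng=None):
    """Keep each value with probability retain_prob; otherwise replace it.

    Assumption: the replacement is drawn uniformly from `domain`. The
    source abstract is truncated and does not confirm this choice.
    """
    rng = rng or random.Random()
    perturbed = []
    for value in values:
        if rng.random() < retain_prob:
            # Retain the original sensitive value.
            perturbed.append(value)
        else:
            # Replace it with a random value from the domain.
            perturbed.append(rng.choice(domain))
    return perturbed


if __name__ == "__main__":
    ages = [23, 35, 41, 58]
    print(perturb(ages, domain=list(range(18, 91)), retain_prob=0.7))
```

A lower `retain_prob` hides more of the original data but leaves less signal for mining, which is the trade-off such schemes have to balance.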
We present a multi-dimensional indexing approach for fast sequence similarity search in DNA and protein databases. In particular, we propose effective transformations of subsequen...
In data stream applications, data arrive continuously and can only be scanned once as the query processor has very limited memory (relative to the size of the stream) to work with...
Nick Koudas, Beng Chin Ooi, Kian-Lee Tan, Rui Zhan...
This paper is an empirical investigation into the effectiveness of linear scaling adaptation for case-based software project effort prediction. We compare two variants of a linea...
Colin Kirsopp, Emilia Mendes, Rahul Premraj, Marti...
We present an Outlier Detection using Indegree Number (ODIN) algorithm that utilizes a k-nearest neighbour graph. Improvements to the existing kNN distance-based method are also propos...
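
As a rough illustration of the indegree idea named in the abstract above, the sketch below builds a brute-force k-nearest neighbour graph and flags points that few other points select as a neighbour. The function names, the Euclidean distance, and the rule "indegree at or below `threshold` means outlier" are assumptions made for this example; the truncated abstract does not specify them.

```python
import math


def knn_indegrees(points, k):
    """Return, for each point, how many other points list it among their k nearest.

    Brute-force O(n^2 log n) construction with Euclidean distance (an assumption).
    """
    n = len(points)
    indegree = [0] * n
    for i in range(n):
        # Sort all other points by distance to point i; ties break on index.
        neighbours = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for _, j in neighbours[:k]:
            indegree[j] += 1  # edge i -> j in the kNN graph
    return indegree


def odin_outliers(points, k, threshold):
    """Flag indices whose kNN-graph indegree is at or below `threshold` (assumed rule)."""
    indegree = knn_indegrees(points, k)
    return [i for i, degree in enumerate(indegree) if degree <= threshold]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(odin_outliers(data, k=2, threshold=0))
```

In this toy data set the four clustered points choose each other as neighbours, so the isolated point receives no incoming edges.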