Detection of near-duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
Cyber attacks against networked computers have become relentless in recent years. The most common attack method is to exploit memory corruption vulnerabilities such as buffer ove...
Jun Xu, Peng Ning, Chongkyung Kil, Yan Zhai, Chris...
In recent years, random projection has been used as a valuable tool for performing dimensionality reduction of high-dimensional data. Starting with the seminal work of Johnson and...
We address the privacy-preserving classification problem in a distributed system. Randomization has been the approach proposed to preserve privacy in such a scenario. However, this appr...
How closely related are two nodes in a graph? How to compute this score quickly, on huge, disk-resident, real graphs? Random walk with restart (RWR) provides a good relevance scor...