Scalable similarity search is the core of many large scale learning or data mining applications. Recently, many research results demonstrate that one promising approach is creatin...
The rapid growth of the Internet over the last decade has been startling. However, efforts to track its growth have often fallen afoul of bad data -- for instance, how much traffi...
There has historically been very little concern with extrapolation in Machine Learning, yet extrapolation can be critical to diagnose. Predictor functions are almost always learne...
The primary goal of Web usage mining is the discovery of patterns in the navigational behavior of Web users. Standard approaches, such as clustering of user sessions and discoveri...
Co-clustering has emerged as an important technique for mining contingency data matrices. However, almost all existing coclustering algorithms are hard partitioning, assigning each...