Many scalable data mining tasks rely on active learning to provide the most useful accurately labeled instances. However, what if there are multiple labeling sources (`oracles...
We study the problem of correlating micro-blogging activity with stock-market events, defined as changes in the price and traded volume of stocks. Specifically, we collect messa...
Eduardo J. Ruiz, Vagelis Hristidis, Carlos Castill...
We present algorithms for fast quantile and frequency estimation in large data streams using graphics processing units (GPUs). We exploit the high computational power and memory ba...
Naga K. Govindaraju, Nikunj Raghuvanshi, Dinesh Ma...
We present an unusual algorithm involving classification trees (CARTwheels) where two trees are grown in opposite directions so that they are joined at their leaves. This approach ...
Document classification is a key task for many text mining applications. However, traditional text classification requires labeled data to construct reliable and accurate classifie...