In this paper, we study search bot traffic from search engine query logs at a large scale. Although bots that generate search traffic aggressively can be easily detected, a large ...
We introduce a robust and efficient framework called CLUMP (CLustering Using Multiple Prototypes) for unsupervised discovery of structure in data. CLUMP relies on finding multip...
In this paper, we formulate the problem of summarization of a dataset of transactions with categorical attributes as an optimization problem involving two objective functions - co...
In the wrapperapproachto feature subset selection, a searchfor an optimalset of features is madeusingthe induction algorithm as a black box. Theestimated future performanceof the ...
With the wide deployment of smart card automated fare collection (SCAFC) systems, public transit agencies have been benefiting from huge volume of transit data, a kind of sequent...
Rui Chen, Benjamin C. M. Fung, Bipin C. Desai, N&e...