The rapid globalization of Wikipedia is generating a parallel, multi-lingual corpus of unprecedented scale. Pages for the same topic in many different languages emerge both as a r...
LBR is a highly accurate classification algorithm, which lazily constructs a single Bayesian rule for each test instance at classification time. However, its computational complex...
Automatic protocol reverse engineering has recently received significant attention due to its importance to many security applications. However, previous methods are all limited in...
This work examines under what conditions compression methodologies can retain the outcome of clustering operations. We focus on the popular k-Means clustering algorithm and we dem...
Deepak S. Turaga, Michail Vlachos, Olivier Versche...
Discovering a representative set of theme patterns from a large amount of text for interpreting their meaning has always been concerned by researches of both data mining and inform...
Yongxin Tong, Shilong Ma, Dan Yu, Yuanyuan Zhang, ...