In this paper, we present an algorithm that can classify large-scale text data with high classification quality and fast training speed. Our method is based on a novel extension o...
Dong Zhuang, Benyu Zhang, Qiang Yang, Jun Yan, Zhe...
Text classification poses some specific challenges. One such challenge is its high dimensionality where each document (data point) contains only a small subset of them. In this pap...
The Knowledge Discovery Toolbox (KDT) enables domain experts to perform complex analyses of huge datasets on supercomputers using a high-level language without grappling with the ...
A mapping of unit vectors onto a 5D hypersphere is used to model and partition ODFs from HARDI data. This mapping has a number of useful and interesting properties and we make a li...
The key problem to be faced when building a HMM-based continuous speech recogniser is maintaining the balance between model complexity and available training data. For large vocab...