This paper conducts experiments with three skewed data sets to demonstrate the problems that arise when skewed data is used and to identify the counter-problems that arise when the data is balanced. The b...
Continuous speech input for ASR processing is usually presegmented into speech stretches by pauses. In this paper, we propose that smaller, prosodically defined units can be ident...
Yi-Fen Liu, Shu-Chuan Tseng, Jyh-Shing Roger Jang,...
Finding and removing outliers is an important problem in data mining. Errors in large databases can be extremely common, so an important property of a data mining algorithm is robus...
Most decision tree induction methods used for extracting knowledge in classification problems are unable to deal with uncertainties embedded within the data, associated with human...
Mass spectrometry of clinical specimens is used to identify biomarkers for diagnosis. Thus, a reliable method for both feature selection and classification is required...