Sciweavers

TAL
2010
Springer
13 years 2 months ago
Summarization as Feature Selection for Document Categorization on Small Datasets
Abstract. Most common feature selection techniques for document categorization are supervised and require lots of training data in order to accurately capture the descriptive and d...
Emmanuel Anguiano-Hernández, Luis Villaseñ...
ML
2000
ACM
13 years 4 months ago
Randomizing Outputs to Increase Prediction Accuracy
Bagging and boosting reduce error by changing both the inputs and outputs to form perturbed training sets, grow predictors on these perturbed training sets and combine them. A que...
Leo Breiman
COLING
2000
13 years 5 months ago
Word Sense Disambiguation of Adjectives Using Probabilistic Networks
In this paper, word sense disambiguation (WSD) accuracy achievable by a probabilistic classifier, using very minimal training sets, is investigated. We made the assumption that...
Gerald Chao, Michael G. Dyer
BMVC
2001
13 years 6 months ago
An Information Theoretic Approach to Statistical Shape Modelling
Statistical shape models have been used widely as a basis for segmenting and interpreting images. A major drawback of the approach is the need to establish a set of dense correspo...
Rhodri H. Davies, Timothy F. Cootes, Carole J. Twi...
DRR
2010
13 years 6 months ago
Time and space optimization of document content classifiers
Scaling up document-image classifiers to handle an unlimited variety of document and image types poses serious challenges to conventional trainable classifier technologies. Highly...
Dawei Yin, Henry S. Baird, Chang An
CIKM
2006
Springer
13 years 8 months ago
Performance thresholding in practical text classification
In practical classification, there is often a mix of learnable and unlearnable classes and only a classifier above a minimum performance threshold can be deployed. This problem is...
Hinrich Schütze, Emre Velipasaoglu, Jan O. Pe...
ICDAR
2005
IEEE
13 years 10 months ago
Enhancing Training Data for Handwriting Recognition of Whiteboard Notes with Samples from a Different Database
Recognition of unconstrained handwritten text is still a challenge. In this paper we consider a new problem, which is the recognition of notes written on a whiteboard. Our recogni...
Marcus Liwicki, Horst Bunke
MICAI
2007
Springer
13 years 10 months ago
Taking Advantage of the Web for Text Classification with Imbalanced Classes
A problem of supervised approaches for text classification is that they commonly require high-quality training data to construct an accurate classifier. Unfortunately, in many real...
Rafael Guzmán-Cabrera, Manuel Montes-y-Gó...
PKDD
2009
Springer
13 years 11 months ago
Sparse Kernel SVMs via Cutting-Plane Training
We explore an algorithm for training SVMs with Kernels that can represent the learned rule using arbitrary basis vectors, not just the support vectors (SVs) from the training set. ...
Thorsten Joachims, Chun-Nam John Yu
MLDM
2009
Springer
13 years 11 months ago
Improved Comprehensibility and Reliability of Explanations via Restricted Halfspace Discretization
A number of two-class classification methods first discretize each attribute of two given training sets and then construct a propositional DNF formula that evaluates to True for ...
Klaus Truemper