Sciweavers

» Sampling the Web as Training Data for Text Classification
KDD
2002
ACM
14 years 5 months ago
Combining clustering and co-training to enhance text classification using unlabelled data
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for...
Bhavani Raskutti, Herman L. Ferrá, Adam Kow...
ECAI
2006
Springer
13 years 9 months ago
Text Sampling and Re-Sampling for Imbalanced Authorship Identification Cases
Authorship identification can be seen as a single-label multi-class text categorization problem. Very often, there are extremely few training texts at least for some of the candida...
Efstathios Stamatatos
WCE
2007
13 years 6 months ago
A Comparison of Classification Techniques for Technical Text Passages
Our work explores the use of several text categorization techniques for classification of manufacturing quality defect and service shop data sets into fixed categories. Althoug...
Mark M. Kornfein, Helena Goldfarb
ICDAR
2005
IEEE
13 years 11 months ago
Enhancing Training Data for Handwriting Recognition of Whiteboard Notes with Samples from a Different Database
Recognition of unconstrained handwritten text is still a challenge. In this paper we consider a new problem, which is the recognition of notes written on a whiteboard. Our recogni...
Marcus Liwicki, Horst Bunke
IJCAI
2003
13 years 6 months ago
Integrating Background Knowledge Into Text Classification
We present a description of three different algorithms that use background knowledge to improve text classifiers. One uses the background knowledge as an index into the set of tra...
Sarah Zelikovitz, Haym Hirsh