Sciweavers

483 search results - page 24 / 97
» Sampling the Web as Training Data for Text Classification
ICMLA
2007
15 years 4 months ago
Semi-Supervised Active Learning for Modeling Medical Concepts from Free Text
We apply a new active learning formulation to the problem of learning medical concepts from unstructured text. The new formulation is based on maximizing the mutual information th...
Rómer Rosales, Praveen Krishnamurthy, R. Bh...
ICASSP
2009
IEEE
15 years 9 months ago
Filtering web text to match target genres
In language modeling for speech recognition, both the amount of training data and the match to the target task impact the goodness of the model, with the trade-off usually favorin...
Marius A. Marin, Sergey Feldman, Mari Ostendorf, M...
COMPSAC
2005
IEEE
15 years 8 months ago
Recovering "Lack of Words" in Text Categorization for Item Banks
PKIP, Patterned Keywords in Phrase, is our feature selection approach to text categorization (TC) for item banks. An item bank is a collection of textual data in which each item c...
Atorn Nuntiyagul, Nick Cercone, Kanlaya Naruedomku...
ICFCA
2009
Springer
15 years 24 days ago
A Concept Lattice-Based Kernel for SVM Text Classification
Standard Support Vector Machines (SVM) text classification relies on bag-of-words kernel to express the similarity between documents. We show that a document lattice can ...
Claudio Carpineto, Carla Michini, Raffaele Nicolus...
AIPRF
2007
15 years 4 months ago
Evaluation of Different Approaches to Training a Genre Classifier
This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected ba...
Vedrana Vidulin, Mitja Lustrek, Matjaz Gams