Sciweavers

483 search results - page 27 / 97
» Sampling the Web as Training Data for Text Classification
AAAI
2008
15 years 5 months ago
Text Categorization with Knowledge Transfer from Heterogeneous Data Sources
Multi-category classification of short dialogues is a common task performed by humans. When assigning a question to an expert, a customer service operator tries to classify the cu...
Rakesh Gupta, Lev-Arie Ratinov
WWW
2009
ACM
16 years 3 months ago
Link based small sample learning for web spam detection
Robust statistical learning based web spam detection systems often require large amounts of labeled training data. However, labeled samples are more difficult, expensive and time ...
Guanggang Geng, Qiudan Li, Xinchang Zhang
ICML
2003
IEEE
16 years 3 months ago
Learning on the Test Data: Leveraging Unseen Features
This paper addresses the problem of classification in situations where the data distribution is not homogeneous: Data instances might come from different locations or times, and t...
Benjamin Taskar, Ming Fai Wong, Daphne Koller
SDM
2008
SIAM
15 years 4 months ago
Type-Independent Correction of Sample Selection Bias via Structural Discovery and Re-balancing
Sample selection bias is a common problem in many real world applications, where training data are obtained under realistic constraints that make them follow a different distribut...
Jiangtao Ren, Xiaoxiao Shi, Wei Fan, Philip S. Yu
ICDM
2003
IEEE
15 years 8 months ago
Mining Relevant Text from Unlabelled Documents
Automatic classification of documents is an important area of research with many applications in the fields of document searching, forensics and others. Methods to perform class...
Daniel Barbará, Carlotta Domeniconi, Ning K...