Sciweavers

483 search results for "Sampling the Web as Training Data for Text Classification"
ICDM
2010
IEEE
147 views, Data Mining
14 years 9 months ago
Location and Scatter Matching for Dataset Shift in Text Mining
Dataset shift from the training data in a source domain to the data in a target domain poses a great challenge for many statistical learning methods. Most algorithms can be viewed ...
Bo Chen, Wai Lam, Ivor W. Tsang, Tak-Lam Wong
ACL
2009
14 years 9 months ago
Co-Training for Cross-Lingual Sentiment Classification
The lack of Chinese sentiment corpora limits the research progress on Chinese sentiment classification. However, there are many freely available English sentiment corpora on the W...
Xiaojun Wan
KDD
2008
ACM
178 views, Data Mining
16 years 3 days ago
Training structural SVMs with kernels using sampled cuts
Discriminative training for structured outputs has found increasing applications in areas such as natural language processing, bioinformatics, information retrieval, and computer ...
Chun-Nam John Yu, Thorsten Joachims
LREC
2010
176 views, Education
15 years 1 month ago
There's no Data like More Data? Revisiting the Impact of Data Size on a Classification Task
In the paper we investigate the impact of data size on a Word Sense Disambiguation task (WSD). We question the assumption that the knowledge acquisition bottleneck, which is known...
Ines Rehbein, Josef Ruppenhofer
CIKM
2009
ACM
15 years 3 months ago
A co-classification framework for detecting web spam and spammers in social media web sites
Social media are becoming increasingly popular and have attracted considerable attention from spammers. Using a sample of more than ninety thousand known spam Web sites, we found ...
Feilong Chen, Pang-Ning Tan, Anil K. Jain