Sciweavers

WWW
2009
ACM

Link based small sample learning for web spam detection

A robust web spam detection system based on statistical learning often requires large amounts of labeled training data. However, labeled samples are more difficult, expensive, and time-consuming to obtain than unlabeled ones. This paper proposes link-based semi-supervised learning algorithms that boost classifier performance by integrating traditional self-training with link learning based on topological dependency. Experiments with a small number of labeled samples on the standard WEBSPAM-UK2006 benchmark show that the algorithms are effective.
Categories and Subject Descriptors H.5.4 [Information Interfaces and Presentation]: Hypertext/Hypermedia; K.4.m [Computer and Society]: Miscellaneous; H.4.m [Information Systems]: Miscellaneous
General Terms Measurement, Experimentation, Algorithms
Keywords Link spam, Content spam, Web spam, Machine learning
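The abstract does not give the algorithms in detail, so the following is only a minimal sketch of the general idea of combining self-training with link information, not the authors' method. Everything in it is an assumption made for illustration: the one-dimensional feature, the nearest-centroid base classifier, the equal weighting of a host's own score and its neighbours' mean score, the `threshold` parameter, and all function and host names.

```python
from typing import Dict, List


def train_centroids(features: Dict[str, float], labels: Dict[str, int]) -> Dict[int, float]:
    """Mean feature value per class (0 = normal, 1 = spam).

    Assumes the labeled set contains at least one host of each class.
    """
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for host, y in labels.items():
        sums[y] += features[host]
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in (0, 1)}


def spam_score(x: float, centroids: Dict[int, float]) -> float:
    """Score in [0, 1]; closer to the spam centroid gives a higher score."""
    d_normal = abs(x - centroids[0])
    d_spam = abs(x - centroids[1])
    if d_normal + d_spam == 0:
        return 0.5
    return d_normal / (d_normal + d_spam)


def link_self_training(
    features: Dict[str, float],
    links: Dict[str, List[str]],
    seed_labels: Dict[str, int],
    threshold: float = 0.8,
    max_rounds: int = 10,
) -> Dict[str, int]:
    """Self-training loop whose confidence also depends on linked hosts.

    Hypothetical scheme: each round, an unlabeled host's score is the
    average of its own classifier score and the mean score of the hosts
    it links to. Hosts scoring >= threshold are added as spam, hosts
    scoring <= 1 - threshold as normal; the rest stay unlabeled.
    """
    labels = dict(seed_labels)
    for _ in range(max_rounds):
        centroids = train_centroids(features, labels)
        # Labeled hosts contribute their known label as a hard score.
        scores = {
            h: float(labels[h]) if h in labels else spam_score(features[h], centroids)
            for h in features
        }
        newly_labeled = {}
        for host in features:
            if host in labels:
                continue
            neighbours = links.get(host, [])
            if neighbours:
                neighbour_mean = sum(scores[n] for n in neighbours) / len(neighbours)
                combined = 0.5 * scores[host] + 0.5 * neighbour_mean
            else:
                combined = scores[host]
            if combined >= threshold:
                newly_labeled[host] = 1
            elif combined <= 1 - threshold:
                newly_labeled[host] = 0
        if not newly_labeled:
            break  # no confident predictions left; stop early
        labels.update(newly_labeled)
    return labels


# Toy data (invented): two seed hosts, three unlabeled hosts.
features = {"a": 0.9, "b": 0.1, "c": 0.85, "d": 0.15, "e": 0.5}
links = {"c": ["a"], "d": ["b"], "e": ["c", "a"]}
seeds = {"a": 1, "b": 0}

result = link_self_training(features, links, seeds)
print(result)
```

In this toy run, hosts `c` and `d` receive labels because both their own features and their linked seeds agree, while `e` stays unlabeled at the default threshold since its own feature is ambiguous. Lowering `threshold` lets link evidence alone tip such borderline hosts, at the cost of more label noise in later rounds.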
Guanggang Geng, Qiudan Li, Xinchang Zhang
Added 21 Nov 2009
Updated 21 Nov 2009
Type Conference
Year 2009
Where WWW
Authors Guanggang Geng, Qiudan Li, Xinchang Zhang