ASUNAM
2010
IEEE

Semi-Supervised Classification of Network Data Using Very Few Labels

The goal of semi-supervised learning (SSL) methods is to reduce the amount of labeled training data required by learning from both labeled and unlabeled instances. Macskassy and Provost [1] proposed the weighted-vote relational neighbor classifier (wvRN) as a simple yet effective baseline for semi-supervised learning on network data. wvRN is similar to many recent graph-based SSL methods (e.g., [2], [3]), has been shown to be essentially the same as the Gaussian-field classifier proposed by Zhu et al. [4], and is very effective on some benchmark network datasets. We describe another simple and intuitive semi-supervised learning method, based on random graph walks, that outperforms wvRN by a large margin on several benchmark datasets when very few labels are available. Additionally, we show that using authoritative instances as training seeds -- instances that arguably cost much less to label -- dramatically reduces the amount of labeled data required to achieve the same classification...
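The abstract does not spell out the random-walk method, so the following is only a minimal sketch of one plausible reading: run one random walk with restart per class, restarting at that class's labeled seed nodes, and assign each node to the class whose walk scores it highest. The function name `random_walk_classify`, the `damping` and `iters` parameters, and the toy graph are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def random_walk_classify(adj, seeds, damping=0.85, iters=100):
    """Label every node by the class whose seeded random walk ranks it highest.

    adj   : (n, n) symmetric adjacency matrix (hypothetical input format)
    seeds : dict mapping class label -> list of labeled node indices
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]

    # Column-normalize so each column sums to 1 (a column-stochastic matrix).
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # guard against isolated nodes
    transition = adj / col_sums

    classes = sorted(seeds)
    scores = np.zeros((n, len(classes)))
    for k, label in enumerate(classes):
        # Restart distribution: uniform over this class's seed nodes.
        restart = np.zeros(n)
        restart[seeds[label]] = 1.0 / len(seeds[label])

        # Power iteration for a random walk with restart.
        rank = restart.copy()
        for _ in range(iters):
            rank = (1 - damping) * restart + damping * (transition @ rank)
        scores[:, k] = rank

    # Each node takes the class with the highest walk score.
    return [classes[i] for i in scores.argmax(axis=1)]


# Toy graph (assumed for illustration): two triangles joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

labels = random_walk_classify(adj, {"a": [0], "b": [5]})
print(labels)
```

With one seed per class, every node in each triangle receives the label of the seed in its own triangle, which illustrates how a walk-based score can spread a very small number of labels across a cluster.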
Frank Lin, William W. Cohen
Added 26 Oct 2010
Updated 26 Oct 2010
Type Conference
Year 2010
Where ASUNAM