CLEF
2011
Springer

Author Identification Using Semi-supervised Learning - Notebook for PAN at CLEF 2011

Author identification models fall into two major categories according to how they handle the training texts: profile-based models produce one representation per author, while instance-based models produce one representation per text. In this paper, we propose an approach that combines two well-known representatives of these categories, namely the Common n-Grams (CNG) method and a Support Vector Machine classifier based on character n-grams. The outputs of these two classifiers are combined to enrich the training set with additional documents in an iterative semi-supervised procedure inspired by the co-training algorithm. The evaluation results on closed-set author identification are encouraging, especially when the set of candidate authors is large.
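The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the function names (`profile`, `cng_distance`, `predict_profile`, `predict_instance`, `co_train`), the n-gram length of 3, the profile size of 500, and the rule that an unlabeled text is added only when both classifiers agree. To keep the example dependency-free, the instance-based SVM is replaced by a plain nearest-neighbour classifier using cosine similarity over character n-gram frequencies.

```python
from collections import Counter
import math

def ngrams(text, n=3):
    """Relative frequencies of the character n-grams in a text."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values()) or 1
    return {g: c / total for g, c in counts.items()}

def profile(text, n=3, size=500):
    """Profile: the `size` most frequent n-grams (size is an assumed value)."""
    freqs = ngrams(text, n)
    top = sorted(freqs, key=freqs.get, reverse=True)[:size]
    return {g: freqs[g] for g in top}

def cng_distance(p1, p2):
    """CNG-style dissimilarity: squared relative differences over the union."""
    total = 0.0
    for g in set(p1) | set(p2):
        f1, f2 = p1.get(g, 0.0), p2.get(g, 0.0)
        total += (2 * (f1 - f2) / (f1 + f2)) ** 2
    return total

def cosine(a, b):
    """Cosine similarity between two sparse frequency dictionaries."""
    dot = sum(v * b.get(g, 0.0) for g, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict_profile(train, text):
    """Profile-based: one profile per author, built from all of their texts."""
    by_author = {}
    for doc, author in train:
        by_author.setdefault(author, []).append(doc)
    target = profile(text)
    return min(by_author,
               key=lambda a: cng_distance(profile(" ".join(by_author[a])), target))

def predict_instance(train, text):
    """Instance-based stand-in for the SVM: nearest single training text."""
    target = ngrams(text)
    return max(train, key=lambda pair: cosine(ngrams(pair[0]), target))[1]

def co_train(train, unlabeled, rounds=5):
    """Repeatedly move unlabeled texts into the training set when both
    classifiers agree on the author (agreement rule is an assumption)."""
    train, pool = list(train), list(unlabeled)
    for _ in range(rounds):
        agreed = []
        for doc in pool:
            guess = predict_profile(train, doc)
            if guess == predict_instance(train, doc):
                agreed.append((doc, guess))
        if not agreed:
            break
        train.extend(agreed)
        added = {doc for doc, _ in agreed}
        pool = [doc for doc in pool if doc not in added]
    return train, pool
```

Calling `co_train(labeled_pairs, unlabeled_texts)` with a list of `(text, author)` pairs and a list of unlabeled texts returns the enlarged training set together with the texts that were never confidently labeled. A real system would substitute an actual SVM for the nearest-neighbour stand-in and tune the n-gram length and profile size on held-out data.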
Ioannis Kourtis, Efstathios Stamatatos
Added 18 Dec 2011
Updated 18 Dec 2011
Type Journal
Year 2011
Where CLEF