KDD
2003
ACM

Style mining of electronic messages for multiple authorship discrimination: first results

This paper considers the use of computational stylistics for performing authorship attribution of electronic messages, addressing categorization problems with as many as 20 different classes (authors). Effective stylistic characterization of text is potentially useful for a variety of tasks, as language style contains cues regarding the authorship, purpose, and mood of the text, all of which would be useful adjuncts to information retrieval or knowledge-management tasks. We focus here on the problem of determining the author of an anonymous message, based only on the message text. Several multiclass variants of the Winnow algorithm were applied to a vector representation of the message texts to learn models for discriminating different authors. We present results comparing the classification accuracy of the different approaches. The results show that stylistic models can be accurately learned to determine an author's identity.

General Terms: Mining text and semi-structured data, D...
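
For readers unfamiliar with Winnow, the following is a minimal sketch of one possible multiclass setup: a one-vs-rest arrangement of the classic Winnow rule (multiplicative promotion and demotion of weights on active features). It is not the authors' implementation. The abstract does not say which multiclass variants or which text features the paper uses, so the function-word list, the `featurize` helper, the `OneVsRestWinnow` class, and the parameter values below are all assumptions chosen for illustration.

```python
# Hypothetical feature set: presence of a few common function words.
# The paper's actual vector representation is not given in the abstract.
FUNCTION_WORDS = ["the", "of", "and", "to", "i", "you", "but", "very", "however", "just"]


def featurize(text):
    """Return the set of indices of function words that occur in the text."""
    tokens = set(text.lower().split())
    return {i for i, word in enumerate(FUNCTION_WORDS) if word in tokens}


class OneVsRestWinnow:
    """One classic Winnow learner per author, combined by highest score (assumed setup)."""

    def __init__(self, n_features, classes, alpha=2.0):
        self.alpha = alpha                 # promotion/demotion factor
        self.theta = n_features / 2.0      # firing threshold
        self.classes = list(classes)
        # All weights start at 1, as in the standard Winnow formulation.
        self.w = {c: [1.0] * n_features for c in self.classes}

    def score(self, c, active):
        """Sum of class c's weights over the active (present) features."""
        return sum(self.w[c][i] for i in active)

    def predict(self, active):
        """Pick the author whose learner gives the highest score."""
        return max(self.classes, key=lambda c: self.score(c, active))

    def fit(self, examples, epochs=10):
        """examples: list of (active_feature_set, author_label) pairs."""
        for _ in range(epochs):
            for active, label in examples:
                for c in self.classes:
                    fired = self.score(c, active) >= self.theta
                    is_target = (c == label)
                    if is_target and not fired:
                        # False negative: promote weights of active features.
                        for i in active:
                            self.w[c][i] *= self.alpha
                    elif fired and not is_target:
                        # False positive: demote weights of active features.
                        for i in active:
                            self.w[c][i] /= self.alpha


# Toy usage with two made-up authors and invented messages.
train = [
    (featurize("however the plan is very good"), "author_a"),
    (featurize("i just think you should go"), "author_b"),
]
model = OneVsRestWinnow(len(FUNCTION_WORDS), ["author_a", "author_b"])
model.fit(train)
print(model.predict(featurize("however you are very wrong")))
```

Updates happen only on mistakes and touch only the features present in the message, which is part of why Winnow-style learners are often used with sparse, high-dimensional text vectors.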
Shlomo Argamon, Marin Saric, Sterling Stuart Stein
Type Conference