KDD
2005
ACM

Determining an author's native language by mining a text for errors

In this paper, we show that stylistic text features can be exploited to determine an anonymous author's native language with high accuracy. Specifically, we first use automatic tools to ascertain frequencies of various stylistic idiosyncrasies in a text. These frequencies then serve as features for support vector machines that learn to classify texts according to author native language.

Categories and Subject Descriptors: I.2.6 [Artificial Intelligence]: Learning - Analogies, Concept learning, Connectionism and neural nets, Induction, Knowledge acquisition, Language acquisition, Parameter learning
General Terms: Algorithms, Measurement, Experimentation
Keywords: Text mining, author profiling
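
As a rough illustration of the pipeline the abstract describes (idiosyncrasy frequencies used as features for a support vector machine), the sketch below may help. It is not the authors' implementation: the three error patterns, the toy training sentences, the labels +1 and -1 (standing for two unnamed native-language groups), and all function names are assumptions made up for this example. The classifier is a small linear SVM trained by subgradient descent on the hinge loss, reduced to two classes for brevity.

```python
import re

# Hypothetical error patterns, invented for this sketch only; the paper's
# actual feature set is not reproduced here.
PATTERNS = [
    re.compile(r"\binformations\b"),             # plural of an uncountable noun
    re.compile(r"\bmore (?:better|easier)\b"),   # double comparative
    re.compile(r"\bdepend of\b"),                # unusual preposition choice
]


def features(text):
    """Return the frequency of each pattern per 100 tokens."""
    tokens = text.lower().split()
    n = max(len(tokens), 1)  # avoid division by zero on empty text
    normalized = " ".join(tokens)
    return [100.0 * len(p.findall(normalized)) / n for p in PATTERNS]


def score(w, b, x):
    """Signed decision value of the linear model for feature vector x."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Fit a linear SVM by subgradient descent on the regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * score(w, b, xi)
            # L2 regularization step: shrink the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step: only examples inside the margin update the model.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if score(w, b, x) >= 0.0 else -1


# Toy training data (assumed, not from the paper): +1 and -1 stand for two
# hypothetical native-language groups.
TRAIN = [
    ("i need more informations about the informations", 1),
    ("these informations are useful", 1),
    ("it depend of the weather", -1),
    ("results depend of many things", -1),
]

X = [features(text) for text, _ in TRAIN]
y = [label for _, label in TRAIN]
w, b = train_linear_svm(X, y)

print(predict(w, b, features("please send informations")))
print(predict(w, b, features("it depend of you")))
```

A real system would presumably use a much larger feature set, a multi-class setup, and an established SVM library rather than this hand-rolled trainer.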
Moshe Koppel, Jonathan Schler, Kfir Zigdon
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2005
Where KDD
Authors Moshe Koppel, Jonathan Schler, Kfir Zigdon