Making Logistic Regression a Core Data Mining Tool with TR-IRLS

Binary classification is a core data mining task. For large datasets or real-time applications, desirable classifiers are accurate, fast, and need no parameter tuning. We present a simple implementation of logistic regression that meets these requirements. A combination of regularization, truncated Newton methods, and iteratively re-weighted least squares makes it faster and more accurate than modern SVM implementations, and relatively insensitive to parameters. It is robust to linear dependencies and some scaling problems, making most data preprocessing unnecessary.
1 Motivation and Terminology
This article is motivated by the success of a fast, simple logistic regression (LR) algorithm in several high-dimensional data mining engagements, including life sciences data mining [10, 7], threat classification and temporal link analysis [16], collaborative filtering [11], and text processing [7]. The rise of support vector machines (SVMs) for binary classification has renewed interest i...
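The abstract names three ingredients: regularization, a truncated Newton method, and iteratively re-weighted least squares (IRLS). The following is a minimal illustrative sketch of how they can fit together, not the authors' implementation. It assumes a ridge (L2) penalty with weight `lam`, and it truncates each Newton step by capping the number of linear conjugate-gradient iterations (`cg_iters`). All function names, parameter names, and default values are hypothetical and chosen only for this example.

```python
import math

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(X, v):
    # Computes X v for a dense row-major matrix X.
    return [dot(row, v) for row in X]

def rmatvec(X, u):
    # Computes X^T u without forming the transpose.
    out = [0.0] * len(X[0])
    for row, ui in zip(X, u):
        for j, xij in enumerate(row):
            out[j] += xij * ui
    return out

def cg_solve(apply_A, b, x0, max_iter, tol):
    # Linear conjugate gradient for A x = b, stopped after at most
    # max_iter iterations (the "truncated" part of the sketch).
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, apply_A(x))]
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs) < tol:
            break
        Ap = apply_A(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def fit_logistic_irls(X, y, lam=1.0, outer_iters=20, cg_iters=30, cg_tol=1e-8):
    # Assumes lam > 0 so that X^T W X + lam * I is positive definite.
    beta = [0.0] * len(X[0])
    for _ in range(outer_iters):
        eta = matvec(X, beta)
        mu = [sigmoid(t) for t in eta]
        # IRLS weights, floored to avoid division by zero.
        w = [max(m * (1.0 - m), 1e-10) for m in mu]
        # Adjusted (working) response of the weighted least-squares subproblem.
        z = [e + (yi - m) / wi for e, yi, m, wi in zip(eta, y, mu, w)]
        b = rmatvec(X, [wi * zi for wi, zi in zip(w, z)])

        def apply_A(v, w=w):
            # Applies (X^T W X + lam * I) to v without forming the matrix.
            Xv = matvec(X, v)
            out = rmatvec(X, [wi * t for wi, t in zip(w, Xv)])
            return [o + lam * vi for o, vi in zip(out, v)]

        new_beta = cg_solve(apply_A, b, beta, cg_iters, cg_tol)
        change = max(abs(a - c) for a, c in zip(new_beta, beta))
        beta = new_beta
        if change < 1e-6:
            break
    return beta

def predict_proba(X, beta):
    return [sigmoid(t) for t in matvec(X, beta)]
```

As a usage example, `fit_logistic_irls([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]], [0, 0, 1, 1])` fits an intercept and one slope on a tiny toy dataset. The matrix `X^T W X` is never formed explicitly; the sketch only uses matrix-vector products, which is what makes this style of solver attractive for high-dimensional data.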
Paul Komarek, Andrew W. Moore
Added 24 Jun 2010
Updated 24 Jun 2010
Type Conference
Year 2005
Where ICDM
Authors Paul Komarek, Andrew W. Moore