In this lecture, we give a quick review of supervised learning. In particular, we examine how binary classifiers work and how they can be trained within the loss minimization framework. We also cover support vector machines and logistic regression from this perspective.
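As a rough illustration of the loss minimization view (a minimal sketch, not the exact formulation used in the lecture slides): with labels y in {-1, +1} and a linear scorer f(x) = w.x + b, both classifiers minimize an average margin-based loss plus a regularizer. The logistic loss log(1 + exp(-y f(x))) yields logistic regression, while the hinge loss max(0, 1 - y f(x)) yields a linear SVM. The NumPy sketch below trains either one by (sub)gradient descent; the function name, hyperparameters, and toy data are illustrative assumptions.

```python
import numpy as np

def train_linear_classifier(X, y, loss="logistic", lam=0.1, lr=0.1, epochs=200):
    """Minimize (1/n) * sum_i loss(y_i * (w . x_i + b)) + (lam/2) * ||w||^2
    by full-batch (sub)gradient descent. Labels y must be in {-1, +1}.
    loss="logistic" gives logistic regression; loss="hinge" gives a
    (subgradient-trained) linear SVM. Hypothetical sketch, not lecture code."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)                 # m_i = y_i * f(x_i)
        if loss == "logistic":
            # derivative of log(1 + exp(-m)) w.r.t. m is -1 / (1 + exp(m))
            g = -1.0 / (1.0 + np.exp(margins))
        else:
            # subgradient of max(0, 1 - m): -1 where the margin is violated
            g = np.where(margins < 1, -1.0, 0.0)
        # chain rule: d loss / dw = g_i * y_i * x_i, averaged over samples
        grad_w = (X * (g * y)[:, None]).mean(axis=0) + lam * w
        grad_b = (g * y).mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy usage: two Gaussian blobs, labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(+1, 1, (50, 2))])
y = np.array([-1] * 50 + [+1] * 50)
w, b = train_linear_classifier(X, y, loss="hinge")
print("training accuracy:", (np.sign(X @ w + b) == y).mean())
```

Note that only the loss changes between the two models; the predictor sign(w.x + b) and the optimization loop stay the same, which is the point of the loss minimization framework.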
Lecture slides and readings
- Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of Statistical Learning, Chapter 4.
- Chris Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998.
- Chih-Jen Lin, Optimization, Support Vector Machines, and Machine Learning (slides).
Further reading
- Dan Roth, Course notes on online learning of linear functions.
- Grzegorz Chrupala and Nicolas Stroppa, Linear Models for Classification (slides).
- Hal Daumé III, A Course in Machine Learning, Chapter 8: The Perceptron.
- L. El Ghaoui, Optimization Models and Applications: Linear Binary Classification.
- Andrew Ng and Michael I. Jordan, On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes, NIPS 2002.
- Peter L. Bartlett, Michael I. Jordan, and Jon D. McAuliffe, Convexity, Classification, and Risk Bounds, Journal of the American Statistical Association, 2006.
- Léon Bottou and Olivier Bousquet, The Tradeoffs of Large Scale Learning, NIPS 2008.
- Léon Bottou and Chih-Jen Lin, Support Vector Machine Solvers, in Large Scale Kernel Machines, 2007.
Convex optimization resources
- Stephen Boyd and Lieven Vandenberghe, Convex Optimization (this book covers convex optimization in general and is a very useful resource for understanding the learning algorithms we encounter in this course).
- John Duchi's slides on Introduction to Convex Optimization for Machine Learning.