When we teach statistics, we often act as if simple models were right. For example, we might find the line that best fits the data, then act as if that line were a useful description of reality. With the right blend of intuition and luck, this can work --- we can get by with a model that is wrong but not too wrong --- but it's often helpful to have a more nuanced perspective. This class is an introduction to the theory and practice of doing statistics without simplistic modeling assumptions. In short, we will talk about how to answer questions about the world, like whether a treatment helps patients or a policy has the intended effect, by fitting flexible models to data.
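To make "wrong but not too wrong" concrete, here's a minimal sketch (in Python with NumPy; hypothetical data, not course code): we fit a line by least squares to data whose true mean function is mildly nonlinear, then measure how far the line is from the truth.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 1, n)
# The truth is mildly nonlinear, so a line is wrong --- but not too wrong.
y = np.sqrt(x) + rng.normal(scale=0.1, size=n)

# Find the line that best fits the data, in the least-squares sense.
slope, intercept = np.polyfit(x, y, deg=1)
line = intercept + slope * x

# How wrong is the line as a description of the truth?
rmse = np.sqrt(np.mean((line - np.sqrt(x)) ** 2))
print(f"slope={slope:.2f}, intercept={intercept:.2f}, rmse vs truth={rmse:.3f}")
```

The fitted line tracks the square-root curve closely on this interval; acting as if it were the truth costs little here, which is the kind of luck the paragraph above describes.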
Along the way, we'll practice using and generalizing mathematical tools you've probably seen in calculus, linear algebra, and probability classes. If you're considering graduate school in a discipline that involves quantitative work, this should be useful preparation.
At the beginning of the semester, we'll talk about a few nonparametric regression methods and the criteria we use to evaluate such methods generally. Then we'll shift our focus to theoretical tools that help us understand how they behave. For the majority of the semester, we'll focus on regression with one covariate. This is atypical in modern data analysis, but it makes things easier to understand: we can easily visualize our data and the curves we're fitting to it. But the concepts we'll use are relevant to higher-dimensional problems, and the understanding we develop this way will pay off late in the semester, when we'll find ourselves prepared for a sophisticated discussion of the essential challenge of working with big data: the curse of dimensionality.
We will cover a set of supervised machine learning tasks, including regression, classification, and model selection, using ℓ1-regularized models, shape-constrained models, kernel methods, and more. And we will talk briefly about the modern way to use these methods to answer questions about the world. If you have heard of augmented inverse propensity weighting (AIPW) or double machine learning (DML), that's what I'm talking about. But the class is not meant to be a broad survey of methods. Instead, our focus will be on mathematical concepts that help us understand what we can and can't trust these methods (and others) to do. Translating this into practice, we'll discuss what we should try to estimate, how we should do it, and what to expect when we do. Causal inference applications will be emphasized. We'll use drawing exercises, computer visualization, and computer simulation to get a feel for the material.
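As a preview of one shape-constrained fit: in Weeks 1 and 2 we'll implement monotone regression using the convex solver CVXR. Here is a language-agnostic sketch in plain Python (an assumption of this sketch, not course code) using the classic pool-adjacent-violators algorithm, which solves the same least-squares problem under a monotonicity constraint.

```python
import numpy as np

def monotone_least_squares(y):
    """Project y onto the set of nondecreasing vectors (isotonic regression).

    Solves min_f sum_i (y_i - f_i)^2 subject to f_1 <= ... <= f_n
    via the pool-adjacent-violators algorithm.
    """
    blocks = []  # each block is [sum of y-values, count]
    for value in y:
        blocks.append([value, 1])
        # Merge adjacent blocks whenever their means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)  # each block fits its mean
    return np.array(fit)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100)
y = x**2 + rng.normal(scale=0.1, size=x.size)  # noisy increasing truth
fit = monotone_least_squares(y)
assert np.all(np.diff(fit) >= 0)  # the fit is nondecreasing
```

Because the monotone vectors form a convex set containing the truth here, the projection is guaranteed to be at least as close to the true curve as the raw data is.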
By the end of this course, students should be able to write code that fits curves to data via least squares with shape-constrained and smooth regression models; predict the rate of convergence of least-squares curve fits, in general terms using localized Gaussian width and in specific models using Fourier analysis or chaining; and use this knowledge to make appropriate modeling choices themselves and evaluate those of others. The terminology used in the machine learning literature is varied enough that I do not expect the class to be sufficient preparation to read recent papers on their own, but students' familiarity with the core concepts should be sufficient to understand the essential ideas once translated into appropriate terms. That is, students should be able to communicate with experts about what is going on in the field.
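To give a first concrete sense of what "rate of convergence" means, here is a small simulation (a Python/NumPy sketch under assumed toy data, not course code): the average squared distance between a least-squares line fit and the true line shrinks roughly like 1/n as the sample size grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def mse_of_line_fit(n, reps=500, sigma=1.0):
    """Average in-sample MSE of a least-squares line fit when y = x + noise."""
    errors = []
    for _ in range(reps):
        x = rng.uniform(0, 1, n)
        y = x + rng.normal(scale=sigma, size=n)
        slope, intercept = np.polyfit(x, y, deg=1)
        fit = intercept + slope * x
        errors.append(np.mean((fit - x) ** 2))  # squared distance to the truth
    return np.mean(errors)

mses = {n: mse_of_line_fit(n) for n in [50, 200, 800]}
for n, m in mses.items():
    print(n, round(m, 4))
```

Quadrupling n should cut the error by roughly a factor of four; much of the course is about predicting such rates, without simulation, for far more flexible models.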
I do not expect students to understand everything we discuss perfectly. That is not how people learn this material. We arrive somewhere at the end of a class, and we refine our understanding as we use it: talking, reading, writing, coding, and so on. And we're never done: professors make incorrect claims about this material and get corrected, sometimes during presentations on their own research, all the time. I encourage you to check this class out if you want to develop your understanding of the core ideas of modern machine learning and statistics, even if you're unsure how far you'll be able to develop it by the semester's end.
You'll need a working understanding of the basic concepts of probability, linear algebra, and calculus. For probability, you'll need to be comfortable talking about events and their probabilities; random variables and their expected values and variances; independence of events and random variables; and the multivariate normal distribution. For linear algebra, likewise for bases, inner products, orthogonality, eigenvalues, and eigenvectors. And for calculus, we'll work with partial derivatives and calculate an easy integral here and there.
If you've taken QTM 220 and its prerequisites, you should be well-prepared. I'll review most of this as it comes up, so if there are a few gaps or some unfamiliar terminology, it won't be a big deal. If you'd like to do some reading to prepare in advance, take a look at Chapters 1-4 of Larry Wasserman's All of Statistics, a paper copy of which is available from the library. It covers more than you will need on probability. Chapter 6 and Section 8.1 of Nicholson's Linear Algebra with Applications cover enough linear algebra. If you've forgotten the details, or never learned them, that's fine: you will not need to calculate a difficult integral, diagonalize a matrix, or know what the Poisson distribution is.
Class will meet on Mondays and Wednesdays from 4:00 to 5:15 in PAIS 235. I will hold office hours weekly at a time to be determined.
Aside from lecture slides and the solutions to in-class and homework exercises, you won't need to do any reading to follow what's going on in class. If you like books, Vershynin's High Dimensional Probability is a great read and covers many of the theoretical concepts we'll be talking about. It is, however, intended for graduate students. This is a good book to look at if you've enjoyed the class, feel confident about the material we've covered, and are looking for more breadth or depth.
Short problem sets will be assigned almost every week as homework. These will include problems meant to prepare you for the next week's in-class activities, so they'll be due the night before the next week's first class meeting. So that the solutions aren't delayed, I won't grant extensions. There will not be an exam or a large-scale project.
Collaboration on homework is encouraged. I prefer that each student write and turn in solutions in their own words, and I think it is often best that this writing be done separately, with collaboration limited to discussing problems, sketching solutions on a whiteboard, etc. This will help you and me understand where you're at in terms of your proficiency with the material. These problem sets are not a test. I will work on them with you during my office hours if you want. That said, it's not necessary to come to office hours just because there's a problem you can't get: I encourage you to try all the problems, but it's fine to omit a problem or two from what you turn in. I'll post complete solutions soon after each assignment is due. Review them.
| Date | Format | Topic |
| --- | --- | --- |
| Week 1 | | |
| W Jan 17 | Lecture | Intro |
| Homework | | Intro to Convex Programming using CVXR |
| Week 2 | | |
| M Jan 22 | Lab | Implementing Monotone Regression, Day 1 |
| W Jan 24 | Lab | Implementing Monotone Regression, Day 2 |
| Homework | | Vector Spaces |
| Week 3 | | |
| M Jan 29 | Lecture | Bounded Variation Regression |
| W Jan 31 | Lab | Implementing Bounded Variation Regression |
| Homework | | Lipschitz Regression |
| Week 4 | | |
| M Feb 5 | Lab | Rates and Modes of Convergence |
| W Feb 7 | Lecture | Treatment Effects and the R-Learner |
| Homework | | Convexity and Convex Regression |
| Week 5 | | |
| M Feb 12 | Lab | The Parametric R-Learner |
| W Feb 14 | Lab | The Nonparametric R-Learner |
| Homework | | Reflection |
| Week 6 | | |
| M Feb 19 | Discussion | Review |
| W Feb 21 | Lecture | Least Squares in Finite Models, i.e., Model Selection |
| Homework | | Subgaussianity and Maximal Inequalities |
| Week 7 | | |
| M Feb 26 | Lab | Understanding Model Selection |
| W Feb 28 | Lecture | Least Squares in Infinite Models, i.e., Regression |
| Homework | | The Efron-Stein Inequality |
| Week 8 | | |
| M Mar 4 | Lab | Understanding Gaussian Width. We'll draw. |
| W Mar 6 | Lecture | Least Squares and Non-Gaussian Noise |
| Homework | | The Gaussian Width of Simple Models |
| Week 9 | | |
| M Mar 11 | No Class | Spring Break |
| W Mar 13 | No Class | Spring Break |
| Week 10 | | |
| M Mar 18 | Lecture | Least Squares and Misspecification |
| W Mar 20 | Lecture | Least Squares and Population MSE |
| Homework | | Reflection |
| Week 11 | | |
| M Mar 25 | Discussion | Review |
| W Mar 27 | Lab | The Discrete Sobolev Model |
| Homework | | The Periodic Discrete Sobolev Model |
| Week 12 | | |
| M Apr 1 | Lecture | The Periodic Sobolev Model, Fourier Series, and Gaussian Width |
| W Apr 3 | Lab | Implementing Sobolev Regression using Fourier Series Approximations |
| Homework | | Interpreting Polynomial Regression using Sieves |
| Week 13 | | |
| M Apr 8 | Lecture | Multivariate Sobolev Models and the Curse of Dimensionality |
| W Apr 10 | Lab | Comparing Multivariate Sobolev Models |
| Homework | | Image Denoising |
| Week 14 | | |
| M Apr 15 | Lecture | Bounding Gaussian Width using Covering Numbers |
| W Apr 17 | Lecture | Bounding Gaussian Width via Chaining |
| Homework | | Covering Numbers for Monotone and BV Regression Models |
| Week 15 | | |
| M Apr 22 | TBD | |
| W Apr 24 | TBD | |
| Homework | | Reflection |
| Week 16 | | |
| M Apr 29 | | Review |