Regression Analysis (Fall 2022)

QTM 220 Section 1 with David A. Hirshberg

Description

This class is a modern introduction to regression analysis. We will cover linear regression and other widely used methods for fitting curves to data and the causal and statistical concepts we need to make meaningful and defensible claims based on them. We'll spend roughly equal time talking about mathematical concepts and practicing data analysis using the R programming language. Together, these will provide a foundation for future study in both methodology and substantive areas.

Background Knowledge

You'll need to differentiate multivariable functions, do some matrix arithmetic, think about orthogonality, interpret and calculate conditional and unconditional expected values, work with normal and asymptotically normal random variables, and write a little R code. If you've taken Linear Algebra, Multivariable Calculus, and QTM 150 and QTM 210 or similar classes, you should have all the background you need.

Meeting Times

Class will meet in White Hall 205 from 2:30-3:45 Mondays and Wednesdays and from 2:30-3:20 on Fridays. I will hold office hours weekly at a time to be determined.

Readings

We will read from Introduction to Statistical Learning (ISL) by James, Witten, Hastie, and Tibshirani in the first half of the semester and from Foundations of Agnostic Statistics (FAS) by Aronow and Miller in the second. These books are useful references, but our emphasis will differ from theirs, so I will be selective about what I assign. Nonetheless, some assigned passages will cover material we do not discuss in class; they are included for the sake of continuity, and you will not be tested on them. Exam content will be drawn from lectures, labs, and homework.

Tentative Schedule

Week 1
Homework Working with matrices, vectors, and functions.
W Aug 24 Lecture Why we fit curves and what we do with our fits.
F Aug 26 Discussion Areas we're interested in for the final project.
Week 2
Reading ISL 3.1.1
M Aug 29 Lecture Least squares curve fits and predictions based on them. Properties of residuals. Curve shapes.
W Aug 31 Lab Fitting curves, making predictions, and summarizing them.
F Sept 2 Discussion Specific questions and data availability.
Week 3
Reading ISL 7.1-7.4
Homework Evaluating fit.
M Sept 5 No Class Labor Day.
W Sept 7 Lecture More curve shapes. Problems with polynomial fits. Splines. Transformed outcomes.
F Sept 9 Lab Fitting curves better.
Week 4
Reading ISL 7.6
M Sept 12 Lecture Matching data to questions. Overlap. Weighted least squares and residuals.
W Sept 14 Lab Using weighted least squares to target specific questions.
F Sept 16 Discussion Data relevance.
Week 5
Reading ISL 3.2 and 3.3. Skip 3.2.2.
Homework Using real data.
M Sept 19 Lecture Multidimensional curve fitting. Main effects and interactions. Additive vs. isotropic models.
W Sept 21 Lab Working with temporal and spatial data.
F Sept 23 Group Work Fitting curves to answer your questions.
Week 6
M Sept 26 Lecture Sample splitting and least squares model selection.
W Sept 28 Lab Model selection.
F Sept 30 Group Work Testing your approach using simulated data.
Week 7
M Oct 3 Review
W Oct 5 Midterm Exam
F Oct 7 Exam Solution
Week 8
Reading FAS up to 2.2.3.
Homework Working with random variables and random vectors.
M Oct 10 No Class Fall Break.
W Oct 12 Lecture Probability. Expectations, conditioning, sampling, and gaussianity.
F Oct 14 Lab Visualizing probability distributions.
Week 9
Reading FAS 2.2.4 and 3.3.
M Oct 17 Lecture Sampling from populations. The infinite population approximation. Statistical and modeling error.
W Oct 19 Lab Regression in populations. Comparing true, pseudo-true, and estimated curves.
F Oct 21 Group Work Thinking through the potential impacts of misspecification.
Week 10
Reading FAS 7.1.1, 7.1.3-7.1.6, 7.2.2, and 7.2.6.
Homework The classical perspective.
M Oct 24 Lecture Causal inference. Potential outcomes, identification, and inverse probability weighting.
W Oct 26 Lab Inverse probability weighted least squares.
F Oct 28 Group Work Formulating and answering causal questions.
Week 11
Reading Continue last week's.
M Oct 31 Lecture Conditionally randomized experiments vs. observational studies. The linear probability model.
W Nov 2 Lab Using estimated inverse probability weights.
F Nov 4 Group Work Thinking through problems caused by confounding.
Week 12
Reading FAS 3.4.1-3.4.2, 3.4.4, and 4. Feel free to skip 4.3.6.
Homework Hypothesis testing.
M Nov 7 Lecture Least squares with gaussian errors as an approximation. The delta method.
W Nov 9 Lab Confidence intervals and coverage.
F Nov 11 Group Work Making statistical claims.
Week 13
Reading Continue last week's.
M Nov 14 Lecture Least squares asymptotics.
W Nov 16 Lab Accuracy of asymptotic approximation: coverage with gaussian errors vs. without.
F Nov 18 Group Work Reporting statistical claims.
Week 14
M Nov 21 Review
W Nov 23 No Class Thanksgiving.
F Nov 25 No Class Thanksgiving.
Week 15
Reading None.
M Nov 28 Lecture Logit and log-linear models. Nonlinear least squares asymptotics.
W Nov 30 Lab Comparing nonlinear regression to linear regression on transformed outcomes.
F Dec 2 Group Work Appraising statistical claims. We'll have traded reports from our reporting exercise.
Week 16
M Dec 5 Final Exam
Exam Slot
TBD Project Presentations

Assignments and Evaluation

Problem sets will be assigned on Mondays, roughly every other week, and are due by midnight on the Monday two weeks later. Collaboration is encouraged. I prefer that each student write and turn in solutions in their own words, and I think it is often best to do this writing separately, with collaboration limited to discussing problems, sketching solutions on a whiteboard, etc. I will post solutions to homework problems promptly. Review them. So that the solutions aren't delayed, I won't grant extensions on homework.

There will also be a final project: a data analysis project that should answer a substantive question. You'll work in small groups. Early in the semester, we'll use most of our Friday meetings to come up with a few good topics to choose from and form groups to work on them. Later, we'll use Fridays to work on that question in those groups. Each group will be expected to turn in a report and give a brief talk during the final exam slot.

Final grades will be based on the midterm and final exams (30% each), final project (30%), and completion of the homework and labs (10%).

Policies

Accessibility and Accommodations
As the instructor of this course I endeavor to provide an inclusive learning environment. I want every student to succeed. The Department of Accessibility Services (DAS) works with students who have disabilities to provide reasonable accommodations. It is your responsibility to request accommodations. In order to receive consideration for reasonable accommodations, you must register with the DAS. Accommodations cannot be retroactively applied, so you need to contact DAS as early as possible and contact me as early as possible in the semester to discuss the plan for implementation of your accommodations. For additional information about accessibility and accommodations, please contact the Department of Accessibility Services at (404) 727-9877 or accessibility@emory.edu.

Attendance
Class attendance is not mandatory and will not affect your grade. Schedule conflicts and illness happen. There is no need to explain your absences or inform me of them in advance of class meetings. And please do not come to class sick. I will record the lectures and post them on the class Canvas site as soon after class as possible. In-class exercises are harder to replicate at home, but I encourage you to work independently on any lab you miss and check in with your group if you miss a group exercise.

Writing Center
Tutors in the Emory Writing Center and the ESL Program are available to support Emory College students as they work on any type of writing assignment, at any stage of the composing process. Tutors can assist with a range of projects, from traditional papers and presentations to websites and other multimedia projects. Writing Center and ESL tutors take a similar approach as they work with students on concerns including idea development, structure, use of sources, grammar, and word choice. They do not proofread for students. Instead, they discuss strategies and resources students can use as they write, revise, and edit their own work. Students who are non-native speakers of English are welcome to visit either Writing Center tutors or ESL tutors. All other students in the college should see Writing Center tutors. Learn more, view hours, and make appointments by visiting the websites of the ESL Program and the Writing Center. Please review the Writing Center’s tutoring policies before your visit.

Honor Council
The Honor Code is in effect throughout the semester. By taking this course, you affirm that it is a violation of the code to cheat on exams, to plagiarize, to deviate from the teacher's instructions about collaboration on work that is submitted for grades, to give false information to a faculty member, and to undertake any other form of academic misconduct. You agree that the instructor is entitled to move you to another seat during examinations, without explanation. You also affirm that if you witness others violating the code you have a duty to report them to the Honor Council.