- Section 001, Mon/Wed 1:30-2:50 PM in F55 JMHH
- Section 002, Mon/Wed 3:00-4:20 PM in F55 JMHH
Email: ebfox@wharton.upenn.edu
Office: 466 JMHH
Office hours: Thurs 10:00-11:00 AM
Teaching Assistants: Yang Jiang (yajiang@wharton) and Ville Satopaa (satopaa@wharton)
Office: 434 JMHH and 431.3 JMHH
Office hours: Mon 4:30-5:30 PM (4th fl JMHH), Tues 4:30-5:30 PM (F45 Huntsman, except November 15, when we have F65 Huntsman) and 8:00-9:00 PM (G94 JMHH)
Recitations: Thursdays 7:00-8:00 in 345 JMHH
Course Overview
This course offers an advanced undergraduate level exploration of statistical techniques for data analysis, with an emphasis on developing computational tools and an understanding of when and how to use them. The latter will require a level of mathematical maturity as we examine the theoretical underpinnings of the explored methods. Interpretation of the results and analysis of assumptions is a key part of the course. As such, the course is appropriate for mathematically inclined students who wish to learn hands-on computational techniques for data analysis.
Topics include (1) collection, summary and display of data, (2) estimation, hypothesis testing, and confidence statements, and (3) simple and multiple linear regression. If time permits, we will also discuss variable selection and logistic regression. Students will experiment with these ideas on data examples using statistical software.
The official prerequisite is Statistics 430. The effective prerequisite is fluency with basic quantitative probabilistic reasoning and analysis (e.g., probability distributions and densities; jointly distributed random variables; conditional probability; independence, correlation, and covariance; normal and binomial distributions), together with the kind of mathematical maturity that often comes from taking at least one higher level undergraduate subject that has a significant mathematical component. Students are not expected to have knowledge of the statistical computing language R, though prior programming experience will be helpful.
Textbooks
- [Required:] Statistics and Data Analysis: from Elementary to Intermediate, by A. C. Tamhane and D. D. Dunlop, Prentice Hall, 2000.
- [Highly Recommended:] The Statistical Sleuth: A Course in Methods of Data Analysis, by F. Ramsey and D. Schafer, Duxbury Press, 2002.
Grading
The final grade in the course is based upon our best assessment of your understanding of the material during the semester. Roughly, the weights used in grade assignment will be:
- Homework assignments: 20% (with the lowest score dropped)
- Midterm exam: 30%
- Final exam: 50%
However, as always, other factors such as contributions to the lecture discussion and other interactions can make a significant difference in the final grade.
Statistical Computing Software
The statistical computing software R (version 2.10.0 or higher) will be used in the course. It is free and can be downloaded from the R-project website: R-Project
Exams
- One in-class midterm exam: Wednesday, October 26. Location:
Section 001
Last name A - L, Room 240
Last name M - Z, Room 245
Section 002
Last name A - L, Room F55
Last name M - Z, Room F94
Coverage: Chapters 4-8, not multiple choice
- Final exam schedule:
1. Date + Time - Friday, December 16th from 6-8pm
2. Location - (go to registered section)
Sec 001 in F85 JMHH
Sec 002 in F95 JMHH
3. Cheat sheet - two 8.5x11" sheets of paper, front and back, hand-written, not photocopied
4. Calculator - scientific (non-graphing/programmable) calculators only
5. Coverage - all lectures, but more emphasis on 2nd half of material
Final Practice Problems Final Practice Problem - Solutions
- Both exams will be closed book. However, you will be allowed to bring a certain number of 8.5 x 11-inch sheets of hand-written notes. Details will be provided two weeks prior to each exam.
Cheat Sheet: One 8.5" x 11" sheet, front and back, of hand-written notes (not photocopied)
Calculator: scientific calculator only, no graphing calculators or other programmable devices.
Review:
Recitation, Thurs. Oct. 20
Class, Mon. Oct. 24
Office hours: extended on Tues., Oct. 25th: 4:30-5:30pm (4th floor JMHH) and 7-9pm (F94 Huntsman).
Midterm Solutions
Homework
1. HW1 HW1 Data HW1 R Template HW1 Solutions
2. HW2 HW2 R Template HW2 Solutions
3. HW3 HW3 R Template HW3 Solutions
4. HW4 HW4 Data HW4 R Template HW4 Solutions
5. HW5 HW5 Data: corneal HW5 Data: rainfall HW5 R Template HW5 Solutions
6. HW6 HW6 R Template HW6 Data: oldfaithful HW6 Data: usimr HW6 Solutions
7. HW7 HW7 R Template HW7 Data: anscombe HW7 Data: cars HW7 Data: Chernobyl HW7 Data: diamonds HW7 Solutions
8. HW8 HW8 Data: GPA HW8 Data: PIQ HW8 Data: beam HW8 Solutions
9. HW9 HW9 Data: Car89 HW9 Data: FedEx HW9 Data: HousePrices HW9 Data: lifeexp HW9 Data: Phila HW9 Data: stocks HW9 Data: UN2 HW9 Solutions
10. HW10 HW10 Data: crimes HW10 Data: Orings HW10 Data: highway HW10 Data: lifeexp HW10 Solutions
Recitations
1. Brief R Tutorial
2. Notes #2
3. Notes #3
4. Notes #4
5. Notes #5
6. Midterm Review
7. Notes #7
8. Notes #8
9. Notes #9 Notes #9 R codes