----------------------------------------------------------------
Lecture 5, 2018/09/28:

RECAP:
- Make the first and second order assumptions for a linear model with
  response vector y (nx1) and regressor matrix X (nx(p+1)).  Let R be an
  orthogonal nxn matrix and let z = R y.  What kind of model do you get
  for the vector z as the response?  What are the vectors of OLS
  coefficient estimates, fitted values, residuals?
  (Numeric check: first sketch below.)
- Geometrically, what is the meaning of "degrees of freedom"?
- Statistically, what do "degrees of freedom" tell you about total variances?
- If you add one more regressor to your regression, what happens to the
  variability (total variance) of the vectors of fitted values and residuals?
- Do degrees of freedom depend on whether you added a good or a bad regressor?
- Is it true that E[RMSE] = sigma?  Yes / no?  (Simulation: second sketch below.)
  . Derive the proper unbiasedness property.
  . Is it desirable?  If yes, why?  If no, why not?
  . In the derivation of this property, what assumptions did you use?
- Recall the 1st and 2nd order properties of the OLS estimate.
- How can you see that the OLS estimate gets more precise with more data points?
- What happens to the self-influence value H_ii as n --> Inf?
  (Third sketch below.)

ROADMAP:
- Regression adjustment and its applications
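
A numeric check of the rotation question above; a minimal R sketch, assuming
simulated data and a QR-based construction of the orthogonal matrix (both
illustration choices, not from the lecture):

  set.seed(1)
  n <- 20; p <- 2
  X <- cbind(1, matrix(rnorm(n * p), n, p))   # regressor matrix, n x (p+1)
  y <- X %*% c(1, 2, -1) + rnorm(n)           # response under the linear model
  R <- qr.Q(qr(matrix(rnorm(n * n), n, n)))   # an orthogonal n x n matrix
  z <- R %*% y                                # rotated response
  ## E[z] = (R X) beta and V[z] = sigma^2 I (R R' = I preserves the 2nd order
  ## assumption), so the model for z has regressor matrix R X:
  b.y  <- qr.coef(qr(X),       y)             # betahat from (X, y)
  b.z  <- qr.coef(qr(R %*% X), z)             # betahat from (R X, z)
  yhat <- X %*% b.y
  zhat <- (R %*% X) %*% b.z
  max(abs(b.y - b.z))                         # ~ 0: identical OLS coefficients
  max(abs(zhat - R %*% yhat))                 # ~ 0: fitted values rotate along
  max(abs((z - zhat) - R %*% (y - yhat)))     # ~ 0: so do the residuals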
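
A simulation sketch for the E[RMSE] question, assuming normal errors (the
Jensen step needs only the 2nd order assumptions): s^2 = RSS/(n-p-1) is
unbiased for sigma^2, but its square root is biased low:

  set.seed(2)
  n <- 20; p <- 2; sigma <- 1
  X <- cbind(1, matrix(rnorm(n * p), n, p))
  s2 <- replicate(10000, {
    y <- sigma * rnorm(n)                     # beta = 0 WLOG: residuals are
    sum(qr.resid(qr(X), y)^2) / (n - p - 1)   #   (I-H)e, free of beta
  })
  mean(s2)         # ~ 1.00: E[s^2] = sigma^2, the proper unbiasedness property
  mean(sqrt(s2))   # < 1   : E[RMSE] < sigma, by Jensen E[s] <= sqrt(E[s^2])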
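
And a sketch for the self-influence question, with i.i.d. Gaussian regressors
assumed as a stand-in for "long-run" behavior of X: since trace(H) = p+1, the
H_ii average exactly (p+1)/n, which vanishes as n --> Inf:

  set.seed(3)
  p <- 2
  for (n in c(20, 200, 2000)) {
    X <- cbind(1, matrix(rnorm(n * p), n, p))
    h <- rowSums(qr.Q(qr(X))^2)               # hat diagonals H_ii via thin Q
    cat("n =", n, ": mean(H_ii) =", round(mean(h), 4),
        ", max(H_ii) =", round(max(h), 4), "\n")
  }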
----------------------------------------------------------------
Lecture 4, 2018/09/21:

RECAP:
- Quiz 1 review
- What is the range of the hat diagonal elements H_ii?
- Self-influence?
- Budget consideration of V[r_i] vs V[yhat_i]?
- Definition of trace?
- Trace of H?
- Total variance?
- Budget considerations for total variance of r vs yhat?

ROADMAP:
- Estimating sigma^2
- Long-run properties of the X matrix
- Interpretation of the hat matrix diagonal as a measure of outlyingness
- Regression adjustment
- Application 1 of adjustment: centering
- Application 2 of adjustment: reduction of multiple to simple regression

----------------------------------------------------------------
Lecture 3, 2018/09/14:

ORG:
- Next time (9/21) there will be a short quiz at the beginning of class.

RECAP:
* OLS Estimates: betahat, yhat, residuals ...
* Linear model assumptions, 1st & 2nd order: ...
* Vector random variables / random vectors, 1st & 2nd order properties
  - Definitions:
    . 1st order moments: ...
    . 2nd order moments: ...
  - What kind of averaging is going on in E[...] and V[...] in regression? ...
  - 1st order properties under affine operations: ...
  - 2nd order properties under affine operations: ...
* 1st & 2nd order properties of betahat: ...
...

----------------------------------------------------------------
Lecture 2, 2018/09/07:

ORG:
- This class has just one purpose...

RECAP: REMEDIAL R
- Basic atomic data types and their purposes:
  .
  .
  .
  .
- Missing value types and their algebra:
  .
  .
  .
  .
- Composite data types and their purposes:
  .
  .
  .
  .
- Types of indexed vector/matrix/array/dataframe access:
  .
  .
  .
  .
- Concepts:
  . High vs low level languages: characteristics and examples
      Low:
      High:
  . Coercion: ...
    Examples: What is the sum and the mean of a logical vector?
      x <- runif(100) < 0.25
      sum(x)
      mean(x)
  . Hash tables: What are they and where are they in R? ...
  . Recycling: What is it?
      cbind(1:10, 1:2)
      cbind(1:10, 1:3)                  # What is different?
      paste(c("A","B"), letters, sep="")
      matrix(c("+","-"), nrow=5, ncol=5)
- Loops: good and bad uses
- Regular expressions: what are they?
- Plotting: Invest in fine-tuning plots for your papers.
  Don't waste space (base-R plots have wasteful default margins).
  Don't allow plot content to bleed into the border (base-R plots do bleed).
  (See the par() sketch at the end of this entry.)

BABY INFERENCE:
- What is the meaning of "variability" in estimates of a mean?  median?
  standard deviation?
- What is the meaning of probability when applied to estimates?
- What is the difference between standard deviation and standard error?
  (Simulation sketch at the end of this entry.)
- What makes mean and max so different as estimates?
  (Also in the sketch at the end of this entry.)
- What is the relation between retention intervals of statistical tests (RIs)
  and confidence intervals (CIs)?

ROADMAP:
- Finish baby inference
- Next: Basics of linear models (half-hearted)
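
Apropos the plotting advice above, a minimal par() sketch for trimming base
R's wasteful default margins (the specific values are suggestions, not from
the lecture):

  par(mar = c(3.6, 3.6, 0.6, 0.6),   # trim the default c(5, 4, 4, 2) + 0.1
      mgp = c(2.2, 0.7, 0))          # move axis titles/labels in accordingly
  plot(runif(50), runif(50), xlab = "x", ylab = "y")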
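
And apropos baby inference, a toy simulation (setup mine) contrasting the
standard deviation of observations with the standard error of an estimate,
plus the mean-vs-max contrast:

  set.seed(4)
  n <- 100
  x     <- rnorm(n)                           # one sample
  xbars <- replicate(10000, mean(rnorm(n)))   # means of many fresh samples
  sd(x)             # ~ 1  : SD, the spread of individual observations
  sd(xbars)         # ~ 0.1: SE, the spread of the estimate itself
  sd(x) / sqrt(n)   # ~ 0.1: the usual SE estimate from a single sample
  ## The max behaves very differently: its sampling distribution is skewed
  ## and does not shrink at the 1/sqrt(n) rate.
  maxes <- replicate(10000, max(runif(n)))
  hist(maxes)       # piles up just below 1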