Interpreting coefficients - both natural log (special case)

$$\ln(Y) = \beta_0 + \beta_1 \cdot \ln(X) + \epsilon, \quad \text{where } \epsilon \sim N(0, \sigma_\epsilon)$$

$$E[\ln(Y) \mid \ln(X)] = \beta_0 + \beta_1 \cdot \ln(X)$$

$$E[\ln(Y) \mid \ln(X) + 1] = \beta_0 + \beta_1 \cdot (\ln(X) + 1)$$

$$\beta_1 = E[\ln(Y) \mid \ln(X) + 1] - E[\ln(Y) \mid \ln(X)]$$

OR (when X and Y are both transformed using natural log): $\beta_1$ = approximate percent change in Y for each 1% change in X
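The percent-change reading can be checked with a short derivation. This is a sketch that assumes the log-log model holds and ignores the error term, so it describes a typical Y rather than any single observation; $X_{\text{new}}$ and $Y_{\text{new}}$ are labels introduced here for illustration only.

$$
\begin{aligned}
\ln(X_{\text{new}}) &= \ln(1.01 \cdot X) = \ln(X) + \ln(1.01) \approx \ln(X) + 0.01 \\
\ln(Y_{\text{new}}) - \ln(Y) &= \beta_1 \cdot \ln(1.01) \approx 0.01 \cdot \beta_1 \\
\frac{Y_{\text{new}}}{Y} &= 1.01^{\beta_1} \approx 1 + 0.01 \cdot \beta_1
\end{aligned}
$$

A 1% increase in X therefore goes with a change of roughly $\beta_1$ percent in Y. The approximation is closest when $0.01 \cdot \beta_1$ is small in absolute value.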
DataCamp
Inference for Linear Regression
Multicollinearity
Jo Hardin, Professor, Pomona College
Amount vs. coins - linear model

library(dplyr)  # provides the %>% pipe
library(broom)  # provides tidy()

lm(Amount ~ Coins, data = change) %>% tidy()
#          term estimate std.error statistic  p.value
# 1 (Intercept)   0.1449    0.0902      1.61 1.13e-01
# 2       Coins   0.0945    0.0063     14.99 6.01e-22
Amount vs. small coins - plot
Amount vs. small coins - linear model

lm(Amount ~ Small, data = change) %>% tidy()
#          term estimate std.error statistic  p.value
# 1 (Intercept)   0.4225    0.1244      3.40 1.22e-03
# 2       Small   0.0989    0.0118      8.38 1.10e-11
Amount vs. coins and small coins

$$\widehat{\text{Amount}} = -0.00554 + 0.25862 \cdot \text{Coins} - 0.21611 \cdot \text{Small Coins}$$

lm(Amount ~ Coins + Small, data = change) %>% tidy()
#          term estimate std.error statistic  p.value
# 1 (Intercept) -0.00554   0.02735    -0.202 8.40e-01
# 2       Coins  0.25862   0.00682    37.917 3.95e-43
# 3       Small -0.21611   0.00864   -25.021 4.17e-33
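A small simulation is one way to see how two strongly related explanatory variables can reverse a coefficient's sign. The sketch below does not use the `change` data. It builds a hypothetical data frame `sim` in which each large coin is assumed to be worth 0.25 and each small coin 0.04 on average; those values, the Poisson counts, the noise level, and the seed are all assumptions made for illustration. It uses base R's `coef()` rather than `tidy()` so that it runs without extra packages.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical data: NOT the `change` data set from the slides
n     <- 200
small <- rpois(n, lambda = 8)                  # number of small coins
large <- rpois(n, lambda = 2 + 0.3 * small)    # number of large coins, related to small
coins <- small + large                         # total number of coins

# Assumed average values: 0.04 per small coin, 0.25 per large coin, plus noise
amount <- 0.04 * small + 0.25 * large + rnorm(n, sd = 0.05)

sim <- data.frame(Amount = amount, Coins = coins, Small = small)

# Simple regression on Small alone, then the model with both predictors
fit_small <- lm(Amount ~ Small, data = sim)
fit_both  <- lm(Amount ~ Coins + Small, data = sim)

cor(sim$Coins, sim$Small)   # the two predictors are strongly correlated
coef(fit_small)["Small"]    # slope for Small when it is the only predictor
coef(fit_both)["Small"]     # slope for Small after adjusting for Coins
```

Under these assumptions Amount is roughly 0.25 * Coins - 0.21 * Small, so the Small coefficient is positive on its own but negative once Coins is in the model: with the total number of coins held fixed, each additional small coin takes the place of a more valuable one.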
Multiple linear regression
Jo Hardin, Professor, Pomona College
Price on bed and bath

lm(log(price) ~ log(bath) + bed, data = LAhomes) %>% tidy()
#          term estimate std.error statistic   p.value
# 1 (Intercept)   11.965    0.0384    311.67  0.00e+00
# 2   log(bath)    1.076    0.0465     23.14 2.38e-102
# 3         bed    0.189    0.0193      9.82  4.01e-22
Large model on price

lm(log(price) ~ log(sqft) + log(bath) + bed, data = LAhomes) %>% tidy()
#          term estimate std.error statistic   p.value
# 1 (Intercept)   1.5364    0.2894     5.310  1.25e-07
# 2   log(sqft)   1.6456    0.0454    36.215 6.27e-210
# 3   log(bath)   0.0165    0.0452     0.365  7.15e-01
# 4         bed  -0.1236    0.0167    -7.411  2.03e-13
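Because the response is log(price), the coefficients are easier to read after converting them to multiplicative changes in price. The sketch below copies the estimates from the two model outputs and does the arithmetic. The variable names are introduced here for illustration, and the interpretation assumes the fitted models are adequate and that the other variables in each model are held fixed.

```r
# Estimates copied from the two fitted models on the slides
bed_small_model <- 0.189     # bed in log(price) ~ log(bath) + bed
bed_large_model <- -0.1236   # bed in log(price) ~ log(sqft) + log(bath) + bed
log_sqft_coef   <- 1.6456    # log(sqft) in the larger model

# bed is not transformed: one extra bedroom multiplies price by exp(coefficient)
mult_bed_small <- exp(bed_small_model)
mult_bed_large <- exp(bed_large_model)

# sqft is logged: a 1% increase in sqft multiplies price by 1.01^coefficient
mult_sqft_1pct <- 1.01^log_sqft_coef

round(c(mult_bed_small, mult_bed_large, mult_sqft_1pct), 3)
```

Read this way, an extra bedroom goes with a price roughly 21% higher when only bathrooms are adjusted for, but roughly 12% lower once square footage is also in the model, and 1% more square footage goes with a price about 1.65% higher. The change in sign for bed is the same pattern seen with correlated explanatory variables: among homes of the same size, more bedrooms means smaller rooms.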
Summary
Jo Hardin, Professor, Pomona College
Linear regression as a model

- it estimates an underlying population model
- it might be linear or might need variable transformations
- all of the LINE conditions (linear relationship, independent observations, normally distributed errors, equal variance) should be checked
- other variable relationships should be carefully considered
Linear regression as an inferential technique

- hypothesis testing using a mathematical model (t-tests)
- hypothesis testing using randomization tests
- confidence intervals using a mathematical model
- confidence intervals using bootstrapping
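The four approaches listed above can be put side by side for a single slope. The sketch below is a minimal illustration in base R on simulated data. The data frame `dat`, the true slope of 0.5, the seed, and the 1000 resamples are all assumptions chosen for the example, and the percentile interval is only one of several bootstrap intervals; this is not necessarily the tooling used elsewhere in the course.

```r
set.seed(47)  # arbitrary seed, for reproducibility only

# Hypothetical data with a true slope of 0.5
n   <- 100
x   <- runif(n, 0, 10)
y   <- 1 + 0.5 * x + rnorm(n)
dat <- data.frame(x = x, y = y)

fit <- lm(y ~ x, data = dat)

# Mathematical model: t-test and t-based 95% confidence interval for the slope
t_row <- coef(summary(fit))["x", ]          # estimate, std. error, t value, p-value
ci_t  <- confint(fit, "x", level = 0.95)

# Bootstrap: resample rows with replacement, refit, take percentile interval
B <- 1000
boot_slopes <- replicate(B, {
  idx <- sample(n, replace = TRUE)
  unname(coef(lm(y ~ x, data = dat[idx, ]))["x"])
})
ci_boot <- quantile(boot_slopes, c(0.025, 0.975))

# Randomization test: shuffle y to break any link with x, refit, compare slopes
obs_slope   <- unname(coef(fit)["x"])
perm_slopes <- replicate(B, unname(coef(lm(sample(y) ~ x, data = dat))["x"]))
p_perm      <- mean(abs(perm_slopes) >= abs(obs_slope))

t_row
ci_t
ci_boot
p_perm
```

When the LINE conditions hold reasonably well, the t-based and bootstrap intervals should be similar, and the randomization p-value should lead to the same conclusion as the t-test. Large disagreements between them are a hint that the model conditions deserve a closer look.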