ECON1320 Semester Two FINAL Summary

Table of Contents
Lecture Five – Multiple Linear Regression A

Lecture Five – Multiple Linear Regression A

The multiple linear regression (MLR) technique uses more than one independent variable to explain the variation in the dependent variable of interest.

Population model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
Estimated equation: $\hat{Y}_i = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$

• $\beta_j$ = slope coefficient of $X_j$, holding the other variables constant.
• When interpreting one of the slopes in MLR, we should take into account the effect of the other variables, i.e. $\beta_1$ represents the change in the mean of Y per unit change in $X_1$, taking into account the effect of $X_2$, $X_3$, $X_4$, etc.
• Hence each $\beta_j$ is called a net/partial regression coefficient.
• Note: We need to take care when interpreting the Y intercept outside the range of the data (extrapolation).
• To determine whether there is a significant relationship between the dependent variable and the entire set of independent variables, we use: a) the overall F test, and b) R² and adjusted R².
a) Overall F test
H0: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$ (i.e. none of the independent variables is significantly related to Y)
H1: At least one $\beta_j \neq 0$ (i.e. at least one of the independent variables is significantly related to Y)
Reject H0 if $F_{calc} > F_{crit} = F_{\alpha,(k,\,n-k-1)}$ (Significance F in the ANOVA output is the p-value).
$F_{calc} = \frac{MSR}{MSE} = \frac{SSR/k}{SSE/(n-k-1)}$ (on formula sheet)
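
The F statistic above can be checked by hand from the ANOVA sums of squares. The following is a minimal sketch only: the function name `overall_f` and the figures (SSR = 300, SSE = 100, n = 23, k = 2) are hypothetical, and the critical value 3.49 is assumed to have been read from a standard F table for α = 0.05 with (2, 20) degrees of freedom.

```python
def overall_f(ssr: float, sse: float, n: int, k: int) -> float:
    """Overall F statistic: F = MSR / MSE = (SSR/k) / (SSE/(n-k-1))."""
    if k <= 0 or n - k - 1 <= 0:
        raise ValueError("need k >= 1 predictors and n > k + 1 observations")
    msr = ssr / k              # mean square regression
    mse = sse / (n - k - 1)    # mean square error (residual)
    return msr / mse


# Hypothetical ANOVA figures: n = 23 observations, k = 2 independent variables.
f_calc = overall_f(ssr=300.0, sse=100.0, n=23, k=2)

# Assumed table value: F critical for alpha = 0.05, df = (2, 20).
f_crit = 3.49

print(f"F_calc = {f_calc:.2f}")
print("Reject H0" if f_calc > f_crit else "Do not reject H0")
```

With these made-up numbers MSR = 150 and MSE = 5, so F_calc = 30, which exceeds the critical value and H0 would be rejected.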
b) R²: The proportion of variance in Y that is explained by the set of independent variables included in the regression model.
$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$, with $0 \le R^2 \le 1$ (first part on formula sheet), where $SST = \sum (Y_i - \bar{Y})^2$

• As additional independent variables are added to an MLR, SSR will increase or stay the same (SST does not change), even if the additional variables are not important predictors (i.e. add no significant information). SSR never decreases when variables are added. Therefore R² may be inflated and hence misleading. Use adjusted R².
• $R^2_{adj} = 1 - \left[(1 - R^2)\,\frac{n-1}{n-k-1}\right] = 1 - \frac{SSE/(n-k-1)}{SST/(n-1)}$ (on formula sheet)
• Adjusted R² ≤ R², but NEVER > R².

ANOVA table:

Source            df       SS     MS                    F
Regression (R)    k        SSR    MSR = SSR/k           Fcalc = MSR/MSE
Residual (E)      n-k-1    SSE    MSE = SSE/(n-k-1)
Total (T)         n-1      SST

• Individual significance tests can be computed for each regression coefficient using a t-test.
• $t_{calc} = \frac{b_j - \beta_j}{S_{b_j}}$
• H0: No contribution ($\beta_j = 0$). If you are testing for a positive or negative relationship, H0 can use ≤ or ≥ instead.
• Additionally, you could calculate the confidence interval. If 0 is not in the interval then there is a significant relationship. Always start the interpretation of the confidence interval with "Holding constant the other independent variables…"
  $b_j \pm t_{\alpha/2,\,n-k-1} \times S_{b_j}$
• $S_{b_j}$ is the standard error of the slope coefficient $b_j$.
• $S_e$ is the standard deviation around the prediction line: $S_e = \sqrt{\frac{SSE}{n-k-1}} = \sqrt{MSE}$
• $S_e$ is measured in the same unit as the dependent variable. The magnitude of $S_e$ can be judged relative to the average Y value.
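
The formulas for R², adjusted R², S_e, the t statistic and the confidence interval can be tied together in a short calculation. This is a sketch under stated assumptions, not a regression routine: the function names `fit_summary` and `slope_test` are made up, all input figures are hypothetical, and the critical value 2.086 is assumed to have been read from a t table for α/2 = 0.025 with 20 degrees of freedom.

```python
import math


def fit_summary(sst: float, sse: float, n: int, k: int):
    """Return (R^2, adjusted R^2, S_e) from the ANOVA sums of squares."""
    df_res = n - k - 1                                  # residual degrees of freedom
    r2 = 1 - sse / sst                                  # R^2 = 1 - SSE/SST
    adj_r2 = 1 - (sse / df_res) / (sst / (n - 1))       # adjusted R^2
    se = math.sqrt(sse / df_res)                        # S_e = sqrt(MSE)
    return r2, adj_r2, se


def slope_test(b_j: float, s_bj: float, t_crit: float, beta_h0: float = 0.0):
    """Return (t_calc, confidence interval, significant?) for one slope."""
    t_calc = (b_j - beta_h0) / s_bj                     # t = (b_j - beta_j) / S_bj
    ci = (b_j - t_crit * s_bj, b_j + t_crit * s_bj)     # b_j +/- t * S_bj
    significant = not (ci[0] <= beta_h0 <= ci[1])       # hypothesised value outside CI
    return t_calc, ci, significant


# Hypothetical figures: SST = 400, SSE = 100, n = 23, k = 2.
r2, adj_r2, se = fit_summary(sst=400.0, sse=100.0, n=23, k=2)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}, S_e = {se:.3f}")

# Hypothetical slope b_j = 1.5 with standard error 0.5; assumed t table value 2.086.
t_calc, ci, significant = slope_test(b_j=1.5, s_bj=0.5, t_crit=2.086)
print(f"t_calc = {t_calc:.2f}, CI = ({ci[0]:.3f}, {ci[1]:.3f}), significant: {significant}")
```

For a two-tailed test at the same α, checking whether 0 lies outside the interval gives the same decision as comparing |t_calc| with the critical value, which is why the sketch reports both from one function.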