PARTIAL AND SEMI-PARTIAL CORRELATIONS

Consider two equations from the homework:

(1) Predicted Sleep = 10.71412 – 0.11702(SWL)
(2) Predicted Sleep = 10.93636 – 0.13302(SWL) + 0.01002(Work Hours)

Why is the regression weight for SWL in equation #1 different from the weight in equation #2? What is the difference in the meaning of these weights?

The weight in equation #2 is called a partial regression weight because Work Hours is partialled from it. It expresses the relationship between SWL and sleep, while controlling for hours of work. This is a type of statistical control.

In addition to partial regression weights, we can express relationships among variables while statistically controlling other variables by using partial and semi-partial correlation coefficients.
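To make the contrast between the two kinds of weight concrete, here is a minimal Python sketch that computes the slope for one predictor alone and then with a second predictor partialled out. The scores are invented for illustration (they are not the homework data), and the function names are hypothetical.

```python
from statistics import mean

def sum_of_products(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def simple_slope(y, x1):
    """Regression weight for x1 when it is the only predictor."""
    return sum_of_products(x1, y) / sum_of_products(x1, x1)

def partial_slope(y, x1, x2):
    """Regression weight for x1 in a two-predictor equation (x2 partialled)."""
    s11 = sum_of_products(x1, x1)
    s22 = sum_of_products(x2, x2)
    s12 = sum_of_products(x1, x2)
    s1y = sum_of_products(x1, y)
    s2y = sum_of_products(x2, y)
    return (s1y * s22 - s2y * s12) / (s11 * s22 - s12 ** 2)

# Hypothetical scores, invented for illustration only.
swl = [10, 14, 18, 22, 25, 28, 30, 33]
work = [50, 45, 48, 40, 42, 35, 38, 30]
sleep = [9.0, 8.5, 8.0, 7.5, 7.8, 6.9, 7.0, 6.5]

b_simple = simple_slope(sleep, swl)
b_partial = partial_slope(sleep, swl, work)
print(f"weight for SWL alone:            {b_simple:.4f}")
print(f"weight for SWL, work partialled: {b_partial:.4f}")
```

The two weights differ whenever the predictors are correlated with each other and the added predictor has a nonzero weight of its own. If the predictors were uncorrelated, the two weights would coincide.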
Definition

A partial correlation coefficient is the Pearson product-moment correlation coefficient (PPMCC) between two variables from which the effects of one or more other variables have been removed (partialled).

For example, suppose we select a random sample of children from Hillsborough County Public Schools and compute the correlation between the children’s height in inches and their scores on a mathematics test. Suppose r = .77. Are we surprised? A third variable (age) is “affecting” both height and math scores; that is, the relationship between height and math scores is confounded by student age. Partial correlation lets us examine the relationship between height and math scores while statistically controlling for age.
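One way to see what "removing the effects" of a third variable means is to regress each of the two variables on the control variable and then correlate the leftover residuals. The sketch below does this in Python. The ages, heights, and math scores are invented for illustration, and the helper names are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson(u, v):
    """Pearson product-moment correlation between two lists."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

def residuals(y, x):
    """Residuals from the least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical children, invented for illustration only.
age = [6, 7, 8, 9, 10, 11, 12, 13]
height = [45, 49, 50, 53, 54, 58, 59, 62]
math_score = [20, 28, 31, 42, 44, 52, 61, 63]

# Zero-order correlation: age is not controlled.
r_zero = pearson(height, math_score)

# Partial correlation: correlate what is left of each variable
# after age has been regressed out of it.
r_partial = pearson(residuals(height, age), residuals(math_score, age))

print(f"zero-order r (height, math): {r_zero:.3f}")
print(f"partial r controlling age:   {r_partial:.3f}")
```

In these made-up data the partial correlation is much smaller than the zero-order correlation, because most of what height and math scores share is their common dependence on age.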
Hypothetical Correlation Matrix:

                     Math Scores (X1)   Height (X2)   Age (X3)
Math Scores (X1)          1.00              .77          .95
Height (X2)                .77             1.00          .80
Age (X3)                   .95              .80         1.00
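The partial correlation between variables X1 and X2 while controlling X3 can be expressed as a function of these bivariate correlations:

$$ r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{1 - r_{13}^{2}}\,\sqrt{1 - r_{23}^{2}}} $$

For these data,

$$ r_{12.3} = \frac{.77 - (.95)(.80)}{\sqrt{1 - .95^{2}}\,\sqrt{1 - .80^{2}}} = \frac{.77 - .76}{(.31225)(.60)} = .053 $$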
The partial correlation (.053) is nearly zero, although the zero-order correlation is quite large (.77).
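The arithmetic can be checked with a few lines of Python. This is a minimal sketch of the first-order partial correlation computed from three bivariate correlations. The function name `partial_corr` is a hypothetical choice, and the inputs are the values from the hypothetical correlation matrix.

```python
from math import sqrt

def partial_corr(r12, r13, r23):
    """Correlation between X1 and X2 with X3 partialled from both."""
    numerator = r12 - r13 * r23
    denominator = sqrt(1 - r13 ** 2) * sqrt(1 - r23 ** 2)
    return numerator / denominator

# X1 = math scores, X2 = height, X3 = age
r = partial_corr(r12=0.77, r13=0.95, r23=0.80)
print(round(r, 3))  # 0.053
```

The function is undefined when either r13 or r23 equals 1 or -1, because the denominator is then zero. In that case the control variable already accounts perfectly for one of the two variables.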
We say that the correlation between height and math scores is largely a spurious correlation – it appears in the data because of the influence of a third variable. Once the third variable is controlled, the correlation disappears.
Why does student height not explain math achievement? Why did we think we needed to look for a confounding variable? Why does age “explain” both math achievement and height? The answer in each case is “theory”: an informal, intuitive kind of theory, but theory nonetheless.
PARTIAL CORRELATION PRACTICE

From last week’s homework: What is the correlation between Hours of Sleep (X1) and Hours of Work (X2) while statistically controlling for SWL (X3)? That is, what is r12.3?
Although the zero-order correlation between hours of sleep and hours of work is negative and quite strong (r12 = -.76), the partial correlation is positive and quite weak (r12.3 = .13).
WITH WHAT DO RESIDUALS CORRELATE?

The residuals from a regression analysis are completely uncorrelated with any of the regressors that gave rise to those residuals.

    data one;
      input idn 1-2 sleep 4-6 swl 8-9 work 11-12;
      * +---------------------------------------+
          Use sample regression equations to
          calculate predicted sleep and residuals
        +---------------------------------------+;
      slphat1 = 10.714121 - (0.117016*SWL);
      resid1 = sleep - slphat1;
      slphat2 = 10.936363 - (0.133019*SWL) + (0.010022*WORK);
      resid2 = sleep - slphat2;
    cards;
    ;
    proc corr;
      var sleep swl work slphat1 slphat2 resid1 resid2;
      title2 'With What do residuals Correlate?';
    run;
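The same property can be checked outside SAS. The Python sketch below fits a one-predictor least-squares line to invented scores (not the homework data), computes predicted values and residuals, and correlates the residuals with the regressor, the predicted values, and the outcome. All names are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson(u, v):
    """Pearson product-moment correlation between two lists."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

def fit_line(x, y):
    """Least-squares intercept and slope for predicting y from x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Hypothetical scores, invented for illustration only.
swl = [10, 14, 18, 22, 25, 28, 30, 33]
sleep = [9.0, 8.5, 8.0, 7.5, 7.8, 6.9, 7.0, 6.5]

intercept, slope = fit_line(swl, sleep)
predicted = [intercept + slope * x for x in swl]
resid = [y - p for y, p in zip(sleep, predicted)]

r_resid_swl = pearson(resid, swl)          # zero, up to rounding error
r_resid_pred = pearson(resid, predicted)   # zero, up to rounding error
r_resid_sleep = pearson(resid, sleep)      # not zero

print(f"residuals with SWL:       {r_resid_swl:.6f}")
print(f"residuals with predicted: {r_resid_pred:.6f}")
print(f"residuals with sleep:     {r_resid_sleep:.6f}")
```

The residuals are uncorrelated with the regressor and with the predicted values, but they do correlate with the outcome itself, since they are the part of the outcome that the regressor leaves unexplained. When predicted values are built from rounded coefficients, as in the SAS data step, the first two correlations may come out very close to zero rather than exactly zero.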
With What do residuals Correlate?

Pearson Correlation Coefficients, N = 10
Prob > |r| under H0: Rho=0

sleep