How linear?
Ani Adhikari and Philip Stark
Statistics 2.1X
Lecture 6.2
[Figure: three scatter diagrams of y against x; x ticks run from −2 to 2, y ticks from −3 to 3.]
Correlation coefficient (r): a number between −1 and 1 that measures linear association, that is, how tightly the points are clustered about a straight line.
Example
Data: (1, 2) (2, 3) (3, 1) (4, 6) (5, 6)
[Figure: scatter diagram of the five data points; x ticks run from 1 to 5, y ticks from 1 to 6.]
Expect r to be positive but not 1.
Calculating r
x                        1      2      3      4      5      mean = 3, SD = 1.41
y                        2      3      1      6      6      mean = 3.6, SD = 2.06
x in std. units         −1.41  −0.71   0      0.71   1.41
y in std. units         −0.78  −0.29  −1.26   1.16   1.16
product of std. units    1.10   0.21   0      0.82   1.64   mean = 0.75 = r
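The rows of the table can be reproduced with a short Python sketch. It assumes the SD is computed with divisor n (the population SD, `statistics.pstdev`), which is what matches the SDs of 1.41 and 2.06 above. The helper name `standard_units` is made up for this illustration.

```python
from statistics import fmean, pstdev

# Data from the example: (1, 2) (2, 3) (3, 1) (4, 6) (5, 6)
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 1, 6, 6]

def standard_units(values):
    """Convert a list to standard units: (value - mean) / SD."""
    mu = fmean(values)
    sigma = pstdev(values)  # SD with divisor n (assumption, matches the table)
    return [(v - mu) / sigma for v in values]

x_su = standard_units(xs)
y_su = standard_units(ys)

# Multiply corresponding pairs of standard units, then average.
products = [a * b for a, b in zip(x_su, y_su)]
r = fmean(products)

print("x in std. units:", [round(v, 2) for v in x_su])
print("y in std. units:", [round(v, 2) for v in y_su])
print("products:       ", [round(v, 2) for v in products])
print("r =", round(r, 3))
```

Because the code carries full precision through every step, the last digit of an entry can differ from the rounded values in the table; r comes out at about 0.755.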
The formula in two languages
Formula for r
1. Convert both lists to standard units.
2. Multiply corresponding pairs of standard units.
3. r is the average of the products.
For those who like math notation and have read the algebra supplement: if the data are (x_i, y_i), 1 ≤ i ≤ n, then
\[
r = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \mu_x}{\sigma_x} \right) \left( \frac{y_i - \mu_y}{\sigma_y} \right)
\]
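As a third "language", the same formula can be written as a small Python function. This is only a sketch: the name `correlation` and the error handling are choices made for this illustration, and σ is taken to be the SD with divisor n.

```python
from math import sqrt

def correlation(xs, ys):
    """r = (1/n) * sum of ((x_i - mu_x) / sigma_x) * ((y_i - mu_y) / sigma_y)."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("need two non-empty lists of equal length")
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # SDs with divisor n (assumption for this sketch)
    sigma_x = sqrt(sum((x - mu_x) ** 2 for x in xs) / n)
    sigma_y = sqrt(sum((y - mu_y) ** 2 for y in ys) / n)
    if sigma_x == 0 or sigma_y == 0:
        # Standard units are undefined for a constant list.
        raise ValueError("r is undefined when a list has SD 0")
    total = sum(((x - mu_x) / sigma_x) * ((y - mu_y) / sigma_y)
                for x, y in zip(xs, ys))
    return total / n

print(round(correlation([1, 2, 3, 4, 5], [2, 3, 1, 6, 6]), 3))
```

The guard against an SD of 0 is there because a list whose entries are all equal cannot be converted to standard units, so r has no value in that case.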
Ani Adhikari and Philip Stark
Statistics 2.1X
Lecture 6.2
4/7
Properties of r
1. The calculation uses only standard units. So r is a pure number with no units.
2. −1 ≤ r ≤ 1. Trust me. The extreme cases: r = −1 is when the scatter diagram is a perfect straight line sloping down; r = 1 is when the scatter diagram is a perfect straight line sloping up.
3. It doesn’t matter if you switch the variables x and y; r stays the same.
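Properties 2 and 3 can be checked numerically. The sketch below uses the five-point example data from earlier in the lecture plus two made-up straight lines (y = 2x + 1 and y = 10 − 3x, chosen only for illustration); the helper `corr` is a hypothetical name, and the SD uses divisor n.

```python
from math import isclose
from statistics import fmean, pstdev

def corr(xs, ys):
    """Average of the products of standard units (SD with divisor n)."""
    mx, my = fmean(xs), fmean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return fmean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])

xs = [1, 2, 3, 4, 5]
ys = [2, 3, 1, 6, 6]

# Extreme cases: points lying exactly on a straight line (made-up lines).
r_up = corr(xs, [2 * x + 1 for x in xs])     # line sloping up
r_down = corr(xs, [10 - 3 * x for x in xs])  # line sloping down

# Switching the roles of x and y.
r_xy = corr(xs, ys)
r_yx = corr(ys, xs)

print("line sloping up:  ", round(r_up, 6))
print("line sloping down:", round(r_down, 6))
print("switching x and y leaves r unchanged:", isclose(r_xy, r_yx))
```

The comparisons use `isclose` rather than `==` because floating-point rounding can leave the computed value a hair away from exactly 1 or −1.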
Switching axes doesn’t affect linearity
[Figure: two scatter diagrams of the same data, one with midterm (ticks 0 to 20) on the horizontal axis and final (ticks 0 to 40) on the vertical axis, the other with the axes switched.]
Linear transformations
4. Adding a constant to one of the lists just slides the scatter diagram, so r stays the same.
5. Multiplying one of the lists by a positive constant does not change standard units, so r stays the same.
6. Multiplying just one (not both) of the lists by a negative constant switches the signs of the standard units of that variable, so r has the same absolute value but its sign gets switched.
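These three properties are easy to try out on the example data from earlier in the lecture. In the sketch below, the constants 100, 2.5 and −3 are arbitrary choices for illustration, and `corr` is a hypothetical helper that averages the products of standard units (SD with divisor n).

```python
from math import isclose
from statistics import fmean, pstdev

def corr(xs, ys):
    """Average of the products of standard units (SD with divisor n)."""
    mx, my = fmean(xs), fmean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return fmean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])

xs = [1, 2, 3, 4, 5]
ys = [2, 3, 1, 6, 6]
r = corr(xs, ys)

# 4. Add a constant to one list: the scatter diagram slides, r is unchanged.
r_shifted = corr([x + 100 for x in xs], ys)

# 5. Multiply one list by a positive constant: standard units are unchanged.
r_scaled = corr([2.5 * x for x in xs], ys)

# 6. Multiply one list by a negative constant: the sign of r flips.
r_flipped = corr([-3 * x for x in xs], ys)

print("original r:       ", round(r, 3))
print("after adding 100: ", round(r_shifted, 3))
print("after times 2.5:  ", round(r_scaled, 3))
print("after times -3:   ", round(r_flipped, 3))
```

Changing units of measurement, such as converting inches to centimetres, is an instance of property 5, which is another way to see why r carries no units.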