Business Analytics Predictive Modeling using Linear Regression



© Pristine – www.edupristine.com

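4.b.Explained and Unexplained Variation

[Figure: plot of y against x marking, at a point Xi, the observed value yi, the fitted value ŷ, and the mean ȳ]

• SST = Total sum of squares: $SST = \sum (y_i - \bar{y})^2$
• SSE = Sum of squared errors: $SSE = \sum (y_i - \hat{y}_i)^2$
• RSS = Regression sum of squares: $RSS = \sum (\hat{y}_i - \bar{y})^2$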

4.b.Explained and Unexplained Variation (Cont…)

• SST = Total sum of squares
  • Measures the variation of the yi values around their mean ȳ
• SSE = Sum of squared errors
  • Variation attributable to factors other than the relationship between x and y
• SSR = Regression sum of squares
  • Explained variation attributable to the relationship between x and y

4.b.Explained and Unexplained Variation (Cont…)

Total variation is made up of two parts:

$SST = SSE + RSS$

• SST = Total Sum of Squares: $SST = \sum (y - \bar{y})^2$
• SSE = Sum of Squared Errors: $SSE = \sum (y - \hat{y})^2$
• RSS = Regression Sum of Squares, also known as the sum of squares of regression (SSR): $SSR = \sum (\hat{y} - \bar{y})^2$

Where:

• ȳ = Average value of the dependent variable
• y = Observed values of the dependent variable
• ŷ = Estimated value of y for the given x value
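
The decomposition above can be checked numerically. The following is a minimal sketch in plain Python, assuming an ordinary least-squares line with an intercept (the case in which SST = SSE + SSR holds). The data values and the helper names `fit_line` and `variation` are made up for illustration and do not come from the slides.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def variation(x, y):
    """Return (SST, SSE, SSR) for a least-squares line fitted to (x, y)."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained
    return sst, sse, ssr


# Made-up illustrative data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, sse, ssr = variation(x, y)
print(sst, sse, ssr)
```

Up to floating-point rounding, the printed SST should equal the sum of the printed SSE and SSR.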

4.b.Coefficient of Determination, R²

• The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
• The coefficient of determination is also called R-squared and is denoted as R²

$R^2 = \frac{SSR}{SST}$, where $0 \le R^2 \le 1$
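
As a worked illustration with hypothetical numbers (assumed for the example, not taken from the slides), suppose a fitted regression has SST = 6 and SSR = 3.6:

```latex
R^2 = \frac{SSR}{SST} = \frac{3.6}{6} = 0.6,
\qquad
SSE = SST - SSR = 6 - 3.6 = 2.4
```

Under these assumed values, 60% of the variation in y is explained by the regression and the remaining 40% is unexplained.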

4.b.Coefficient of Determination, R² (Cont…)

• Coefficient of determination:

$R^2 = \frac{SSR}{SST} = \frac{\text{sum of squares explained by regression}}{\text{total sum of squares}}$

• Note: In the single independent variable case, the coefficient of determination is

$R^2 = r^2$

• Where:
  • R² = Coefficient of determination
  • r = Simple correlation coefficient
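
The single-variable identity R² = r² can be checked with a short sketch. It assumes a least-squares line with an intercept; the data and the helper names `pearson_r` and `r_squared` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Simple (Pearson) correlation coefficient between x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared(x, y):
    """R^2 = SSR / SST for a least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((b0 + b1 * a - y_bar) ** 2 for a in x)
    sst = sum((b - y_bar) ** 2 for b in y)
    return ssr / sst


# Made-up illustrative data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r2 = r_squared(x, y)
print(r, r ** 2, r2)
```

The last two printed values should agree up to rounding. Note that r keeps the sign of the slope, while R² does not.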

4.b.Examples of Approximate R² Values

R² = 1

Perfect linear relationship between x and y: 100% of the variation in y is explained by variation in x

[Figure: two plots of y against x, labeled R² = 1 and R² = +1]

4.b.Examples of Approximate R² Values (Cont…)

0 < R² < 1

Weaker linear relationship between x and y: Some but not all of the variation in y is explained by variation in x

[Figure: two plots of y against x]

4.b.Examples of Approximate R² Values (Cont…)

R² = 0

No linear relationship between x and y: The value of y does not depend linearly on x. (None of the variation in y is explained by variation in x)

[Figure: plot of y against x, labeled R² = 0]
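
The three situations (R² = 1, 0 < R² < 1, R² = 0) can be reproduced with small made-up datasets. This is a sketch only: the numbers are invented for illustration, `r_squared` is a hypothetical helper, and it assumes a least-squares line with an intercept and a y series that is not constant (otherwise SST is zero and R² is undefined).

```python
def r_squared(x, y):
    """R^2 = SSR / SST for a least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((b0 + b1 * a - y_bar) ** 2 for a in x)
    sst = sum((b - y_bar) ** 2 for b in y)
    return ssr / sst


# Made-up illustrative data, not taken from the slides.
x = [1, 2, 3, 4, 5]
cases = {
    "perfect, upward slope": [3, 5, 7, 9, 11],
    "perfect, downward slope": [11, 9, 7, 5, 3],
    "weaker relationship": [2, 4, 5, 4, 5],
    "no linear relationship": [3, 1, 2, 1, 3],
}
results = {label: r_squared(x, y) for label, y in cases.items()}
for label, value in results.items():
    print(f"{label}: R^2 = {value:.2f}")
```

Both perfect cases should give a value of 1 regardless of the direction of the slope, the third a value strictly between 0 and 1, and the last a value of 0.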

4.b.Limitations of Regression Analysis

• Parameter instability: the estimated relationship changes over time because the underlying correlations change. This is very common in financial markets, where economic, tax, regulatory, and political factors change frequently.
• Public knowledge of a specific regression relation may cause a large number of people to react in a similar fashion towards the variables, negating its future usefulness.
• If any regression assumptions are violated, predicted values of the dependent variable and hypothesis tests will not be valid.
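
Parameter instability can be illustrated by fitting the same regression on two sub-periods and comparing the slopes. The sketch below uses a synthetic series with a deliberate regime change after period 5; the data and the helper name `slope` are assumptions made for the example, not material from the slides.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


# Synthetic series: y rises with x in periods 1-5, then falls in periods 6-10.
x = list(range(1, 11))
y = [3, 5, 7, 9, 11, 14, 13, 12, 11, 10]

slope_early = slope(x[:5], y[:5])  # periods 1-5
slope_late = slope(x[5:], y[5:])   # periods 6-10
slope_full = slope(x, y)           # whole sample
print(slope_early, slope_late, slope_full)
```

In this constructed example the full-sample slope falls between the two sub-period slopes and describes neither regime well, which is the practical risk when a relationship estimated on past data is used for prediction.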

Thank you!

Pristine 702, Raaj Chambers, Old Nagardas Road, Andheri (E), Mumbai-400 069. INDIA www.edupristine.com Ph. +91 22 3215 6191
