ANALYSIS OF VARIANCE (ANOVA)

INTRODUCTION
• More than two groups (J > 2), e.g. treatment vs. standard treatment vs. control
• We could compare each pair, but running multiple t-tests inflates the Type I error rate
  o i.e., if one group is "atypical" (not representative; produces an effect that doesn't actually exist in the population) by chance, this produces two Type I errors
  o ANOVA simultaneously compares J means in one omnibus test, keeping the Type I error rate at α
  o Assumptions are the same as for the t-test
• "One-way" analysis of variance → just one independent variable
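The inflation argument can be made concrete with a short calculation. This is an illustrative sketch (the function name is mine, and it assumes the tests are independent; real pairwise t-tests on the same data are correlated, so treat the number as an approximation):

```python
def familywise_error(alpha, m):
    """P(at least one Type I error) across m independent tests at level alpha."""
    # Each test avoids a false positive with probability (1 - alpha);
    # assuming independence, all m avoid one with probability (1 - alpha)**m.
    return 1 - (1 - alpha) ** m

# With J = 4 groups there are J * (J - 1) / 2 = 6 pairwise comparisons:
rate = familywise_error(0.05, 6)   # ≈ 0.265, far above the nominal .05
```

So even a modest number of groups pushes the chance of at least one spurious "significant" pair well past α, which is exactly why the omnibus test is preferred.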

HYPOTHESES
• Null hypothesis: H0: µ1 = µ2 = µ3 = … = µJ (for J groups, or levels, J ≥ 2)
  o "Nothing systematic happening": the samples are from the same population, there are no differential treatment effects, all means are equal, there are no differences in the dependent variable as a function of the IV
• Alternate hypothesis: HA: not H0
  o At least one mean is different (not directional)

VARIANCE
• We want to compare means, so why analysis of variance?
  o Why do scores vary? i.e., why does Person A in Group 1 have a different score to Person B in Group 2?
  o 1. Because they are experiencing different "treatments" – between groups
  o 2. For other reasons not accounted for – between groups and within groups
• Within-group variability
  o Individual differences, "random" error; basically, any variable not accounted for in the analysis
  o Because of the assumption of homogeneity of variance, we estimate within-group variability using pooled variance:

    MSW = SSW / dfW = (Σ SSj) / (N − J)

  o MSW = the estimate of common population variance based on variability within groups
  o SSW = the sum of all the SS (comparing each score to its group mean)
  o The degrees of freedom within groups: dfW = N − J
• Between-group variability
  o Group means vary around the grand mean (just as raw scores vary around group means)

  o Grand mean: X̄G = (sum of all scores) / N
  o Variance between means = sum of squared deviations of group means from the grand mean

    MSB = SSB / dfB = n Σ (X̄j − X̄G)² / (J − 1)

  o SSB = n times the sum of squared deviations of group means from the grand mean (comparing each group mean to the grand mean)
  o The degrees of freedom between groups: dfB = J − 1
  o If all means are equal, MSB = 0, F = 0
• Variability if H0 is true / not true
  o If the population means are different (i.e., H0 is not true), the sample means will still differ because of sampling variability, but also because of systematic group differences
  o If H0 is true, MSB = error variance (and should be about the same as MSW)
  o If H0 is not true, MSB = error variance + treatment effect (and should be greater than MSW)

F RATIO
• Calculating the F ratio is our goal:

    F = MSB / MSW = (between-group variance) / (within-group variance) = (treatment + error) / error

  o If H0 is true, then treatment = 0 (F ≈ 1)
  o If H0 is not true, then treatment > 0 (F > 1)
• The F tables: we now have two degrees of freedom
  o 5 percent points → α = 0.05 (use this one unless told otherwise); 1 percent points → α = 0.01
• Compare observed F to critical F
  o If observed F > critical F, reject H0
• ANOVA Summary Table:
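The summary-table quantities can be computed directly from raw scores. A minimal sketch (the function name and the scores are made up for illustration):

```python
def one_way_anova(groups):
    """Return (SSB, SSW, dfB, dfW, MSB, MSW, F) for a list of score lists."""
    scores = [x for g in groups for x in g]
    N, J = len(scores), len(groups)
    grand_mean = sum(scores) / N

    # Between: each group's n times the squared deviation of its mean
    # from the grand mean (SSB); dividing by dfB = J - 1 gives MSB.
    SSB = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within: every score's squared deviation from its own group mean (SSW);
    # dividing by dfW = N - J gives MSW, the pooled variance estimate.
    SSW = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    dfB, dfW = J - 1, N - J
    MSB, MSW = SSB / dfB, SSW / dfW
    return SSB, SSW, dfB, dfW, MSB, MSW, MSB / MSW

# Made-up scores for three groups of three:
SSB, SSW, dfB, dfW, MSB, MSW, F = one_way_anova([[1, 2, 3], [2, 3, 4], [4, 5, 6]])
# Here F = MSB / MSW = 7.0 / 1.0 = 7.0; compare against critical F(2, 6)
```

The observed F is then compared with the tabled critical F for (dfB, dfW) at the chosen α.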

 

EFFECT SIZE
• We have multiple means, so effect size is the sum of squares (between) divided by the sum of squares (total):

    η² = SSB / SST
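As a quick worked sketch (the SS values are assumed for illustration, not from the notes):

```python
# Eta-squared: the proportion of total variability in the DV attributable
# to group membership. SS values are made-up illustration numbers.
SSB, SSW = 14.0, 6.0
SST = SSB + SSW            # total SS partitions into between + within
eta_squared = SSB / SST    # 14 / 20 = 0.7
```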

CALCULATING DEGREES OF FREEDOM
• Number of observations minus number of constraints, e.g. F(5, 60)
  o J − 1 = 5 → so J = 6
  o N − J = 60 → so N = 66
  o Therefore there are 66 participants and 6 different groups
  o If n's are equal, there are 11 participants per group

MULTIPLE COMPARISONS
• Omnibus test:
  o Observed F = 6.28 > Critical F = 2.84, reject H0
  o Conclusion: Amount of bacteria retained appears to depend on whether and how a beard is washed, F(3, 44) = 6.28, p < .05.
  o The Type I error rate is .05 because there is only one inference, i.e., that the group means are not all the same. This is unsatisfactory. Where are the differences? How can we test other differences without inflating the Type I error rate?
  o → we use multiple comparisons
• Tukey's HSD (Honestly Significant Difference)
  o Tests all pairwise comparisons while maintaining the Type I error rate at the specified level
  o e.g., the experiment gets an α of .05 to share (experiment-wise error rate, EER), rather than each decision getting .05 (decision-wise error rate, DER)
  o Conclusion: Pairwise comparisons using the Tukey HSD procedure at the .05 level revealed that bacteria count was significantly lower for the no-beard group when compared with the other three groups. Bacteria counts after splash wash, shower stream wash and no wash did not differ significantly from each other.
• Post hoc tests
  o q, the studentised range statistic
  o Considers how often the largest difference between means would be significant if the omnibus H0 is true
  o Not necessarily "post hoc"
• Omnibus test versus multiple comparisons; it is possible that:
  o the omnibus test is significant but no pairwise comparisons are
  o the omnibus test is non-significant but some pairwise comparisons are
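A sketch of the HSD decision rule for equal group sizes. The critical value of q depends on J, dfW and α and comes from studentized-range tables; the q, MSW and n values below are assumed purely for illustration, not taken from the beard example:

```python
import math

def tukey_hsd_threshold(q_crit, MSW, n):
    """Smallest pairwise mean difference Tukey's HSD declares significant
    (equal group sizes n); q_crit is read from studentized-range tables."""
    return q_crit * math.sqrt(MSW / n)

# Assumed illustration values: q_crit = 3.79, MSW = 20.0, n = 12 per group.
hsd = tukey_hsd_threshold(q_crit=3.79, MSW=20.0, n=12)
# Any pair of group means differing by more than hsd is "honestly" different.
```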

FACTORIAL ANOVA

TWO-WAY ANOVA
• Factorial = every possible combination of levels
• Example: what might determine a client's satisfaction with therapy? Maybe type of therapy, maybe the experience of the therapist, or maybe the combination of these two.
• A two-way ANOVA has 2 independent variables
  o In a 2 × 2 design (4 groups in total), each independent variable has 2 levels – but two-way ANOVAs can also be 2 × 3, 4 × 6, etc. – there are J × K groups in total
• In a two-way ANOVA, there will be three effects → three hypotheses:
  1. The effect of one IV (e.g., type of therapy) on the DV, e.g. "is there a difference in client satisfaction between group and individual therapy?"
  2. The effect of the other IV (e.g., therapist experience) on the DV, e.g. "is there a difference in client satisfaction between new and experienced therapists?"
  3. The interaction between the two IVs, e.g. "does the difference between new and experienced therapists depend on type of therapy?" OR "does the difference between group and individual therapy depend on therapist experience?"
• An interaction indicates that the influence of one IV on the DV changes according to the level of the other IV
  o Independent of the main effects
  o It is different from the sum of the individual effects; it qualifies them
• Estimating the components of variance (partitioning variance):

 

  o SST = SS(A) + SS(B) + SS(A×B) + SSW
• Critical F is based on df(effect) (separate for each effect) and df(error) (the same for all effects)
• Effect size: accounted for → SS(A)/SST + SS(B)/SST + SS(A×B)/SST; unaccounted for → SSW/SST
• Advantages:
  o ANOVA can find interaction effects
  o ANOVA is more economical (regarding sample size)
  o ANOVA can account for more variability of the DV
    ▪ Suppose we had just looked at the effect of one IV (a one-way ANOVA) with the same overall means for the DV. Total SS and the SS for that effect are still the same, so η² for this effect is still the same
    ▪ But now within-group variability is a lot higher, because the variance that the other IV and the interaction accounted for is now unaccounted for and is treated as error variance
    ▪ Larger MSW (within-group variance/individual differences/error) = less power
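The partition of accounted-for variance can be sketched numerically (all SS values below are assumed for illustration):

```python
# Partitioning total SS in a two-way ANOVA. All SS values are made-up
# illustration numbers for a hypothetical J x K design.
SS_A, SS_B, SS_AxB, SSW = 30.0, 20.0, 10.0, 40.0
SST = SS_A + SS_B + SS_AxB + SSW   # = 100.0

eta_A = SS_A / SST          # 0.30 accounted for by IV A
eta_B = SS_B / SST          # 0.20 accounted for by IV B
eta_AxB = SS_AxB / SST      # 0.10 accounted for by the interaction
unaccounted = SSW / SST     # 0.40 left as error variance (SSW / SST)

# The four proportions exhaust the total variability:
assert abs(eta_A + eta_B + eta_AxB + unaccounted - 1.0) < 1e-12
```

Dropping IV B and the interaction would fold their 30% of the variance into SSW, inflating MSW and costing power, which is the advantage described above.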

THE ANOVA MODEL
• A person's score = Overall mean + Any systematic group effect + Individual error:

    Yij = µ + αj + εij

  o Yij = score on the DV for the ith person in the jth group
  o µ = grand mean
  o αj = effect parameter for group j (simply because αj = µj − µ)
  o εij = random error for the ith person in the jth group
• Important feature: Σ αj = 0
• All we have to do to calculate the DV score (which is usually the only data we have) is to add each part of the ANOVA model together

OMNIBUS F TEST (ANOVA)
• If there are NO systematic effects of the IV on the DV, then all αj = 0 → H0: α1 = α2 = α3 = … = αJ = 0
• If there are systematic effects of the IV on the DV, then NOT all αj = 0 → HA: not H0 (the means are not all equal; the αj are not all 0)
• To make inferences about these two possibilities, we calculate an observed F ratio, which is the ratio of between-groups variability to within-groups variability
  o We can then calculate the probability of obtaining this F ratio if H0 is true
  o If the p value > α, we retain H0; if the p value < α, we reject H0

THE ANOVA MODEL: EXAMPLE
• To see how the model works, start with an omniscient example where we know the systematic effects exactly (note that this never happens in real life)
• EXAMPLE: imagine that there are three possible treatments of anxiety (no treatment, CBT, and group therapy). Our DV is 'freedom from anxiety' (where a higher score = less anxiety) and we know that:
  o 1) No treatment increases anxiety by 5 points → α1 = −5
  o 2) CBT decreases anxiety by 3 points → α2 = 3
  o 3) Group therapy decreases anxiety by 2 points → α3 = 2
• Let's say we also know the population mean freedom from anxiety is 10 points: µ = 10. Therefore:
  o µ1 = µ + α1 = 10 + (−5) = 5
  o µ2 = µ + α2 = 10 + 3 = 13
  o µ3 = µ + α3 = 10 + 2 = 12
• But with no error term, this would mean that all individuals within a group score the same
• Error was built into the ANOVA model to make it realistic (Yij = µ + αj + εij). Assumptions:
  o 1) Errors are independent
  o 2) Normally distributed
  o 3) Have a mean of zero, i.e. Σ(εij) = 0
  o 4) Homogeneity of variance (homoscedasticity)
  o Which can all be summarised as: εij ~ N(0, σ²ε)
• (A small table of example scores for groups j = 1, 2, 3 appeared here; realistic data like these show variability both within groups (εij) and between groups (both αj and εij))
• When all we have is access to scores on the DV, in order to get to a position where we can make inferences about whether or not there are systematic effects, we first need to estimate the population parameters
• There are a number of steps involved in calculating the F ratio, which can be summarised in the following table:

    Source  | SS                      | df             | MS            | F
    Between | SSB = n Σj (Ȳj − Ȳ)²    | dfB = J − 1    | MSB = SSB/dfB | F = MSB/MSW
    Within  | SSW = Σi Σj (Yij − Ȳj)² | dfW = J(n − 1) | MSW = SSW/dfW |
    Total   | SST = Σi Σj (Yij − Ȳ)²  | dfT = N − 1    |               |
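The model Yij = µ + αj + εij can be sketched as a score generator. µ = 10 and the αj values follow the anxiety example in the notes; the error SD of 2.0 is an assumption added purely for illustration:

```python
import random

# mu and the alpha_j values come from the anxiety example (mu = 10;
# alpha = -5 for no treatment, +3 for CBT, +2 for group therapy).
# Note the alphas sum to zero, as the model requires.
mu = 10.0
alpha = {"no treatment": -5.0, "CBT": 3.0, "group therapy": 2.0}

def simulate_score(group, error_sd=2.0):
    """Y_ij = mu + alpha_j + eps_ij, with eps_ij ~ N(0, error_sd**2).
    The error SD of 2.0 is an illustrative assumption."""
    return mu + alpha[group] + random.gauss(0.0, error_sd)

# With no error term, every member of a group scores mu + alpha_j exactly:
group_means = {g: mu + a for g, a in alpha.items()}
# group_means == {"no treatment": 5.0, "CBT": 13.0, "group therapy": 12.0}
```

Repeatedly calling `simulate_score` for each group produces data with both within-group spread (εij) and between-group spread (αj), which is what a real sample looks like.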

ESTIMATES
• To do this, the ANOVA Model can be re-expressed in terms of variability:

    Yij = µ + αj + εij
    (Yij − µ) = (µj − µ) + (Yij − µj)
    Total variability = Systematic between-group variability + Unaccounted-for within-group variability

• The population mean, µ, and the group means, µj, can be estimated from the observed data:
  o µ̂ = Ȳ = grand sample mean
  o µ̂j = Ȳj = sample group mean
• So we can obtain estimates of the population variability via:

    (Yij − Ȳ) = (Ȳj − Ȳ) + (Yij − Ȳj)

• But deviation scores sum to zero, so these scores need to be squared first
• We can then work out SST, SSB and SSW using the formulas:

    SSB = n Σj (Ȳj − Ȳ)²
    SSW = Σi Σj (Yij − Ȳj)²
    SST = Σi Σj (Yij − Ȳ)²

• We can then convert these SS to variance estimates (mean squares, MS) by dividing by the relevant degrees of freedom, i.e. MS = SS/df
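The squaring-and-partitioning step can be checked numerically: after squaring, SST should equal SSB + SSW exactly (the scores below are made up for illustration):

```python
# Verify the sum-of-squares partition SST = SSB + SSW on made-up data
# for three groups of three scores.
groups = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
scores = [x for g in groups for x in g]
grand = sum(scores) / len(scores)

SST = sum((x - grand) ** 2 for x in scores)
SSB = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
SSW = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# The raw deviations (Yij - grand) sum to zero, which is why they must be
# squared before they can measure variability; the squared pieces partition:
assert abs(SST - (SSB + SSW)) < 1e-9
```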