Statistical Inference Course Project Part 2

Anna Bayes, May 25, 2016

Loading and Exploring

What are the dimensions of the ToothGrowth data?

Rows  Columns
60    3

Let’s look at a sample:

len   supp  dose
4.2   VC    0.5
11.5  VC    0.5
7.3   VC    0.5
5.8   VC    0.5
6.4   VC    0.5
10    VC    0.5
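The dimensions and sample rows above can be reproduced with a few base R calls. This is a minimal sketch, assuming the report used the ToothGrowth data frame that ships with R's datasets package; the exact calls in the original are not shown, so `dim()`, `head()` and `str()` here are a guess at the simplest equivalent.

```r
# ToothGrowth ships with R's built-in 'datasets' package
data(ToothGrowth)

dim(ToothGrowth)    # number of rows and columns
head(ToothGrowth)   # first six observations
str(ToothGrowth)    # column types: len (numeric), supp (factor), dose (numeric)
```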

Here are the counts of the data for each value of supp:

OJ  VC
30  30

Here are the counts of the data for each value of dose:

0.5  1   2
20   20  20
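The group counts above can be checked with `table()`. This is a sketch rather than the report's own code; the variable names `supp_counts`, `dose_counts` and `cross_counts` are introduced here for illustration. The cross-tabulation also shows how the observations are spread over each supp and dose combination.

```r
data(ToothGrowth)

# Observations per supplement type and per dose level
supp_counts <- table(ToothGrowth$supp)
dose_counts <- table(ToothGrowth$dose)

# Observations per supp/dose combination
cross_counts <- table(ToothGrowth$supp, ToothGrowth$dose)

print(supp_counts)
print(dose_counts)
print(cross_counts)
```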

Here’s a summary of the len variable:

Min.  1st Qu.  Median  Mean   3rd Qu.  Max.
4.2   13.08    19.25   18.81  25.28    33.9
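A five-number summary plus the mean, like the one above, is what `summary()` returns for a numeric vector. The sketch below assumes that is how the table was produced; `len_summary` is a name introduced here for illustration.

```r
data(ToothGrowth)

# Min, quartiles, median, mean and max of tooth length
len_summary <- summary(ToothGrowth$len)
print(len_summary)
```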

A picture is worth a thousand words. Let’s plot the distribution of the ‘len’ variable. We can look at it on its own and split up by the supp and dose variables:

[Figure: Distribution of 'len'. Histogram with len on the x-axis and count on the y-axis.]

[Figure: Distribution of 'len' summarized by 'supp' (rows: OJ, VC) and 'dose' (columns: 0.5, 1, 2). Histograms with len on the x-axis and count on the y-axis.]
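Plots along these lines can be sketched with base R graphics. This is not necessarily how the original figures were drawn, since the plotting code is not shown; the bin count, the panel titles and the helper names `old_par` and `panel` are assumptions made for this example.

```r
data(ToothGrowth)

# Overall distribution of 'len'
hist(ToothGrowth$len, breaks = 10,
     main = "Distribution of 'len'", xlab = "len", ylab = "count")

# One panel per supp (rows) and dose (columns), sharing the same x range
old_par <- par(mfrow = c(2, 3))
for (s in levels(ToothGrowth$supp)) {
  for (d in sort(unique(ToothGrowth$dose))) {
    panel <- ToothGrowth$len[ToothGrowth$supp == s & ToothGrowth$dose == d]
    hist(panel, xlim = range(ToothGrowth$len),
         main = paste(s, "/ dose", d), xlab = "len", ylab = "count")
  }
}
par(old_par)  # restore the previous plot layout
```

With only 10 observations per panel, the shapes of the individual histograms should be read with caution.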

Compare tooth growth by supp and dose

We will use confidence intervals and hypothesis tests to compare tooth growth by supp and dose, to see whether these 2 variables have any effect on tooth growth.

Let’s test whether the true difference in the means for the 2 supp values is 0.

H0 : mu_supp_OJ = mu_supp_VC
Ha : mu_supp_OJ ≠ mu_supp_VC

We will do a two-sided t test and look at the 95% confidence interval. Since the distributions of ‘len’ looked roughly normal and there are only 60 observations in total, we will base the test on the t distribution.

t.test(ToothGrowth$len ~ ToothGrowth$supp, alternative = "two.sided")$conf.int

## [1] -0.1710156 7.5710156
## attr(,"conf.level")
## [1] 0.95
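The interval reported by `t.test()` can be reproduced by hand, which makes clear what is being computed. Because the call does not set `var.equal`, R runs the Welch (unequal variance) version of the test by default. The sketch below rebuilds that interval from the group means, variances and the Welch-Satterthwaite degrees of freedom; the names `oj`, `vc`, `se`, `df`, `manual_ci` and `tt` are introduced here for illustration.

```r
data(ToothGrowth)

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Standard error of the difference in means (variances not pooled)
v_oj <- var(oj) / length(oj)
v_vc <- var(vc) / length(vc)
se <- sqrt(v_oj + v_vc)

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (v_oj + v_vc)^2 /
  (v_oj^2 / (length(oj) - 1) + v_vc^2 / (length(vc) - 1))

# 95% confidence interval for mean(OJ) - mean(VC)
manual_ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df) * se

# The same interval from t.test(), plus the p-value of the two-sided test
tt <- t.test(len ~ supp, data = ToothGrowth)
print(manual_ci)
print(tt$conf.int)
print(tt$p.value)
```

The p-value gives the same decision as the interval: it is above 0.05 exactly when the 95% interval contains 0.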

Since the interval contains 0, we do NOT reject the null hypothesis that the difference in means for the 2 supp values is 0.

Let’s test whether the true difference in means between the 3 dose values is 0. Since a t test compares only 2 groups at a time, we need to create 3 subsets of the data, each with only 2 of the dose values.

First, let’s compare dose = 0.5 with dose = 1.

H0 : mu_dose_.5 = mu_dose_1
Ha : mu_dose_.5 ≠ mu_dose_1

TGD.05_1