Stat 201: Homework 3

Ryan Kelley
September 27, 2016

Chapter 2 Problems

2.75
a) Five Number Summary:
• Minimum: 60 (million BTU)
• Q1: 130
• Median: 160
• Q3: 250
• Maximum: 665

b) 0.46 standard deviations below the mean of 195 (z = -0.46)
c) 1.16 standard deviations above the mean of 195 (z = 1.16)

2.82
a) 11.8
b) The distribution for Central and South America is skewed right because the spread of values between the median and Q3 is greater than the spread between Q1 and the median, as shown in the box plot. In addition, the whisker extending from Q3 to the maximum value (excluding potential outliers) is longer than the whisker from Q1 to the minimum value.
c) Per capita carbon emissions are much larger for Europe than for Central/South America. In fact, the first quartile for Europe is higher than the third quartile for Central/South America. In addition, the distribution for Central/South America is skewed right, while the distribution for Europe is slightly skewed left.

2.122
a) Range: 11.9
Interquartile Range (IQR): 5.8
b) Because the minimum value of 79.9 falls within 8.7 units (1.5 × IQR) of the first quartile of 84.0, and the maximum value of 91.8 falls within 8.7 units of the third quartile of 89.8, a box plot would not show potential outliers for the given distribution. In short, all values fall within the whisker limits determined by the calculation 1.5 × IQR.
c) The z-score for an observation is the number of standard deviations that it falls from the mean. With a standard deviation of 3.4, three standard deviations on either side of the mean span 76.7 to 97.1. The minimum value is 79.9 and the maximum value is 91.8, so no observation has a z-score greater than 3 in absolute value.
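The outlier and z-score checks in 2.122 can be reproduced with a few lines of R. This is only a sketch built from the summary values quoted in the answer; the mean of 86.9 is not stated directly and is assumed here as the midpoint of the quoted interval from 76.7 to 97.1.

```r
# Summary values quoted in the 2.122 answer
q1      <- 84.0
q3      <- 89.8
min_val <- 79.9
max_val <- 91.8
s       <- 3.4
mean_val <- 86.9  # assumption: midpoint of the quoted 3-SD interval (76.7, 97.1)

# 1.5 x IQR rule for potential outliers
iqr         <- q3 - q1
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr
has_outliers <- min_val < lower_fence || max_val > upper_fence

# z-scores of the two most extreme observations
z_min <- (min_val - mean_val) / s
z_max <- (max_val - mean_val) / s

print(c(lower_fence = lower_fence, upper_fence = upper_fence))
print(has_outliers)
print(round(c(z_min = z_min, z_max = z_max), 2))
```

Both extremes sit inside the fences and have z-scores smaller than 3 in absolute value, which matches the conclusions in parts b) and c).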


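Linear Transformations

a1) Verify $\bar{y} = a + b\bar{x}$ after applying the linear transformation $y_i = a + b x_i$ to observations $x_1, x_2, \ldots, x_n$.

\[
\begin{aligned}
\bar{y} &= \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\sum_{i=1}^{n} (a + b x_i) \\
\bar{y} &= \frac{1}{n}\sum_{i=1}^{n} a + \frac{1}{n}\sum_{i=1}^{n} b x_i \\
\bar{y} &= a + b\,\frac{1}{n}\sum_{i=1}^{n} x_i \\
\bar{y} &= a + b\bar{x}
\end{aligned}
\]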

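a2) Verify $s_y = b s_x$ after applying the linear transformation $y_i = a + b x_i$ to observations $x_1, x_2, \ldots, x_n$.

\[
\begin{aligned}
s_y &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2} \\
s_y^2 &= \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 \\
s_y^2 &= \frac{1}{n-1}\sum_{i=1}^{n} \bigl(a + b x_i - (a + b\bar{x})\bigr)^2 \\
s_y^2 &= \frac{1}{n-1}\sum_{i=1}^{n} (b x_i - b\bar{x})^2 \\
s_y^2 &= b^2\,\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \\
s_y^2 &= b^2 s_x^2 \\
s_y &= \sqrt{s_y^2} = |b|\, s_x
\end{aligned}
\]

For $b \ge 0$ this is $s_y = b s_x$.

b) Conversion from degrees Fahrenheit to degrees Celsius: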

\[
\bar{y} = a + b\bar{x} = -\frac{160}{9} + \frac{5}{9}(98.9) = 37.17^{\circ}\mathrm{C}
\]

\[
s_y = b s_x = \frac{5}{9}(1.3) = 0.72^{\circ}\mathrm{C}
\]
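The two identities can also be checked numerically. The sketch below uses a small made-up vector of Fahrenheit readings (`temps_f` is hypothetical, not the homework data) and then applies the same constants to the summary statistics quoted above.

```r
# Hypothetical Fahrenheit readings, for illustration only (not the homework data)
temps_f <- c(97.2, 98.6, 99.1, 100.4, 98.9, 97.8)

# Fahrenheit to Celsius as a linear transformation y = a + b * x
a <- -160 / 9
b <- 5 / 9
temps_c <- a + b * temps_f

# The mean transforms as a + b * mean(x); the sd is scaled by |b| only
mean_ok <- isTRUE(all.equal(mean(temps_c), a + b * mean(temps_f)))
sd_ok   <- isTRUE(all.equal(sd(temps_c), abs(b) * sd(temps_f)))
print(c(mean_ok = mean_ok, sd_ok = sd_ok))

# Applied to the summary statistics quoted in part b)
mean_c <- a + b * 98.9  # about 37.17
sd_c   <- b * 1.3       # about 0.72
print(round(c(mean_c = mean_c, sd_c = sd_c), 2))
```

The shift `a` moves the mean but drops out of the standard deviation, which is why only `b` appears in the second identity.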

Chapter 3 Problems

3.3
a) Response Variable: Happiness
Explanatory Variable: Income
b) Given the conditional proportions of happiness at each level of income (see R code and table below), the percentage of people with above-average incomes who identified as “Very Happy” is 16.13 percentage points higher than the percentage among people with below-average incomes. In short, people with higher incomes are more likely to be happy. Conversely, the percentage of people with below-average incomes who identified as “Not Too Happy” is 17.84 percentage points higher than the percentage among people with above-average incomes. In short, people with lower incomes are more likely to be unhappy.


# Counts by income level (rows) and reported happiness (columns)
Table <- matrix(c(21, 213, 126, 96, 506, 248, 143, 347, 114), nrow = 3, ncol = 3, byrow = TRUE)
rownames(Table) <- c("Above Average", "Average", "Below Average")
colnames(Table) <- c("Not Too Happy", "Pretty Happy", "Very Happy")
# Row-wise (conditional on income) proportions
prop.table(Table, 1)

##               Not Too Happy Pretty Happy Very Happy
## Above Average    0.05833333    0.5916667  0.3500000
## Average          0.11294118    0.5952941  0.2917647
## Below Average    0.23675497    0.5745033  0.1887417

c) Dividing the number of people who reported being “Very Happy” by the total number of people in the study gives the marginal proportion of people who identified as “Very Happy.” Dividing 488 by 1814, we find the marginal proportion to be 0.269, or 26.9%. We can also use R:

# Same counts as above, with row and column totals added
Table <- matrix(c(21, 213, 126, 96, 506, 248, 143, 347, 114), nrow = 3, ncol = 3, byrow = TRUE)
rownames(Table) <- c("Above Average", "Average", "Below Average")
colnames(Table) <- c("Not Too Happy", "Pretty Happy", "Very Happy")
addmargins(Table)

##               Not Too Happy Pretty Happy Very Happy  Sum
## Above Average            21          213        126  360
## Average                  96          506        248  850
## Below Average           143          347        114  604
## Sum                     260         1066        488 1814
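The percentage-point differences in part b) and the marginal proportion in part c) can be computed directly instead of being read off the printed tables. A minimal sketch, using the same counts; the object names (`happy`, `cond`, `diff_very`, `diff_not`, `marg_very`) are introduced here for illustration.

```r
# Same counts as in the tables above
happy <- matrix(c(21, 213, 126, 96, 506, 248, 143, 347, 114),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("Above Average", "Average", "Below Average"),
                                c("Not Too Happy", "Pretty Happy", "Very Happy")))

# Conditional proportions of happiness within each income level
cond <- prop.table(happy, 1)

# Differences in percentage points between the extreme income groups
diff_very <- 100 * (cond["Above Average", "Very Happy"] -
                    cond["Below Average", "Very Happy"])
diff_not  <- 100 * (cond["Below Average", "Not Too Happy"] -
                    cond["Above Average", "Not Too Happy"])

# Marginal proportion of "Very Happy" across all respondents
marg_very <- sum(happy[, "Very Happy"]) / sum(happy)

print(round(c(diff_very = diff_very, diff_not = diff_not), 2))  # 16.13 17.84
print(round(marg_very, 3))                                      # 0.269
```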

3.3
a) Response Variable: Binge Drinker Status
Explanatory Variable: Gender
b) Male and Binge-Drinker: 1,908
Female and Binge-Drinker: 2,854
c) No, you cannot, because these two cells alone do not tell you what percentage of surveyed students were male and what percentage were female. In short, these cells are counts, not proportions.
d) Conditional Proportions of Binge-Drinking Status Based on Gender:

# Counts by gender (rows) and binge-drinking status (columns)
Table <- matrix(c(1908, 2017, 2854, 4125), nrow = 2, ncol = 2, byrow = TRUE)
rownames(Table) <- c("Male", "Female")
colnames(Table) <- c("Binge Drinker", "Non-Binge Drinker")
# Row-wise (conditional on gender) proportions
prop.table(Table, 1)

##        Binge Drinker Non-Binge Drinker
## Male       0.4861146         0.5138854
## Female     0.4089411         0.5910589

e) The conditional proportions above tell us that the proportion of binge drinkers is 7.72 percentage points higher among males than among females. We cannot yet say whether this difference is statistically significant, but it nonetheless appears that males are more likely to be binge drinkers.
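The contrast between parts c) and e) can be made concrete in R: the raw counts favor females, while the conditional proportions favor males. A short sketch with the same counts; `binge`, `cond`, `diff_pp`, and `female_share` are names introduced here for illustration.

```r
# Same counts as in part d)
binge <- matrix(c(1908, 2017, 2854, 4125), nrow = 2, byrow = TRUE,
                dimnames = list(c("Male", "Female"),
                                c("Binge Drinker", "Non-Binge Drinker")))

# More female than male binge drinkers in raw counts...
more_female_count <- binge["Female", "Binge Drinker"] > binge["Male", "Binge Drinker"]

# ...but a higher binge-drinking rate among males once we condition on gender
cond <- prop.table(binge, 1)
diff_pp <- 100 * (cond["Male", "Binge Drinker"] - cond["Female", "Binge Drinker"])

# Share of all binge drinkers who are female (column proportion), shown for contrast
female_share <- prop.table(binge, 2)["Female", "Binge Drinker"]

print(more_female_count)
print(round(diff_pp, 2))  # 7.72
print(round(female_share, 3))
```

The counts differ mainly because more females than males were surveyed (6,979 versus 3,925), which is why part c) cannot be answered from the two cells alone.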
