SCHULICH SCHOOL OF BUSINESS, YORK UNIVERSITY
Final Examination
SESSION: Fall 2006
NAME:
I.D. #:
COURSE NO: OMIS1000 / OMIS2000
COURSE TITLE: Statistics for Management Decisions
PROFESSORS: H. Cohen, O. Kaminer, A. Marshall, D. Nevo, S. Nevo
NUMBER OF PAGES: 11 pages (NOT including this cover page)
LENGTH OF EXAMINATION: 180 minutes (3 hours)
EXAMINATION AIDS ALLOWED: Calculator; formula sheet supplied with text
INSTRUCTIONS:
Please place your I.D. card on your desk.
Count the pages to be certain that there are no pages missing.
Write all and only your final answers in the space provided. We will not mark answers given elsewhere.
You are not allowed to use your own paper for rough work. You may only use the spaces on the exam.
Round all final answers to 4 decimal places.
You may assume that all populations described in this test are normally distributed.
If alpha is not given in the question, use 0.05.
You are not allowed to leave the examination room until one hour after the start of the exam, and you must sign the sign-in sheet before leaving.
Your examination paper must be handed in before you leave.
When you are finished, please leave the exam room quietly.
Cheating on an examination will result in an “F” grade in this course and possible suspension from the University.
Do not remove the staple from the exam.
Do not write in the mark summary table below.
Part I – Problem Recognition        24 marks
Part II – True/False                17 marks
Part III – Output Interpretation    12 marks
Part IV – Problems
    Question 1                      6 marks
    Question 2                      12 marks
    Question 3                      10 marks
    Question 4                      9 marks
    Question 5                      10 marks
    Sub-total                       47 marks
TOTAL                               100 marks
OMIS1000 / OMIS2000
Statistics
Fall 2006 Final Examination
Part I - Problem Recognition (24 marks, 3 marks each)
Instructions: For each of the scenarios below, write the null and alternative hypotheses and indicate the most appropriate statistical test. You are not required to conduct the test, just choose the most suitable test from the following options: z test; t test; χ² test; F test; ANOVA.
Notes:
(1) Be specific in your answers: specify the type of ANOVA required (one-way, randomized block, or two-way), or whether you use the t-test for equal variances or not;
(2) When writing hypotheses for two population tests use meaningful indexing. For example, if the question asks to compare Canada and the US, write µc-µu as opposed to µ1-µ2;
(3) Write all relevant hypotheses for each of the questions.
1. A ski company in Whistler owns two ski shops, one near Whistler and one near Blackcomb. The following data were collected from both stores:

                    Whistler shop    Blackcomb shop
Mean sales          $328             $435
Sample std. dev.    $104             $151
Sample size         35 days          30 days

The company would like to test for a difference in daily average goggle sales between the two stores.
Hypotheses (H0 and HA):
Test:
2. The state lottery office claims that the average household income of those people playing the lottery is greater than $37,000. They also know that the distribution of these households’ income is normal with a standard deviation of $5,756. To test their claim, a sample of 25 households was studied. It was found that the average income in the sample was $36,243.
Hypotheses (H0 and HA):
Test:
3. In random samples of 1000 people in the United States and in France, 70% of the people in the United States and 75% of the people in France indicated that they were positive about the future economy. Does this provide strong evidence that the people in France are more optimistic about the economy?
Hypotheses (H0 and HA):
Test:
4. The distributor of the Post, a regional newspaper serving North York, is considering three types of dispensing machines or racks. Management wants to know if the different machines affect sales. These racks are designated as J-1000, D, and UV-57. Management also wants to know if the placement of the racks either inside or outside supermarkets affects sales. Each of six similar stores was randomly assigned a machine and location combination, and data were collected on the number of papers sold over four days.
Hypotheses (H0 and HA):
Test:
5. A pasta chef was experiencing difficulty in getting brands of pasta to be cooked just right. The main problem she experiences is with the speed of water absorption by the different pasta brands. Pasta with a faster rate of water absorption has a greater tendency to be overcooked. She decides to conduct an experiment in which two brands of pasta, one Canadian and one Italian, were cooked for either 4 or 8 minutes. The variable measured was the speed of water absorption in each case. The results were then recorded and analyzed.
Hypotheses (H0 and HA):
Test:
6. A large milling machine produces steel rods to certain specifications. The machine is considered to be running normally if the standard deviation of the diameter of the rods is 0.15 millimeters. As line supervisor, you need to test to see whether the machine is operating normally. You take a sample of 25 rods and find that the sample standard deviation is 0.19.
Hypotheses (H0 and HA):
Test:
7. Are medical students more motivated than law students? A randomly selected group of each were administered a survey of attitudes toward life, which measures motivation for upward mobility. The scores are summarized below (higher scores mean greater motivation).

                    Sample Size    Mean Score    Pop. Std. Dev.
Medical Students    250            83.5          11.2
Law Students        100            80.2          9.2

Hypotheses (H0 and HA):
Test:
8. In a recent survey, college students were asked the amount of time (in hours) they spend weekly watching television and surfing on the Internet. The researchers were interested in determining whether the time spent on both activities was equal. They collected the following data:

Person #    1    2    3    4    5    6    7    8
Internet    2    7    3    8    9    15   7    2
TV          4    15   5    3    4    4    4    8

Hypotheses (H0 and HA):
Test:
Part II – True/False (17 marks, 1 mark each)
Instructions: Next to each question mark an “X” in the column for ‘True’ or ‘False’.
True    False
1. In a simple regression model, if the regression model is deemed to be statistically significant, it means that the regression slope coefficient is significantly greater than zero.
2. In a hypothesis test, the p-value measures the probability that the alternative hypothesis is true.
3. If a hypothesis test is conducted for a population mean where only non-negative values can be sampled, a null and alternative hypothesis of the form: H0 : μ = 100, Ha : μ ≠ 100, will result in a one-tailed hypothesis test since the statistic can only assume non-negative values.
4. Two variables have a correlation coefficient that is very close to zero. This means that there is probably no relationship between the two variables.
5. All other things held constant, increasing the level of confidence for a confidence interval estimate for the difference between two population means will result in a wider confidence interval estimate.
6. The method used in regression analysis for incorporating a categorical variable (no. of categories = 5) into the model is by organizing the categorical variable into five dummy variables.
7. In a recent one-way ANOVA test, Mean SSW was equal to 1,590 and the Mean SSB was equal to 310. Therefore, SST is equal to 1,900.
8. A local medical center has advertised that the mean wait for services will be less than 15 minutes (but more than 0 minutes). Given this claim, the hypothesis test for the population mean should be a one-tailed test with the rejection region in the lower (left-hand) tail of the sampling distribution.
9. Consider the following regression equation: ŷ = 356 + 18.0x1 – 2.5x2. The x1 variable is a quantitative variable and the x2 variable is a dummy with values 1 and 0. Given this, we can interpret the slope coefficient on variable x2 as follows: holding x1 constant, if the value of x2 is changed from 1 to 0, the average value of y will increase by 2.5 units.
10. The coefficient of determination measures the percentage of variation in the independent variable that is explained by the dependent variables in the model.
11. A perfect correlation between two variables will always produce a correlation coefficient of +1.0.
12. The prediction interval developed from a simple linear regression model will be at its narrowest point when the value of x used to predict y is equal to the mean value of x.
13. When testing a hypothesis about the variability of a population, the statistical requirements call for us to convert the variance to standard deviation and run a chi-square test.
14. When the expected cell frequencies are smaller than 30, the cells should be combined in a meaningful way such that the expected cell frequencies do exceed 30.
15. If it is known that a simple linear regression model explains 56 percent of the variation in the dependent variable and that the slope on the regression equation is negative, then we also know that the correlation between x and y is approximately (0.75).
16. In estimating the difference between two population means, if a 95 percent confidence interval includes zero, then we can conclude that there is a 95 percent chance that the difference between the two population means is zero.
17. In a multiple regression analysis, even if only some of the independent variables have values equal to zero, the regression intercept, b0, can still be meaningful.
Part III – Computer Output Interpretation (12 marks, 2 marks each)
Random samples of two freshmen, two sophomores, two juniors, and two seniors each from four dormitories were asked to rate on a scale from 1 (poor) to 10 (excellent) the quality of the dormitory environment for studying. The results are shown in the table.
                 Year
Dormitory    Freshman    Sophomore    Junior    Senior
D1           7           6            5         7
             5           8            4         4
D2           8           6            9         8
             5           5            7         8
D3           7           6            6         7
             6           8            7         5
D4           9           8            7         6
             9           9            8         7
Given the following ANOVA table, answer questions 16 through 21:

SUMMARY
Freshman    D1    D2    D3    D4    Total
Count       2     2     2     2     8
Sum                     A
Average                 B
Variance                C

ANOVA
Source of Variation    SS          df    MS         F           P-value
Sample                 10.59375    3     3.53125    3.054054    0.058694
Columns                20.34375    3     6.78125    5.864865    0.006706
Interaction            16.03125    9     1.78125    1.540541    0.215963
Within                 18.5        16    1.15625
Total                  65.46875    31

Answers:
16. What is the value of A?
17. What is the value of B?
18. What is the value of C?
19. Can we conclude that the effect of the four dormitories is uniform across all students’ groups?
a. Yes, and therefore we have to test separately for the individual effect of dormitories and student groups.
b. Yes, and therefore we can go ahead and interpret the individual effect of dormitories and student groups from the above table.
c. No, and therefore we have to test separately for the individual effect of dormitories and student groups.
d. No, and therefore we can go ahead and interpret the individual effect of dormitories and student groups from the above table.
20. What is the smallest alpha for which you can reject the null hypothesis for differences between the four student groups?
a. .05
b. .025
c. .01
d. I am unable to reject the null hypothesis for all these values.
21. What is the smallest alpha for which you can reject the null hypothesis for differences in the four dormitories?
a. .05
b. .025
c. .01
d. I am unable to reject the null hypothesis for all these values.
Part IV - Short Answer Questions (47 marks, individually weighted)
Question 1 (6 marks)
Traditionally, a professor likes to assign grades according to the following breakdown: 15% A’s, 25% B’s, 40% C’s, 15% D’s, and 5% F’s. This year, she gave out 17 A’s, 35 B’s, 60 C’s, 10 D’s and 2 F’s. Is there any evidence that the professor has changed her grading scheme? (Use α = 0.05)
Your answer:
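One way to check the arithmetic for Question 1 is sketched below, assuming the intended test is a chi-square goodness-of-fit test of the observed grade counts against the traditional percentages. The variable names are illustrative, and the critical value is hardcoded from a standard chi-square table (α = 0.05, 4 degrees of freedom) rather than computed.

```python
# Goodness-of-fit sketch for Question 1 (illustrative, not an official solution).
observed = {"A": 17, "B": 35, "C": 60, "D": 10, "F": 2}
claimed = {"A": 0.15, "B": 0.25, "C": 0.40, "D": 0.15, "F": 0.05}

n = sum(observed.values())  # total number of grades handed out
expected = {grade: n * share for grade, share in claimed.items()}

# Chi-square statistic: sum over categories of (observed - expected)^2 / expected.
chi_sq = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)
df = len(observed) - 1

# Upper-tail critical value for alpha = 0.05 with 4 df, from a chi-square table.
CRITICAL_05_DF4 = 9.4877

print(f"n = {n}, chi-square = {chi_sq:.4f}, df = {df}")
print("reject H0" if chi_sq > CRITICAL_05_DF4 else "do not reject H0")
```

The smallest expected count here is for the F grade; the usual rule of thumb asks that every expected count be at least 5 before the chi-square approximation is trusted.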
The following refers to questions 2 and 3:
A professor of business statistics teaching a large lecture wanted to study scores on the three exams that are given during the semester. The exams each cover one portion of the semester and are not cumulative. The results for a sample of 33 students were as follows:

Student     1   2   3   4   5   6   7   8   9  10  11
Exam I     89  80  86  68  88  89  82  89  42  61  84
Exam II    80  68  76  77  95  66  83  86  58  54  84
Exam III   74  74  83  71  85  65  88  54  52  62  51

Student    12  13  14  15  16  17  18  19  20  21  22
Exam I     56  67  99  82  75  58  56  55  72  73  79
Exam II    71  55  95  45  71  44  50  14  59  80  68
Exam III   68  48  77  73  64  52  14  25  75  70  84

Student    23  24  25  26  27  28  29  30  31  32  33
Exam I     63  89  62  74  62  70  65  82  91  84  95
Exam II    43  80  23  92  57  51  78  53  90  83  88
Exam III   56  77  48  68  55  61  70  58  96  67  90

The professor has also calculated the following sample statistics:

                      Exam I     Exam II    Exam III
Mean                  74.7576    67.1818    65.3030
Standard deviation    13.7887    20.0007    17.3286
Question 2 (2*6 marks = 12 marks)
(a) At the 0.05 level of significance, is there evidence of a difference in the students’ grades on exam II and exam III?
(b) Students complained that exam II was much more difficult than exam I. The professor decides to test this by looking at the variances within the two exams. A higher variance in grades usually indicates a more difficult exam. At the 0.05 level of significance, are the students correct?
Your answer:
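The two test statistics for Question 2 can be sketched as follows. This is a hedged illustration, not an official solution: it assumes part (a) is treated as a paired comparison (the same 33 students wrote both exams) and part (b) as an F ratio of the two sample variances, with the larger hypothesized variance in the numerator. Critical values are left to the formula sheet.

```python
import math
import statistics

# Exam II and Exam III scores for students 1..33, copied from the data table.
exam2 = [80, 68, 76, 77, 95, 66, 83, 86, 58, 54, 84,
         71, 55, 95, 45, 71, 44, 50, 14, 59, 80, 68,
         43, 80, 23, 92, 57, 51, 78, 53, 90, 83, 88]
exam3 = [74, 74, 83, 71, 85, 65, 88, 54, 52, 62, 51,
         68, 48, 77, 73, 64, 52, 14, 25, 75, 70, 84,
         56, 77, 48, 68, 55, 61, 70, 58, 96, 67, 90]

# (a) Paired t statistic: work with each student's difference (II minus III).
diffs = [a - b for a, b in zip(exam2, exam3)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)          # sample standard deviation of the differences
t_paired = mean_diff / (sd_diff / math.sqrt(n))
print(f"(a) mean difference = {mean_diff:.4f}, t = {t_paired:.4f}, df = {n - 1}")

# (b) F ratio from the summary statistics: Exam II variance over Exam I variance.
s_exam1, s_exam2 = 13.7887, 20.0007
f_ratio = s_exam2 ** 2 / s_exam1 ** 2
print(f"(b) F = {f_ratio:.4f}, df = ({n - 1}, {n - 1})")
```

Working with the differences removes the student-to-student variation, which is why the paired form is usually preferred when the same people are measured twice.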
Question 3 (10 marks)
The professor would like to use ANOVA to test for differences in the averages of all three exams. Construct the appropriate ANOVA table and analyze the data accordingly (you do not have to find p-values). The following values have already been calculated:
SSB = 1653.4141
SST = 30147.3535
MSW = 107.03
Your answer:
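The remaining entries of the ANOVA table follow from the three supplied values by arithmetic. The sketch below assumes a randomized block design in which the 33 students act as blocks (each student wrote all three exams), that SSB is the between-exams sum of squares, and that the supplied MSW is the error mean square of that design. These are assumptions about the intended reading, and the variable names are illustrative.

```python
# Randomized block ANOVA table built from the supplied values (a sketch).
n_students, n_exams = 33, 3
ssb, sst, msw = 1653.4141, 30147.3535, 107.03

# Degrees of freedom for exams (treatments), students (blocks), error, and total.
df_exams = n_exams - 1
df_students = n_students - 1
df_error = df_exams * df_students
df_total = n_students * n_exams - 1

# Error SS comes from the given mean square; block SS is what remains of SST.
ss_error = msw * df_error
ss_students = sst - ssb - ss_error

ms_exams = ssb / df_exams
ms_students = ss_students / df_students
f_exams = ms_exams / msw
f_students = ms_students / msw

print(f"{'Source':<10}{'SS':>14}{'df':>5}{'MS':>12}{'F':>10}")
print(f"{'Exams':<10}{ssb:>14.4f}{df_exams:>5}{ms_exams:>12.4f}{f_exams:>10.4f}")
print(f"{'Students':<10}{ss_students:>14.4f}{df_students:>5}{ms_students:>12.4f}{f_students:>10.4f}")
print(f"{'Error':<10}{ss_error:>14.4f}{df_error:>5}{msw:>12.4f}")
print(f"{'Total':<10}{sst:>14.4f}{df_total:>5}")
```

The F value for exams would then be compared with the table value for (2, 64) degrees of freedom at α = 0.05.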
Question 4 (3*3 marks = 9 marks) A company in Maryland has developed a device that can be attached to car engines that they believe will increase the miles per gallon that cars will get. The owners are interested in estimating the difference between mean mpg for cars using the device versus those that are not using the device. The following data represent the mpg for random samples of cars from each population. With Device
Without Device
22.6
26.9
23.4
24.4
28.4
20.8
29.0
20.8
29.3
20.2
20.0
26.0 28.1 25.6
Circle the correct answer below (you may use any blank space for calculations)
(a) Given this data, what is the critical value if the owners wish to have a 95 percent confidence interval estimate?
a. t = 2.1788
b. t = 1.7823
c. z = 1.96
d. None of the above.
(b) What is the upper limit for a 95 percent confidence interval estimate for the difference in mean mpg?
a. Approximately 3.8 mpg
b. About 5.4 mpg
c. Just under 25.0
d. None of the above.
(c) Which of the following statements is true?
a. Given the sample information, using 95 percent confidence, we can’t conclude that a difference exists in the population mean mpg between vehicles that use the new device versus vehicles that do not use the new device.
b. The sample information produces a 95 percent confidence interval that leads us to believe that a difference does exist between the population mean mpg between vehicles that use the new device versus vehicles that do not use the new device.
c. The sample sizes used are too small to produce a confidence interval estimate that could have any value in reaching a decision about the two engine devices.
d. None of the above.
Question 5 (10 marks)
The Manager of Material Handling was analyzing the factors that influence the time it takes to unload trucks at the warehouse loading dock. She has had a multiple regression model run, with time as the dependent variable, using the following independent variables:
Boxes - Number of boxes
Weight - Total weight of the boxes (in hundreds of kilograms)
Experience - Years of experience of the person unloading the truck
The standard deviation of the unloading times is 15.7665 minutes. The partial Excel™ output is:
Regression Statistics
Multiple R           0.90547204
R Square             0.81987962
Adjusted R Square    0.80813264
Standard Error       6.90615778
Observations         50

ANOVA
              df    SS           MS    F    Significance F
Regression                                  3.77654E-17
Residual
Total               12,180.58

              Coefficients    Standard Error    t Stat      P-value
Intercept     -29.3899869     6.749432996       -4.35444    7.38E-05
Boxes         0.59672985      0.05454831        10.93947    2.17E-14
Weight        0.35986252      0.083109499       4.329981    7.99E-05
Experience    0.25659422      0.142467554       1.801071    0.078249
Required:
a. Fill in the missing values in the ANOVA table above. (4 marks)
b. Write out the equation of the true regression model. (2 marks)
c. Which variables should be retained and which should be dropped (at the 0.05 sig. level)? (2 marks)
Retained:
Dropped:
d. Suppose that the manager believes that the shift (Day, Evening, or Night) influences the unloading time. Write out the new estimated regression equation model. (2 marks)
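For part (a), the missing ANOVA entries can be recovered from the regression statistics alone. The sketch below is illustrative and rests on one assumption: that the 12,180.58 shown in the SS column is the total sum of squares. That reading is consistent with (n − 1) times the squared standard deviation of the unloading times, 49 × 15.7665², and with the reported R Square.

```python
# Filling in the regression ANOVA table for Question 5 (a sketch).
n, k = 50, 3                 # observations and number of independent variables
std_error = 6.90615778       # "Standard Error" from the regression statistics
ss_total = 12180.58          # assumed to be the Total SS (see lead-in)

df_reg, df_res, df_tot = k, n - k - 1, n - 1

# The standard error is the square root of the residual mean square.
ms_res = std_error ** 2
ss_res = ms_res * df_res
ss_reg = ss_total - ss_res
ms_reg = ss_reg / df_reg
f_stat = ms_reg / ms_res
r_square = ss_reg / ss_total  # cross-check against the reported R Square

print(f"{'Source':<12}{'df':>4}{'SS':>12}{'MS':>12}{'F':>10}")
print(f"{'Regression':<12}{df_reg:>4}{ss_reg:>12.2f}{ms_reg:>12.2f}{f_stat:>10.2f}")
print(f"{'Residual':<12}{df_res:>4}{ss_res:>12.2f}{ms_res:>12.2f}")
print(f"{'Total':<12}{df_tot:>4}{ss_total:>12.2f}")
print(f"R Square check: {r_square:.6f}")
```

The R Square check at the end is a quick way to confirm that the filled-in sums of squares agree with the rest of the output.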