Other Normal Distributions

Chapter 11 Lesson 11-3

Vocabulary standardizing a variable

BIG IDEA A z-score transformation of raw data will transform any normal distribution into the standard normal distribution.

Transforming Any Normal Curve into the Standard Normal Curve

Mental Math Calculate the z-score of a score of 83 on an exam on which the mean is 80 and the standard deviation of scores is 4.

Recall that if each number in a data set is translated by a constant, the mean of the data set is also translated by that constant, but the standard deviation remains unchanged. For instance, consider the fact that the heights of adult men in the Netherlands are approximately normally distributed with mean μ = 183 cm and standard deviation σ = 6.7 cm. If μ = 183 cm is subtracted from each height, the mean of the resulting data set is 183 cm - 183 cm = 0 cm. There is no change in the standard deviation.

QY What heights are within 1 standard deviation of the mean height of men in the Netherlands?

If each translated height is divided by 6.7 cm, the standard deviation of the resulting data set is 1. Thus, the transformation $x \to \frac{x - 183}{6.7}$ maps the data set of heights with μ = 183 cm and σ = 6.7 cm, whose distribution is at the right in the graph below, onto a data set with mean μ = 0 and standard deviation σ = 1, whose distribution is at the left in the graph below.
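The two steps described here, translating by the mean and then dividing by the standard deviation, can be checked numerically. The sketch below is only an illustration: the list `heights` is a small made-up data set (not data from this lesson), and it uses the data set's own mean and population standard deviation from Python's standard `statistics` module.

```python
from statistics import mean, pstdev

# Hypothetical heights in cm (made up for illustration; not data from the lesson).
heights = [171.0, 176.5, 180.0, 183.0, 185.5, 189.0, 196.0]

mu = mean(heights)       # mean of the raw data
sigma = pstdev(heights)  # population standard deviation of the raw data

# Step 1: translate every value by -mu. The mean becomes 0; the spread is unchanged.
translated = [x - mu for x in heights]

# Step 2: divide every translated value by sigma. The standard deviation becomes 1.
z_scores = [t / sigma for t in translated]

print("translated:", round(mean(translated), 6), round(pstdev(translated), 6))
print("z-scores:  ", round(mean(z_scores), 6), round(pstdev(z_scores), 6))
```

The first printed pair should show a mean of 0 with the original spread, and the second a mean of 0 with a standard deviation of 1.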

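[Figure: Relative frequency curves. Left: the standard normal curve $y = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$ over Standardized Height $z = \frac{x - 183}{6.7}$, from -3 to 3. Right: the curve $y = \frac{1}{6.7\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - 183}{6.7}\right)^2}$ over Height (cm), with ticks at 162.9, 169.6, 176.3, 183, 189.7, 196.4, and 203.1.]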

But notice that this transformation maps x onto its z-score! The Graph-Standardization Theorem shows that the distribution of z-scores, shown above, is the standard normal curve. The same argument could be used with any mean and any standard deviation, leading to the following theorem.

698

Normal Distributions


Standardization Theorem If a variable x has a normal distribution with mean μ and standard deviation σ, then the variable $z = \frac{x - \mu}{\sigma}$ has the standard normal distribution.

The process of getting z-values from an original data set by applying the transformation $x \to \frac{x - \mu}{\sigma}$ is often referred to as standardizing the variable. By standardizing the domain of normal distributions, you can determine probabilities by using the procedures in Lesson 11-2.

Caution: the Standardization Theorem applies only to sets of raw data that are normally distributed. The z-score transformation of data preserves the shape of the raw-score distribution. If the original distribution is not normal, then the z-score transformation will not provide the standard normal distribution, even though the z-score mean is 0 and standard deviation is 1.
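The caution can be illustrated with a small experiment. The sketch below is only an illustration under stated assumptions: `data` is a made-up, right-skewed data set (not from this lesson), standardized with its own mean and population standard deviation. It compares the share of z-scores below 0 with the value 0.5 that the standard normal distribution would give.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical right-skewed data set (made up for illustration).
data = [1] * 50 + [2] * 25 + [4] * 15 + [10] * 10

mu, sigma = mean(data), pstdev(data)
z_scores = [(x - mu) / sigma for x in data]

# The z-scores always have mean 0 and standard deviation 1 ...
print(round(mean(z_scores), 6), round(pstdev(z_scores), 6))

# ... but the shape is unchanged: 75% of these z-scores are negative,
# whereas the standard normal distribution puts 50% below 0.
share_below_zero = sum(z < 0 for z in z_scores) / len(z_scores)
print(share_below_zero, NormalDist().cdf(0))
```

Because standardizing only shifts and rescales the data, the skew survives, so probabilities read from the standard normal table would not describe this data set.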

Standardizing Variables to Find Probabilities

Many measurements of humans provide normal distributions in appropriate populations. When adults in a country are separated by gender and ethnicity, men’s heights and women’s heights each form distributions that are modeled by a normal distribution.

Example 1 Use the information given on page 698. An adult Dutch male is selected randomly. What is the probability that the man is less than 188 cm tall?

Solution Because the set of heights x of adult Dutch males has an approximately normal distribution with a mean of 183 cm and standard deviation of 6.7 cm, the variable $z = \frac{x - 183}{6.7}$ has a standard normal distribution. The z-score for x = 188 is $z = \frac{188 - 183}{6.7} \approx 0.75$. This indicates that a height of 188 cm is about 0.75 standard deviation above the mean of 183 cm. Note that the two curves below are centered around their mean values, 183 and 0, respectively. The two shaded areas in the graphs are equal. Because each area represents a probability, the two probabilities are equal.

Dutch men, seen here in a cattle market in the Netherlands, tend to be taller than men in most other countries.


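[Figure: Relative frequency curves with equal shaded areas. Left: the standard normal curve $y = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$ over Standardized Height z from -3 to 3, shaded to the left of z = 0.75. Right: the curve $y = \frac{1}{6.7\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - 183}{6.7}\right)^2}$ over Height (cm), with ticks at 162.9, 169.6, 176.3, 183, 189.7, 196.4, and 203.1, shaded to the left of x = 188.]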

From the table of values for the standard normal distribution, P(x < 188) ≈ P(z < 0.75) ≈ 0.7734. The probability of a randomly selected adult Dutch male being less than 188 cm tall is about 0.77. In other words, just over $\frac{3}{4}$ of all adult males in the Netherlands are less than 188 cm tall.

Check Use a calculator to compute standard normal distribution probabilities. The first line at the right shows the answer for transformed data. The second line shows the command applied to raw data.
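The same check can be sketched in software. The example below assumes Python's standard `statistics.NormalDist` class as a stand-in for the calculator (the lesson itself does not name a tool); the variable names are illustrative. The first value uses the transformed data (the rounded z-score 0.75), the second uses the raw data directly.

```python
from statistics import NormalDist

# Standard normal distribution: P(z < 0.75), using the rounded z-score from the example.
p_from_z = NormalDist().cdf(0.75)

# Raw data: heights with mean 183 cm and standard deviation 6.7 cm, P(x < 188).
p_from_x = NormalDist(mu=183, sigma=6.7).cdf(188)

print(round(p_from_z, 4))  # matches the table value, about 0.7734
print(round(p_from_x, 4))  # slightly smaller, because z is not rounded up to 0.75
```

The two results differ only in the third decimal place, which reflects the rounding of the z-score rather than any difference between the two methods.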

In Example 2, we want to find the probability that the amount of popcorn in filled popcorn packets lies in a particular interval.

GUIDED

Example 2 A machine fills bags with popcorn. If the bags are underfilled, there will be complaints from consumers. If the bags are overfilled, they cannot be closed and sealed by the next machine in the production line. The bag states that there are 7 ounces in a bag. The company has chosen to fill each bag with 8 ounces of popcorn on average. Assume the quantity of popcorn in each bag is normally distributed with a standard deviation σ = 0.6 ounces.
a. What percent of the bags are likely to be under the stated weight of 7 ounces?
b. If a manager purchases empty bags that can hold 10 ounces fully filled, then what proportion would be filled over capacity?
c. What proportion of bags would contain between 7 and 10 ounces of popcorn?
d. If a manager decides that at most 1% of bags should be overfilled, then what should be the capacity of the bag used in the packaging process?


One of the oldest American foods Archeologists have found 4,000-year-old ears of popcorn in New Mexico.


Solution 1
a. Sketch a diagram like the one below. Mark the value for 7 ounces on the right graph and shade to the left.

[Figure: Relative frequency curves, vertical axis marked from 0.2 to 0.8. Left: Standardized Weight z from -3 to 3. Right: Weight (ounces), with ticks at 6.2, 6.8, 7.4, 8, 8.6, and 9.2, and with 7.0 and 10 marked.]

To find the z-score, calculate $\frac{7 - ?}{?}$. So z ≈ ? . Mark this value on the left graph and shade to the left. Consult the Standard Normal Distribution Table to find the probability that a bag is underweight. P(x < 7) = P(z < ? ) ≈ ?
b. P(x > 10) = P(z > ? ) ≈ 1 - ? ≈ ?
c. In Part a you found the lower tail and in Part b you found the upper tail. So in this part, P(7 < x < 10) = 1 - P(x < ? ) - P(x > ? ) ≈ 1 - ? - ? ≈ ? .
d. If there is 1% in the upper tail, then there is ? % below the tail. Look for this percent in the Standard Normal Distribution Table to find that the corresponding z-score is ? . Since $z = \frac{x - 8}{0.6}$, $? = \frac{x - 8}{0.6}$, from which x = ? . So the manager can use ? -ounce bags and have at most 1% be overfilled.

Solution 2 Use a statistics utility to find the various probabilities. Note that the solution to Part d uses a calculator command whose arguments are p, μ, and σ, respectively.
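One possible way to carry out Solution 2 is sketched below. It assumes Python's standard `statistics.NormalDist` class as the statistics utility, which is an assumption and not the tool pictured in the lesson. Here μ and σ are supplied when the distribution is created, so the inverse command `inv_cdf` takes only p as its argument.

```python
from statistics import NormalDist

# Fill weights: mean 8 ounces, standard deviation 0.6 ounces (from Example 2).
fill = NormalDist(mu=8, sigma=0.6)

p_under = fill.cdf(7)             # Part a: P(x < 7)
p_over = 1 - fill.cdf(10)         # Part b: P(x > 10)
p_between = 1 - p_under - p_over  # Part c: P(7 < x < 10)
capacity = fill.inv_cdf(0.99)     # Part d: weight with 99% of bags below it

for label, value in [("a", p_under), ("b", p_over), ("c", p_between), ("d", capacity)]:
    print(label, round(value, 4))
```

The printed values can be compared with the answers obtained from the table in Solution 1. Small differences are expected because the table requires rounding each z-score to the nearest hundredth.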

Approximating a Binomial Distribution with a Normal Distribution

In the binomial probability distribution B with n = 100 and p = 0.2, the random variable has mean μ = 100(0.2) = 20 and standard deviation $\sigma = \sqrt{npq} = \sqrt{100(0.2)(0.8)} = \sqrt{16} = 4$. Shown below is the graph for this binomial distribution. The curve for the normal distribution with μ = 20 and σ = 4 is superimposed on the graph. For these values of n and p, the normal distribution appears to approximate the binomial distribution quite well.

[Figure: The binomial distribution $B(x) = \binom{100}{x}(0.2)^x(0.8)^{100 - x}$ with the normal curve $f(x) = \frac{1}{4\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - 20}{4}\right)^2}$ superimposed, for x from 10 to 40; the vertical axis is marked at 0.1.]
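The closeness of the two graphs can also be checked at individual x-values. The sketch below is an illustration only: the helper `binomial_pmf` and the chosen x-values are not from the lesson, and it relies on `math.comb` and `statistics.NormalDist` from Python's standard library.

```python
from math import comb
from statistics import NormalDist

n, p = 100, 0.2
normal = NormalDist(mu=n * p, sigma=(n * p * (1 - p)) ** 0.5)  # mu = 20, sigma = 4

def binomial_pmf(x: int) -> float:
    """B(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Compare the binomial probability with the normal curve's height at a few x-values.
for x in (10, 15, 20, 25, 30):
    print(x, round(binomial_pmf(x), 4), round(normal.pdf(x), 4))
```

Each row lists an x-value, the binomial probability B(x), and the height f(x) of the normal curve, so the two columns can be compared directly.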


In general, if n is quite large, or when p is close to 0.5, a normal curve approximates a binomial distribution quite well. But, if n is small and p is near 0 or 1, then the binomial distribution is not well approximated by a normal distribution. A rule-of-thumb is that a binomial distribution can be approximated by a normal distribution with mean np and standard deviation $\sqrt{npq}$ provided np and nq are each at least 10. The graphs of binomial distributions in Chapter 10 provide evidence for this approximation.

We return to the situation of testing whether or not a coin is fair. Notice how the use of the normal approximation to the binomial distribution allows us to deal with the large numbers of trials that are more realistic in real tests.
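The rule of thumb stated above (np and nq each at least 10) can be written as a short check. This is a minimal sketch: the function name `normal_approximation_ok` and its `threshold` parameter are hypothetical names chosen for illustration.

```python
def normal_approximation_ok(n: int, p: float, threshold: float = 10) -> bool:
    """Rule of thumb from the text: both np and nq must be at least `threshold`."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold

print(normal_approximation_ok(100, 0.2))      # np = 20, nq = 80
print(normal_approximation_ok(100_000, 0.5))  # np = nq = 50,000
print(normal_approximation_ok(20, 0.05))      # np = 1, too small
```

The first two cases satisfy the rule, while the third does not because np is far below 10.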

Example 3 A coin is tossed 100,000 times and 50,482 tails result. Using the 0.01 significance level, test the hypothesis that the coin is fair.

Solution Write the null hypothesis first. $H_0$: The coin is a fair coin.

The number of tails is binomially distributed. If the coin is fair, the probability of a tail is p = 0.5. In this experiment, n = 100,000. The experiment has both np and nq greater than 10, so a normal distribution can approximate the binomial. The binomial random variable has mean np = 50,000 and standard deviation $\sqrt{npq} = \sqrt{100{,}000 \cdot 0.5 \cdot 0.5} = \sqrt{25{,}000} \approx 158.11$. Change to z-scores to model the binomial distribution with a standard normal distribution. The z-score equivalent of 50,482 is $\frac{50{,}482 - 50{,}000}{158.11}$, which is about 3.049. That is, 50,482 is about 3.049 standard deviations above the mean. Find P(z ≥ 3.049). To use the Standard Normal Distribution Table, round the z-score to the nearest hundredth. The table indicates P(z < 3.05) ≈ 0.9989. So

P(z ≥ 3.05) ≈ 1 - 0.9989 = 0.0011.

If the coin is fair, then the probability of tossing 50,482 or more tails in 100,000 tosses is about 0.0011. Additionally, the probability of 50,482 or more heads is also about 0.0011 and is included as in Lesson 10-8. The total probability is 0.0022, which is less than the significance level 0.01, so reject the null hypothesis $H_0$. The coin appears to be biased towards tails.

Notice how much simpler the calculation is for the normal approximation than is the calculation for the exact answer using the binomial distribution. The exact answer would require calculating a sum of tens of thousands of terms, one for each possible count from 50,482 through 100,000!
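The normal-approximation test in Example 3 can be sketched in a few lines. The code below is an illustration under the example's assumptions (a two-sided test at the 0.01 level); the variable names are hypothetical, and it uses `statistics.NormalDist` in place of the table. Because the z-score is not rounded to 3.05, the probability comes out slightly larger than the table-based 0.0022.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100_000, 0.5      # number of tosses; probability of tails under H0
observed_tails = 50_482
alpha = 0.01             # significance level

mu = n * p                     # mean of the binomial: 50,000
sigma = sqrt(n * p * (1 - p))  # standard deviation: about 158.11
z = (observed_tails - mu) / sigma

# Two-sided test: an equally extreme count of heads also counts against H0.
p_value = 2 * (1 - NormalDist().cdf(z))

print(round(z, 3), round(p_value, 4), p_value < alpha)
```

The final printed value indicates whether the null hypothesis is rejected at the chosen significance level.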


Questions

COVERING THE IDEAS

1. Multiple Choice Each of the data sets graphed below can be converted to z-scores. Which has a z-score distribution that can be approximated by the standard normal distribution?

[Figure: four histograms labeled A, B, C, and D, each with a horizontal axis marked in tens starting at 10.]

In 2–6, use this information. In Serbia/Montenegro, adult male heights have $\bar{x}$ = 186 cm and s = 6.0 cm. In Vietnam, $\bar{x}$ = 163 cm and s = 6.7 cm. Assume heights in each country are normally distributed.

2. If the transformation $x \to \frac{x - 186}{6.0}$ is applied to each data point for Serbia/Montenegro, what type of distribution results?

3. What proportion of men are less than 180 cm tall in a. Serbia/Montenegro? b. Vietnam?

4. Out of 200 male employees of a company in Serbia/Montenegro, how many can the coach of the company basketball team expect to be taller than 198 cm?

5. One airline’s flight attendants must be 152.4 to 190.5 cm tall. What percent of Vietnamese men are likely to be an appropriate height?

6. Give an equation for the normal distribution of adult male heights in each country.

Reading the news in Vietnam

7. The advertisement on a package of peanuts says: “Money back guaranteed if you have an underweight packet.” Suppose the amount of peanuts is normally distributed with a mean of 8 ounces and standard deviation of 0.2 ounces. What amount should be printed on the package so that only 1% of packets are under the advertised weight?


8. Suppose that the length ℓ of a speech is normally distributed with mean 3 minutes and standard deviation of $\frac{1}{2}$ minute. a. Identify a transformation that can map each ℓ to a z-score. b. What are the mean and standard deviation of the z-scores?

9. A coin is tossed 100,000 times and 50,297 heads result. At the 0.05 level, test the hypothesis that the coin is fair.

10. Use the np ≥ 10 and nq ≥ 10 criteria to decide whether the binomial distribution with the given number of trials and probability of success can be modeled by a normal distribution. a. n = 100, p = 0.89 b. Example 3 in Lesson 10-8

APPLYING THE MATHEMATICS

11. Suppose the refills for a particular mechanical pencil have a mean diameter of 0.5 mm. Refills below 0.485 mm in diameter do not stay in the pencil, while those above 0.520 mm do not fit in the pencil at all. If a firm makes refills with diameters that are normally distributed with a mean of 0.5 mm and a standard deviation of 0.007 mm, what is the probability that a randomly chosen refill will a. be too large? b. be too small? c. fit correctly?

12. A theme park’s rides are rated as mild, moderate, and max. They have restrictions requiring that passengers have heights of at least 42 inches, 48 inches, and 54 inches, respectively. Suppose the population of children attending the park has a mean height of 53 inches with a standard deviation of 4 inches. a. What percent of children can go on all rides? b. What percent of children can participate in only mild and moderate rides? c. If a child is chosen randomly, what is the probability that that child can only go on mild rides? d. What percent of children are excluded from all rides?

Pencil It In The first mechanical pencil mechanism was patented in Britain in 1822 by Sampson Mordan and John Isaac Hawkins.

13. At the 0.01 level, test the hypothesis that 85% of New Yorkers believe traffic congestion is a problem (p = 0.85). Use a poll of 1162 people conducted by Quinnipiac University, which found that 1057 New Yorkers believed that traffic congestion is a problem. State any conditions necessary to compute your conclusion.

In 14 and 15, assume that the time for a seed to germinate is normally distributed with a mean of 14 days and a standard deviation of 2 days.

14. What percent of seeds germinate within 17 days?

15. What is the probability that a randomly chosen seed will not have germinated within 21 days?

16. A coin is tossed 500 times. Suppose the coin is fair. a. Find the probability that there are between 243 and 257 heads. b. Find the probability that the number of heads differs from 250 by more than 25.


17. The makers of the SAT reported that among the 1,465,744 SAT test takers in 2006, 454,705 actually scored from 500 to 590 on the mathematics section. Suppose that the makers of the SAT did not publicize this data but we knew the data to be normally distributed. Test takers of the SAT know that scores are rounded to the nearest ten points (meaning a score such as 518 is not possible). a. In 2006, the scaled scores on the SAT mathematics section were reported to have mean 518 and standard deviation 115. From this information, estimate the probability that a randomly selected student in 2006 had an SAT mathematics score from 495 to 594. (We use these numbers because we are making an estimate using a continuous model rather than the discrete values actually used by the SAT). b. Calculate the residual between your estimate and the actual relative frequency of students who scored between 500 and 590.

REVIEW

In 18 and 19, suppose the variable z has a standard normal distribution. (Lesson 11-2)

18. Find the area of the shaded region at the right.

[Figure: graph of f(z) against z, with z = 1.65 marked on the horizontal axis.]

19. Determine P(z ≥ 2.67).

20. Fill in the Blank In a standard normal distribution, 95% of the observations are within ? standard deviations of the mean. (Lesson 11-2)

21. How many combinations of four faculty and two students can be formed from a group of 20 faculty and 300 students? (Lesson 10-1)

In 22 and 23, suppose that a system on a spacecraft is composed of three independent subsystems, X, Y, and Z. The probability that these will fail during a mission is 0.002, 0.006, and 0.003, respectively. (Lessons 6-3, 6-2, 6-1)

22. If the three are connected in series, as shown at the right, a failure in any one of the three will lead to a failure in the whole system. What is the probability that the system will be reliable, that is, that it will not fail?

[Diagram: subsystems X, Y, and Z connected in series.]

23. Suppose the subsystems are connected as shown at the right. In this case, both X and either Y or Z must be reliable for the whole system to be reliable. What is the probability that this whole system will not fail?

[Diagram: subsystems X, Y, and Z.]

24. Consider the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. (Lessons 1-6, 1-2) a. Find the mean of the numbers in the set. b. Find the standard deviation of the numbers in the set.

EXPLORATION

25. Some books give the following rule of thumb for finding the standard deviation of a normally distributed variable: $s \approx \frac{\text{range}}{6}$. Explain why this rule of thumb works.

QY ANSWER

176.3 cm to 189.7 cm
