Chapter 1
Lesson 1-3
Creating and Using Histograms
BIG IDEA Histograms display data and show how frequently values occur.
Vocabulary: distribution, histogram, bins, frequency histograms, relative frequency histograms, skewed, symmetric, population pyramid
Joint session of Congress on January 8, 2009, to tally electoral votes
Mental Math
a. How many thousands are in 1 billion?
b. How many millions are in 1 trillion?
The number of representatives that each state has in the U.S. Congress depends on the population of the state. Below are the numbers of representatives from each of the 50 states in the 110th Congress (2007–2009).
7, 7, 6, 15, 2, 4, 2, 1, 3, 5, 1, 11, 1, 5, 5, 3, 53, 13, 3, 29, 5, 9, 9, 4, 1, 2, 7, 8, 3, 8, 4, 13, 18, 8, 25, 19, 10, 9, 2, 1, 2, 3, 6, 1, 13, 19, 32, 8, 9, 1
This data set can be turned into a distribution by finding the frequency of each data value. A distribution is a function whose values are the frequencies, relative frequencies, or probabilities of mutually exclusive (non-overlapping) events. By graphing a distribution, you may see features of the data that are hard to see in a table. One important type of graph is a histogram, which is a special type of bar graph. A histogram breaks the range of a numerical variable into non-overlapping intervals of equal width, which are called bins. Frequency histograms display the number of values that fall into each interval. Relative frequency histograms display the percent of values that fall into each interval. Two histograms showing the congressional data are on the next page. The histogram on the left shows frequency; the one on the right shows relative frequency.
Exploring Data
We chose to use bins of width 5 and to make each interval include its left endpoint but not its right. For instance, on each graph the leftmost two intervals are 0 ≤ x < 5 and 5 ≤ x < 10, so a state with exactly 5 representatives is recorded in the second bar from the left.
[Figure: two histograms, each titled "Number of Representatives in Congress". Horizontal axis on both: Number of Representatives from a State (0 to 55). Vertical axis: Frequency (0 to 24) on the left graph, Relative Frequency (0 to 0.48) on the right graph.]
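The tallying behind these two histograms can be sketched in a few lines of Python. This is only an illustration under our own naming (`reps`, `BIN_WIDTH`, `frequency`, and `relative_frequency` are not from the lesson), using the 50 values listed at the start of the lesson. Integer division by the bin width gives the left endpoint of each value's bin, which matches the convention that a state with exactly 5 representatives belongs to 5 ≤ x < 10.

```python
from collections import Counter

# Numbers of representatives for the 50 states (110th Congress), as listed in the lesson.
reps = [7, 7, 6, 15, 2, 4, 2, 1, 3, 5, 1, 11, 1, 5, 5, 3, 53, 13, 3, 29,
        5, 9, 9, 4, 1, 2, 7, 8, 3, 8, 4, 13, 18, 8, 25, 19, 10, 9, 2, 1,
        2, 3, 6, 1, 13, 19, 32, 8, 9, 1]

BIN_WIDTH = 5  # each bin includes its left endpoint but not its right

# Left endpoint of the bin that holds each value, e.g. 5 -> 5 and 19 -> 15.
left_endpoints = [(x // BIN_WIDTH) * BIN_WIDTH for x in reps]

# Frequency: how many values fall in each bin.
frequency = Counter(left_endpoints)

# Relative frequency: the fraction of all values that fall in each bin.
relative_frequency = {left: count / len(reps) for left, count in frequency.items()}

# Print one row per bin, including bins that turn out to be empty.
for left in range(0, 55, BIN_WIDTH):
    count = frequency.get(left, 0)
    print(f"{left:>2} <= x < {left + BIN_WIDTH:<2}  "
          f"frequency {count:>2}  relative frequency {count / len(reps):.2f}")
```

Each printed row corresponds to one bar: the frequency column gives the bar height on the left graph and the relative frequency column gives the bar height on the right graph.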
Example 1
Refer to the histograms on congressional data.
a. About how many states have 15 to 19 representatives?
b. What percent of states have 20 or more representatives?
c. In what bin does the median number of representatives fall among the 50 states?
Solution
a. Use the frequency histogram above on the left. Read the height of the bar spanning 15 ≤ x < 20. Four states have from 15 to 19 representatives.
b. To find the percent of the states rather than the number of states, use the relative frequency graph. Read the bars to the right of x = 20: 0.04 + 0.02 + 0.02 = 0.08. So 8% of the states have 20 or more representatives in Congress.
c. When the 50 values are rank-ordered, the median is between the 25th and 26th values. From the frequency histogram, there are 20 states with fewer than 5 representatives and 17 states in the bin 5 ≤ x < 10. Since 20 < 25 and 20 + 17 = 37 ≥ 26, the 25th and 26th values both lie in that bin, so the median must be in the bin 5 ≤ x < 10.
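The reasoning in parts b and c can be checked with a short Python sketch. It assumes a frequency table tallied from the data at the start of the lesson with bins of width 5; the helper name `bin_of_rank` and the variable names are our own. The idea is to add up bar heights from left to right until the running total reaches the rank you are looking for.

```python
# Frequency of each bin, keyed by the bin's left endpoint (bin width 5).
# These counts are tallied from the 50 values listed at the start of the lesson.
frequency = {0: 20, 5: 17, 10: 5, 15: 4, 20: 0, 25: 2,
             30: 1, 35: 0, 40: 0, 45: 0, 50: 1}


def bin_of_rank(frequency, rank):
    """Return the left endpoint of the bin holding the value with the given 1-based rank."""
    running_total = 0
    for left in sorted(frequency):
        running_total += frequency[left]
        if running_total >= rank:
            return left
    raise ValueError("rank exceeds the number of observations")


n = sum(frequency.values())  # total number of states

# Part b: fraction of states in the bins that start at 20 or more.
share_20_or_more = sum(count for left, count in frequency.items() if left >= 20) / n

# Part c: with an even number of values, the median lies between ranks n/2 and n/2 + 1.
lower_bin = bin_of_rank(frequency, n // 2)
upper_bin = bin_of_rank(frequency, n // 2 + 1)

print(f"Share with 20 or more representatives: {share_20_or_more:.2f}")
print(f"25th value is in the bin starting at {lower_bin}; 26th value is in the bin starting at {upper_bin}")
```

Because both middle ranks land in the same bin, the median must be in that bin. If the two ranks fell in different bins, the histogram alone would only narrow the median to the span covered by those bins.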
Drawing a Histogram
To make a histogram, first organize the data into non-overlapping intervals of equal width. Choosing the bin width is a matter of judgment. There is usually not a single best size. Generally, choosing 5 to 10 intervals is about right for a histogram. Too few bins will lump all the data together; too many will result in only a few numbers in each bin.
Second, count the number of observations per bin and record the results in a table that gives the frequency or relative frequency for each of the bins created. Finally, draw the histogram by first marking the horizontal axis to show the endpoints of the bins and marking the vertical axis to show the frequencies (or relative frequencies). Then, for each interval, draw a bar to represent the frequency. Unlike other bar graphs, histograms are drawn with no gaps between bars unless an interval is empty, in which case its bar has height 0. In the Activity you will make a histogram using technology.
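As a rough illustration of the drawing step, here is a minimal Python sketch that builds a text version of the congressional frequency histogram, with the bars running sideways. The function name `histogram_rows` and the `#` bar symbol are our own choices, and the frequency table is tallied from the data at the start of the lesson using bins of width 5. Empty bins stay in the table with frequency 0, so every interval keeps its place on the axis.

```python
# Frequency table keyed by each bin's left endpoint (bin width 5),
# tallied from the numbers of representatives at the start of the lesson.
frequency = {0: 20, 5: 17, 10: 5, 15: 4, 20: 0, 25: 2,
             30: 1, 35: 0, 40: 0, 45: 0, 50: 1}
BIN_WIDTH = 5


def histogram_rows(frequency, bin_width):
    """Build one text row per bin: the interval, a bar of # marks, and the count."""
    rows = []
    for left in sorted(frequency):
        count = frequency[left]
        interval = f"{left:>2} <= x < {left + bin_width:<2}"
        # A bin with count 0 gets an empty bar but still gets its own row.
        rows.append(f"{interval} | {'#' * count} {count}")
    return rows


rows = histogram_rows(frequency, BIN_WIDTH)
print("\n".join(rows))
```

Adjacent rows share an endpoint (the right end of one interval is the left end of the next), which is the text counterpart of drawing the bars with no gaps between them.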
Activity
Below are the daily high temperatures in March one year in Lincoln, Nebraska.
69, 60, 34, 41, 36, 44, 27, 45, 43, 49, 71, 67, 64, 54, 43, 40, 42, 58, 61, 68, 56, 45, 45, 64, 61, 60, 49, 51, 58, 53, 42
Step 1 Enter the temperatures into a statistical package or spreadsheet.
Step 2 Use the technology to make a histogram, using 5 as the bin width. Label your axes as shown.
Step 3 Change the bin width from 5 to 2. How does this affect the graph?
Step 4 Change the bin width from 2 to 10. How does this affect the graph? Which of the three bin widths (2, 5, or 10) gives the best description of the data?
Step 5 Return the bin width back to 5. Adjust the settings on your technology so that the vertical axis displays relative frequency rather than frequency. In what ways is this graph different from the one in Step 2?
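If no statistical package is at hand, the Activity can be approximated with a short Python sketch. This is one possible setup, not the method the Activity requires: the function name `bin_counts` is our own, and we assume that bins start at multiples of the bin width and include their left endpoints. Your technology may place the bin boundaries differently, which can change the counts.

```python
from collections import Counter

# Daily high temperatures in March in Lincoln, Nebraska, from the Activity.
temps = [69, 60, 34, 41, 36, 44, 27, 45, 43, 49, 71, 67, 64, 54, 43, 40,
         42, 58, 61, 68, 56, 45, 45, 64, 61, 60, 49, 51, 58, 53, 42]


def bin_counts(values, width):
    """Count values per bin [start, start + width), with starts at multiples of width."""
    counts = Counter((v // width) * width for v in values)
    first, last = min(counts), max(counts)
    # Include empty bins between the first and last occupied bins.
    return {start: counts.get(start, 0) for start in range(first, last + width, width)}


# Steps 2 to 4: compare bin widths 2, 5, and 10 as sideways text histograms.
for width in (2, 5, 10):
    table = bin_counts(temps, width)
    print(f"bin width {width}: {len(table)} bins")
    for start, count in table.items():
        # The label "start to start + width - 1" assumes whole-number temperatures.
        print(f"  {start:>2} to {start + width - 1:>2} | {'#' * count}")

# Step 5: relative frequency with bin width 5.
rel_freq = {start: count / len(temps) for start, count in bin_counts(temps, 5).items()}
```

In Step 5 only the vertical scale changes: every relative frequency is the matching frequency divided by the number of temperatures, so the bars keep the same relative heights.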
Analyzing Histograms
Histograms help you to see features of a data set that are hard to capture from a table. The graphs shown here illustrate three different shapes that are common. Histogram (A) at the right shows a skewed distribution. It has a cluster of high temperatures on the right side of the graph. Histogram (B) on the next page also shows a skewed distribution, but it tapers off toward the right end to form a tail. Histogram (C) is close to being symmetric, with two sides that are approximately the same shape. So, to describe the shape of a distribution, consider the following questions:
• Is the distribution skewed?
• Does the distribution have a tail? If so, at what end?
• Is the distribution symmetric?
[Figure: Histogram (A). Horizontal axis: Temperature (20 to 80). Vertical axis: Frequency (0 to 14).]
[Figures: Histograms (B) and (C). One plots Frequency against Number of Representatives (0 to 60); the other plots Frequency against Number of Customers (360 to 450).]
Population Pyramids
Demographers use histograms to analyze populations. One display is a double histogram called a population pyramid. Two examples of population pyramids are shown below.
[Figure: population pyramids "United States: 1980" and "United States: 2020", with Male on the left and Female on the right. Age groups: 5-year bins from 0-4 to 85+. Horizontal axis: Population (in millions), 0 to 16 on each side. Source: U.S. Census Bureau, International Data Base]
A population pyramid is made from two separate histograms that have been rotated 90°. The bin intervals are placed along a central vertical axis and the frequencies are on the horizontal axis. The examples on the previous page show age distributions for men and women. The first is from actual U.S. Census Bureau data. The second is a projection, based on what demographers expect to happen.
GUIDED Example 2
Refer to the population pyramids on the previous page.
a. For 1980, estimate which age group had the greatest population.
b. For 2020, compare the distribution for men with the one for women. In what age groups are they most different?
c. Compare the 1980 and 2020 population pyramids. Describe three significant differences.
Solution
a. For 1980, the longest bars are for people age 20–24. The 20–24 age group had the greatest population in 1980.
b. There appears to be about the same number of men and women in most age brackets. Starting at about age ? , there are clearly more females than males. This difference is very pronounced for the 85+ age group.
c. First, the 1980 population pyramid has a bulge in the 15 to ? age range that is not as pronounced in the 2020 pyramid. Second, more people are projected to live longer in 2020. In 1980 there are ? million people 80 or older, while in 2020 there are ? million over 80 years old. Third, the total population in ? is bigger than the population in ? .
See Quiz Yourself at the right.
QUIZ YOURSELF
According to the 1980 population pyramid on the previous page, for preschool children (age 0–4), are the numbers of boys and girls equal?
Questions
COVERING THE IDEAS
In 1 and 2, the graph at the right shows the lengths of songs that Jessie has stored on an MP3 player.
1. Use two to three sentences to describe the distribution.
2. For this histogram, the bin width is 0.5. About how high would the leftmost bar be if the bin width were 1?
[Figure: histogram "Length of Songs". Horizontal axis: Time in Minutes (0 to 8). Vertical axis: Frequency (0 to 40).]
3. Here is a list of heights (in inches) of 15 students:
73, 66, 63 3/4, 68 3/4, 70, 69, 65, 71, 68, 80 1/2, 64, 67, 65, 64, 64 1/2.
a. Draw a histogram of the data.
b. What bin width did you use? Explain your choice.
In 4–8, use the histogram at the right of the scores of students taking the SAT Literature Subject Test of the College Entrance Examination Board in 2007.
4. About how many students scored from 500 to 549?
5. What is the bin width?
6. Approximately how many students took the Literature exam in 2007?
7. About what percent of the students scored 700 or better?
8. In what interval is the median score?
[Figure: histogram "SAT Subject Test: Literature". Horizontal axis: Score (200 to 800). Vertical axis: Frequency (0 to 20,000). Source: The College Board]
In 9 and 10, refer to the graphs of heights of African American males.
[Figure: two relative frequency histograms, "Heights of African American Males Born 1980-1989" and "Heights of African American Males Born 1920-1929". Horizontal axis on both: Height (cm), 150 to 210. Vertical axis on both: Relative Frequency (5% to 35%). Source: National Center for Health Statistics]
9. Use the histograms to estimate which bin contains the median height for each group of African American males. Which group has the larger median height?
10. a. About what percent of African American males born in 1920–1929 were at least 190 cm (about 6'3") tall?
b. About what percent of African American males born in 1980–1989 were that tall?
APPLYING THE MATHEMATICS
11. At the right are some data regarding accidental deaths in the U.S. in 2004.
a. Create a histogram of the data on drowning.
b. At what ages is there the highest risk of drowning?
c. What is likely to make the histogram in Part a misleading?
Age          Drownings
under 1 yr.  62
1–4          430
5–14         269
15–24        574
25–34        385
35–44        435
Source: National Safety Council
In 12–14, describe a possible shape for the distribution of the variable. Explain your reasoning.
12. the mint dates of U.S. pennies that are in a store’s cash register
13. the salaries of employees in a large company
14. the scores on an easy test
15. Cynthia and Ralph are playing a board game. Cynthia suspects that the die they are using is biased. She decides to test the die by rolling it 100 times and counting the number of times 1, 2, 3, 4, 5, or 6 occurs. A distribution of the outcomes of this experiment is shown at the right.
a. Use two sentences to describe the distribution.
b. About what percent of the time did a 1 show up?
c. Do you think that the die is fair? Justify your answer.
[Figure: histogram "Rolls of a Die". Horizontal axis: Number on Die (1 to 6). Vertical axis: Frequency (0 to 50).]
16. Suppose you are part of a committee to plan projects that will help India’s population in the future. Using the population projections below, order these three developments in terms of importance, and explain why you chose that order.
(1) Build more elementary and high schools.
(2) Stimulate development of factories and other workplaces.
(3) Build more trains and roads.
[Figure: population pyramids "India: 2010" and "India: 2050", with Male on the left and Female on the right. Age groups: 5-year bins from 0-4 to 100+. Horizontal axis: Population (in millions), 0 to 70 on each side. Source: U.S. Census Bureau, International Data Base]
REVIEW
17. Monday through Friday a store averaged $2,100 a day in sales. On Saturday and Sunday the store took in $7,200 total in sales. What is the store’s mean sales per day? (Lesson 1-2)
18. In his Biology class, Mr. Boynton covered six chapters in the second semester. He gave two quizzes and a chapter test for each chapter, and a final at the end of the semester. Connie averaged 91 on the quizzes, 73 on the chapter tests, and 85 on the final.
a. Mr. Boynton weights chapter tests twice as much as quizzes, and the final five times as much as each chapter test. Compute Connie’s weighted average score in Biology.
b. What would Connie have had to score on the final in order to average at least 85 for the semester? (Lesson 1-2)
19. If $\sum_{i=1}^{5} x_i = 15$, $\sum_{i=1}^{6} x_i = 20$, and $\frac{\sum_{i=1}^{7} x_i}{7} = 8$, find $x_6 \cdot x_7$. (Lesson 1-2)
20. Describe a situation in which the students in your school would be a. a sample.
b. a population. (Lesson 1-1)
21. Match the graphs to their equations. Do not use a calculator. (Previous Course)
[Figure: four graphs labeled a, b, c, and d, each drawn on x- and y-axes.]
i. $y = \frac{1}{x}$
ii. $y = x^2$
iii. $y = \sqrt{x}$
iv. $y = \frac{1}{x^2}$
v. $y = \sqrt{x^2}$
EXPLORATION
22. Since 1978, the People’s Republic of China has had a policy that strongly encourages families to limit themselves to one child, or at most two children. In contrast, Russia is concerned about population decline and is encouraging larger families. The population pyramids below show U.S. Census Bureau projections for these countries in 2010. For each country, make a hypothetical population pyramid for the year 2030, assuming that the government policies are successful. What long-term concerns are raised?
[Figure: population pyramids "China: 2010" and "Russia: 2010", with Male on the left and Female on the right. Age groups: 5-year bins from 0-4 to 100+. Horizontal axis: Population (in millions), 0 to 70 on each side for China and 0 to 7 on each side for Russia. Source: U.S. Census Bureau, International Data Base]
QY ANSWERS
no