Yes, in a way, but when does a bar chart become a histogram?
• The bars have no spaces between them, as they represent continuous data
• The area of each bar is proportional to the frequency
Why use a histogram?
• Preferable for large data sets, as stem-and-leaf diagrams become too unwieldy
• Used to represent continuous data that has been grouped into class intervals
• But BEWARE…
Things to watch out for…
• When raw data is grouped into class intervals, the exact values of the data are lost.
• Histograms can be manipulated to give a false impression of the shape of the distribution, depending on the class interval size.
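To see how the class interval size can change the apparent shape, here is a minimal sketch that groups the same values with two different class widths. The data values and the helper name `group` are made up for illustration, and the classes are assumed to be of the form k·width ≤ x < (k+1)·width.

```python
from collections import Counter

# Hypothetical raw values, invented purely for this illustration.
data = [1, 2, 2, 3, 7, 8, 8, 9, 9, 12, 13, 17, 18, 18, 19]

def group(values, width):
    """Count values in equal-width classes, keyed by each class's lower end."""
    counts = Counter((x // width) * width for x in values)
    return dict(sorted(counts.items()))

print(group(data, 5))   # {0: 4, 5: 5, 10: 2, 15: 4}
print(group(data, 10))  # {0: 9, 10: 6}
```

With a class width of 5 there is a visible dip in the 10 to 15 class; with a class width of 10 the dip disappears. In both cases only the counts survive, not the original values.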
How to draw a histogram
• Group the raw data into class intervals. A minimum of 5 and a maximum of 10 classes is recommended.
• Draw up a frequency table (also known as a grouped frequency distribution), indicating the class boundaries and class widths.
• Calculate the frequency density for each class: frequency density (the height of the bar) is frequency divided by class width.
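The frequency density calculation can be sketched as follows. The class boundaries and frequencies are hypothetical, chosen only so that the classes have unequal widths, and `frequency_densities` is an illustrative helper name.

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [
    (149.5, 159.5, 4),
    (159.5, 164.5, 9),
    (164.5, 169.5, 12),
    (169.5, 179.5, 6),
]

def frequency_densities(table):
    """Return (class width, frequency density) for each class in the table."""
    result = []
    for lower, upper, freq in table:
        width = upper - lower
        result.append((width, freq / width))
    return result

densities = frequency_densities(classes)
for (lower, upper, freq), (width, density) in zip(classes, densities):
    # The bar's area (width * density) gives back the class frequency.
    print(f"{lower} to {upper}: width {width}, density {density:.2f}")
```

Note that the first and last classes are twice as wide as the middle two, so their bars are drawn at half the height that the raw frequencies alone would suggest.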
What are class boundaries?
• These are the actual endpoints of the classes and depend on the rounding of the raw data. E.g. for height h measured to the nearest cm, with class intervals of 160 – 164 and 165 – 169, the class boundaries will be 159.5 ≤ h < 164.5 and 164.5 ≤ h < 169.5.
• Class width = upper class boundary – lower class boundary. In this case, the class width is 5.
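As a rough sketch of the rule, the boundaries can be found by extending each stated class limit by half the rounding unit. The function name `class_boundaries` and its `precision` parameter are illustrative, and the sketch assumes the data were rounded to the nearest `precision`.

```python
def class_boundaries(lower_limit, upper_limit, precision=1.0):
    """Boundaries for a class whose limits were recorded to the nearest `precision`."""
    half = precision / 2
    return lower_limit - half, upper_limit + half

# Heights measured to the nearest cm, class interval 160 - 164.
lower, upper = class_boundaries(160, 164)
width = upper - lower
print(lower, upper, width)  # 159.5 164.5 5.0
```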
Special cases…
• If the last class interval is open-ended, assume its class width is twice that of the previous class.
• Discrete data can be represented on a histogram. Use class boundaries of –0.5 and 9.5 for a class interval of 0 – 9, even if this seems impossible!
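Both special cases come down to simple arithmetic, sketched here. The open-ended class in the example ("174.5 and over", following a class of width 5) is hypothetical, and the helper names are illustrative only.

```python
def close_open_class(lower_boundary, previous_width):
    """Assumed upper boundary for an open-ended last class: twice the previous width."""
    return lower_boundary + 2 * previous_width

def discrete_boundaries(lowest_value, highest_value):
    """Boundaries for a class of whole-number values, extended by 0.5 each side."""
    return lowest_value - 0.5, highest_value + 0.5

# Hypothetical last class "174.5 and over", where the previous class had width 5.
open_upper = close_open_class(174.5, 5)
print(open_upper)  # 184.5

# Discrete class interval 0 - 9.
print(discrete_boundaries(0, 9))  # (-0.5, 9.5)
```

The discrete class 0 – 9 therefore has a class width of 10, which matches the ten whole-number values it contains.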
What do I do now?
• Note the shape of the distribution – used in hypothesis testing (S2)
• Determine the spread of the data – look for outliers
• Locate the centre of the distribution – by eye
• Look for other features – such as clustering or gaps