Histograms

Aren’t they just fancy bar charts?

Yes, in a way, but when does a bar chart become a histogram?
•  The bars have no spaces between them, as they represent continuous data.
•  The area of each bar is proportional to the frequency.

Why use a histogram?
•  Preferable for large data sets, as stem-and-leaf diagrams become too unwieldy.
•  Used to represent continuous data that has been grouped into class intervals.
•  But BEWARE...

Things to watch out for...
•  When raw data is grouped into class intervals, the exact values of the data are lost.
•  Histograms can be manipulated to give a false impression of the shape of the distribution, depending on the class interval size.
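
To see how the class interval size can change the apparent shape, here is a minimal sketch in Python. The data values and the helper name `group` are invented for illustration; they are not from the original slides. Within each grouping the classes have equal width, so the counts are proportional to the bar heights.

```python
def group(data, start, width, n_classes):
    """Count values in equal-width classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for x in data:
        k = int((x - start) // width)
        if 0 <= k < n_classes:
            counts[k] += 1
    return counts

# Made-up data with two clusters and a sparse middle.
data = [2, 3, 4, 6, 7, 8, 9, 12, 27, 31, 32, 33, 34, 36, 37, 38]

narrow = group(data, start=0, width=10, n_classes=4)  # two peaks are visible
wide = group(data, start=0, width=20, n_classes=2)    # the peaks disappear

print(narrow)  # [7, 1, 1, 7]
print(wide)    # [8, 8]
```

With classes of width 10 the distribution looks clearly two-peaked; with classes of width 20 the same data looks flat.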

How to draw a histogram
•  Group the raw data into class intervals. A minimum of 5 and a maximum of 10 classes is recommended.
•  Draw up a frequency table (also known as a grouped frequency distribution), indicating the class boundaries and class width.
•  Calculate the frequency density, i.e. the height of each bar: frequency density = frequency ÷ class width.
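
The frequency density calculation can be sketched in Python as follows. The frequency table is hypothetical (the frequencies are invented for illustration), and `frequency_density` is just an illustrative helper name.

```python
# Hypothetical frequency table: (lower boundary, upper boundary, frequency).
classes = [
    (159.5, 164.5, 10),
    (164.5, 169.5, 25),
    (169.5, 179.5, 30),
]

def frequency_density(lower, upper, frequency):
    """Height of the bar: frequency divided by class width."""
    return frequency / (upper - lower)

densities = [frequency_density(lo, hi, f) for lo, hi, f in classes]

# Bar area (height x width) gives back the frequency of each class.
areas = [d * (hi - lo) for d, (lo, hi, _) in zip(densities, classes)]

print(densities)  # [2.0, 5.0, 3.0]
print(areas)      # [10.0, 25.0, 30.0]
```

Note that the third class has the largest frequency but not the tallest bar, because it is twice as wide as the others.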

What are class boundaries?
•  These are the actual endpoints of the classes and depend on the rounding of the raw data. E.g. for height h measured to the nearest cm, with class intervals of 160 – 164 and 165 – 169, the class boundaries will be 159.5 ≤ h < 164.5 and 164.5 ≤ h < 169.5.
•  Class width = upper class boundary – lower class boundary. In this case, the class width is 5.
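
As a rough sketch, the step from class intervals to class boundaries can be written in Python. The function name `class_boundaries` and its `precision` parameter are assumptions made for this example; `precision` is the unit the raw data was rounded to (1 for "nearest cm").

```python
def class_boundaries(lower_limit, upper_limit, precision=1.0):
    """Return (lower boundary, upper boundary) for data rounded to `precision`."""
    half = precision / 2
    return lower_limit - half, upper_limit + half

# Class intervals from the height example, measured to the nearest cm.
intervals = [(160, 164), (165, 169)]

boundaries = [class_boundaries(lo, hi) for lo, hi in intervals]
widths = [upper - lower for lower, upper in boundaries]

print(boundaries)  # [(159.5, 164.5), (164.5, 169.5)]
print(widths)      # [5.0, 5.0]
```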

Special cases...
•  If the last class interval is open-ended, assume its class width is twice that of the previous class.
•  Discrete data can be represented on a histogram. Use class boundaries of –0.5 and 9.5 for a class interval of 0 – 9, even if this seems impossible!
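
Both special cases can be sketched in Python. The function names are illustrative only, and the open-ended class starting at 179.5 after a class of width 10 is an invented example.

```python
def close_open_class(lower_boundary, previous_width):
    """Assume an open-ended final class is twice as wide as the class before it."""
    return lower_boundary, lower_boundary + 2 * previous_width

def discrete_boundaries(lowest, highest):
    """Extend a discrete class interval by 0.5 at each end."""
    return lowest - 0.5, highest + 0.5

# Hypothetical open-ended class "179.5 and over", where the previous class had width 10.
last_class = close_open_class(179.5, 10)

# Discrete class interval 0 - 9, as in the bullet above.
discrete_class = discrete_boundaries(0, 9)

print(last_class)      # (179.5, 199.5)
print(discrete_class)  # (-0.5, 9.5)
```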

What do I do now?
•  Note the shape of the distribution – used in hypothesis testing (S2).
•  Determine the spread of the data – look for outliers.
•  Locate the centre of the distribution – by eye.
•  Look for other features – such as clustering or gaps.