1 Introduction
This note extends the discussion in Chapter 13 on testing the assumption that a metric variable is normally distributed. In that chapter we stressed the importance of a thorough visual inspection of the shape of the variable’s distribution using histograms and/or box plots. Here we will look at two other approaches mentioned in Chapter 13:
Interpreting skewness and kurtosis statistics
Statistical testing of the assumption of normality
2 Interpreting skewness and kurtosis statistics
As we explained in Chapter 13, in addition to visual inspection, you can calculate summary statistics that measure skewness and kurtosis. For both measures, a perfectly normal distribution should return a score of 0. Otherwise:
A positive skewness value indicates positive (right) skew; a negative value indicates negative (left) skew. The higher the absolute value, the greater the skew.
Similarly, a positive kurtosis value indicates positive kurtosis; a negative one indicates negative kurtosis. The higher the absolute value, the greater the kurtosis.
It is more difficult to determine how extreme either the skewness or the kurtosis values must be before they indicate a problem for the assumption of normality.
Specialist statistics packages such as SPSS report a statistic called the standard error for both the skewness and kurtosis scores. This allows a simple rule of thumb to be applied: if you divide either score by its standard error and the result lies outside ±1.96, it suggests that your data are not normal with respect to that statistic. An example of SPSS output for skewness and kurtosis tests from a sample of test scores is given in Figure 1. Skewness and kurtosis are both positive, indicating that the data are slightly right-skewed and peaked (leptokurtic) compared to a normal distribution. Applying the rule of thumb of dividing each value by its standard error (Std. Error) gives 0.76 for skewness and 0.68 for kurtosis, both well within the ±1.96 limits, suggesting that the departure from normality is not too extreme. This is confirmed by visual inspection of the histogram of the same data shown in Figure 2. This is a rather crude test as it is affected by the size of the sample (for larger samples, a threshold of ±2.58 can be used), so it should always be complemented by visual inspection.
Figure 1 – Skewness and kurtosis (SPSS output)
Figure 2 – Histogram with normal curve plotted (SPSS output)
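The rule of thumb above is easy to reproduce outside SPSS. The sketch below, in Python, computes the adjusted (bias-corrected) skewness and excess kurtosis statistics of the kind SPSS reports, together with their standard errors, and applies the divide-by-standard-error check. The sample of scores is hypothetical, invented for illustration.

```python
import math

def skew_kurtosis(x):
    """Sample skewness and excess kurtosis with their standard errors,
    using the adjusted (bias-corrected) formulas reported by SPSS."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # central moments
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    g1 = m3 / m2 ** 1.5              # uncorrected skewness
    g2 = m4 / m2 ** 2 - 3            # uncorrected excess kurtosis
    skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    kurt = ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3))
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew, se_skew, kurt, se_kurt

scores = [52, 55, 57, 58, 60, 61, 63, 64, 66, 70, 75, 82]  # hypothetical data
skew, se_s, kurt, se_k = skew_kurtosis(scores)
for name, stat, se in [("Skewness", skew, se_s), ("Kurtosis", kurt, se_k)]:
    z = stat / se
    verdict = "within" if abs(z) <= 1.96 else "outside"
    print(f"{name}: {stat:.3f} (SE {se:.3f}), z = {z:.2f} -> {verdict} the ±1.96 limits")
```

A symmetric sample (e.g. 1, 2, 3, 4, 5) returns a skewness of exactly 0, which is a quick sanity check on the implementation.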
3 Statistical tests of the assumption of normality
You can also use the Kolmogorov-Smirnov (K-S) and the Shapiro-Wilk (S-W) tests to test the assumption that your sample data are drawn from a normally-distributed population. Both require interval data and can be run in SPSS. Both test the null hypothesis that the data come from a normally-distributed population. The alternative hypothesis is therefore that the data come from a population that is not normally distributed. Consequently, if the results of either test are significant (e.g. p < 0.05), you should reject the null hypothesis and conclude that your data are not normally distributed; if the results are not significant, the assumption of normality can be retained. To run the tests in SPSS:
1. Click Analyze > Descriptive Statistics > Explore.
2. In the dialogue box, send the variables to be tested to the Dependent List box.
3. Click Plots and select Normality plots with tests. Click Continue.
4. Click OK to run the tests.
Output will be in the same form as Figure 3. (For more guidance on using SPSS see the introductory guide on the companion website.)
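The same two tests can also be run outside SPSS. The sketch below uses SciPy, a third-party Python library assumed to be installed, on a hypothetical sample. One caveat: SciPy's plain K-S test does not apply the Lilliefors significance correction that SPSS's Explore procedure uses when the normal distribution's mean and standard deviation are estimated from the sample, so its p-value is only approximate.

```python
import statistics
from scipy import stats  # third-party library, assumed installed

scores = [52, 55, 57, 58, 60, 61, 63, 64, 66, 70, 75, 82]  # hypothetical data

# Shapiro-Wilk: null hypothesis is that the data come from a normal population
w_stat, p_sw = stats.shapiro(scores)

# Kolmogorov-Smirnov against a normal with the sample's mean and SD.
# No Lilliefors correction is applied, so the p-value is approximate.
mu, sd = statistics.mean(scores), statistics.stdev(scores)
d_stat, p_ks = stats.kstest(scores, "norm", args=(mu, sd))

for name, p in [("Shapiro-Wilk", p_sw), ("Kolmogorov-Smirnov", p_ks)]:
    verdict = "reject" if p < 0.05 else "retain"
    print(f"{name}: p = {p:.3f} -> {verdict} the null hypothesis of normality")
```

As in SPSS, a significant result (p < 0.05) leads you to reject the null hypothesis of normality; a non-significant result leaves it standing.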