The distribution of data is described in statistics by characteristics such as skewness and kurtosis, which give detail on the shape of a probability distribution. In statistics and probability, the concept of normally distributed data is critical to understand and is closely linked to the central limit theorem. Essentially, the central limit theorem states that when random samples of sufficiently large size are drawn from a population whose mean and variance are defined and finite, the distribution of the sample mean tends toward a normal distribution, whatever the shape of the population itself (1). Under a normal distribution, data points cluster symmetrically around the mean, and most values fall within a few standard deviations of it. Furthermore, as long as the population being considered is itself normal, the sampling distribution of the mean will be normal even if the sample size is small (1).
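The central limit theorem can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with rate 1 (a strongly skewed distribution with mean 1 and standard deviation 1), and the function name `fraction_within_one_sd` is hypothetical. It measures how often a sample mean lands within one standard error of the population mean. For a normal distribution that fraction is about 0.68, so values approaching 0.68 as the sample size grows suggest the sample mean is becoming approximately normal.

```python
import random


def fraction_within_one_sd(sample_size, trials=5000, seed=0):
    """Fraction of sample means within one standard error of the population mean.

    Assumption: the population is exponential with rate 1, so its mean and
    standard deviation are both 1.
    """
    rng = random.Random(seed)
    pop_mean = 1.0
    pop_sd = 1.0
    # Standard error of the mean for samples of this size.
    std_error = pop_sd / sample_size ** 0.5

    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        sample_mean = sum(sample) / sample_size
        if abs(sample_mean - pop_mean) <= std_error:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Compare a single observation with means of progressively larger samples.
    for n in (1, 5, 50):
        print(n, round(fraction_within_one_sd(n), 3))
```

With a sample size of 1 the fraction reflects the skewed population itself. With larger samples it should move toward the value expected under normality.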
Because many statistical methods assume that data follow a normal distribution, being able to check whether a population or a set of data is in fact normally distributed is critical. This is where the characteristics that describe the shape of a distribution fit in. For example, measuring the skewness of a data set is a common way of checking normality: a normal distribution is symmetric, so its skewness is zero. Histograms are often constructed to see whether or not a data set is skewed. If the bars of a histogram trace a symmetric, bell-shaped curve, this suggests that the data are approximately normally distributed.
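The shape measures mentioned above can be computed directly from a data set. The following is a minimal sketch, not a formal normality test: the function names `skewness` and `excess_kurtosis` are hypothetical, and the formulas used are the simple moment-based (population) versions, without the small-sample corrections that some statistics packages apply. For normally distributed data both values should be close to zero.

```python
import random


def _central_moment(values, order):
    """Average of (x - mean) ** order over the data."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** order for x in values) / n


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right tail."""
    m2 = _central_moment(values, 2)
    m3 = _central_moment(values, 3)
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    m2 = _central_moment(values, 2)
    m4 = _central_moment(values, 4)
    return m4 / m2 ** 2 - 3.0


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated data: one roughly normal sample, one right-skewed sample.
    normal_data = [rng.gauss(0.0, 1.0) for _ in range(10000)]
    skewed_data = [rng.expovariate(1.0) for _ in range(10000)]

    for label, data in (("normal", normal_data), ("skewed", skewed_data)):
        print(label, round(skewness(data), 2), round(excess_kurtosis(data), 2))
```

Values near zero are consistent with normality but do not prove it, so a check like this is usually read alongside a histogram of the same data.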
Analyzing the distribution of data is integral to the study of statistics, since many of the field's methods rest on the assumption of normality. Understanding the characteristics that can be used to check for normality is therefore important for interpreting data properly and drawing accurate conclusions.
1. Probability and Statistics for Engineers, Eighth Edition – Prentice Hall 2011