# Central Limit Theorem

March 29, 2018


The central limit theorem states that the distribution of the sum (or average) of *N* independent random variables approaches a normal distribution as *N* increases. The French mathematician Laplace proved this for a number of general cases in 1810 [2]. Laplace also derived an expression for the standard deviation of the average of a set of random numbers, confirming what Gauss had assumed for his derivation of the normal distribution (see the lesson on Confidence Intervals).

### Examples of the Central Limit Theorem

The simplest and most remarkable example of the central limit theorem is the coin toss. If a “true” (fair) coin is flipped *N* times, the probability of exactly *q* heads occurring is given by Equation 11, which is called the binomial distribution.

$$P(q) = \frac{N!}{q!\,(N-q)!}\left(\frac{1}{2}\right)^{N}$$

Equation 11

Figure 3.10 plots a histogram of the binomial distribution in comparison to the normal distribution for 6 coin tosses. There is good agreement between the two distributions, even with only 6 tosses (although the tails of the normal distribution extend beyond the possible values of *q*). The French mathematician De Moivre noticed this agreement in 1733 and used $\sqrt{2/(\pi N)}\; e^{-(2/N)(q - N/2)^{2}}$ as an approximation for the cumbersome calculation of Equation 11 for large *N*. However, he did not generalize this result to other cases.
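The comparison for 6 coin tosses can be checked numerically. The following is a minimal sketch, assuming a fair coin (probability of heads 1/2); the function names `binomial_pmf` and `de_moivre_approx` are illustrative and do not come from the lesson.

```python
import math

def binomial_pmf(q, n):
    """Exact probability of q heads in n tosses of a fair coin."""
    return math.comb(n, q) * 0.5 ** n

def de_moivre_approx(q, n):
    """De Moivre's normal approximation: mean n/2, variance n/4."""
    return math.sqrt(2.0 / (math.pi * n)) * math.exp(-(2.0 / n) * (q - n / 2.0) ** 2)

N = 6  # number of tosses, matching the 6-toss comparison in the text
rows = [(q, binomial_pmf(q, N), de_moivre_approx(q, N)) for q in range(N + 1)]

print(" q   binomial   approx")
for q, exact, approx in rows:
    print(f"{q:2d}   {exact:.5f}    {approx:.5f}")
```

For *N* = 6 the two columns should differ by less than 0.02 at every value of *q*, with the largest absolute difference at the peak, *q* = 3.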

Another example is the averaging of a random variable, *x*, uniformly distributed between -0.5 and 0.5 (which might be the range of uncertainty in measurement). By averaging 2, 3, and 4 of these random variables, the gradual convergence to the normal distribution can be seen in Figure 3.11.
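The convergence can also be explored with a small simulation. This is a sketch under stated assumptions, not the method used to produce the figure: the sample count, the fixed seed, and the helper names are arbitrary choices. It relies on two known facts: a variable uniformly distributed between -0.5 and 0.5 has variance 1/12, so the average of *n* of them has standard deviation 1/sqrt(12 *n*), and a normal distribution holds about 68.3% of its probability within one standard deviation of the mean.

```python
import math
import random

def average_of_uniforms(n, rng):
    """Average of n independent values drawn uniformly from [-0.5, 0.5]."""
    return sum(rng.uniform(-0.5, 0.5) for _ in range(n)) / n

def sample_std(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

rng = random.Random(0)   # fixed seed so the run is repeatable (arbitrary choice)
SAMPLES = 50_000         # arbitrary sample count for this sketch
results = {}

for n in (1, 2, 3, 4):
    averages = [average_of_uniforms(n, rng) for _ in range(SAMPLES)]
    std = sample_std(averages)
    # Fraction of averages within one standard deviation of zero.
    within = sum(abs(a) <= std for a in averages) / SAMPLES
    results[n] = (std, within)
    print(f"n={n}: std={std:.4f} (theory {1 / math.sqrt(12 * n):.4f}), "
          f"within one std={within:.3f}")
```

As *n* grows, the fraction within one standard deviation should rise from roughly 0.58 for a single uniform variable toward the normal value of about 0.683.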

When the magnitude of the probability density function (PDF) is plotted on a linear scale, it is hard to see what is happening at the tails of the distribution. Plotting the magnitude on a logarithmic scale reveals the large percentage deviation between the average of four uniformly distributed random variables and the normal distribution in the tails. This matters in cases where the extreme values of a signal are critical to understanding the behavior of a product under test, such as fatigue analysis.
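The size of the tail deviation can be estimated without plotting. The sketch below is an illustration rather than the lesson's own calculation: it uses the closed-form Irwin-Hall density for a sum of uniform variables, rescaled to the average of *n* values on [-0.5, 0.5], and compares it with a normal density of the same variance, 1/(12 *n*). The function names and evaluation points are assumptions made for this example.

```python
import math

def pdf_average_of_uniforms(a, n):
    """PDF of the average of n uniforms on [-0.5, 0.5], via the Irwin-Hall formula."""
    s = n * a + n / 2.0          # map the average back to a sum of n U(0, 1) values
    if s <= 0.0 or s >= n:
        return 0.0               # the average cannot fall outside [-0.5, 0.5]
    total = sum((-1) ** k * math.comb(n, k) * (s - k) ** (n - 1)
                for k in range(int(math.floor(s)) + 1))
    return n * total / math.factorial(n - 1)

def pdf_normal(a, n):
    """Normal PDF with mean 0 and the same variance, 1 / (12 n)."""
    sigma = math.sqrt(1.0 / (12.0 * n))
    return math.exp(-0.5 * (a / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

N_AVG = 4
ratios = {}
for a in (0.0, 0.15, 0.30, 0.45):
    exact = pdf_average_of_uniforms(a, N_AVG)
    normal = pdf_normal(a, N_AVG)
    ratios[a] = exact / normal
    print(f"a={a:.2f}: log10(exact)={math.log10(exact):6.3f}, "
          f"log10(normal)={math.log10(normal):6.3f}, ratio={ratios[a]:.3f}")
```

Near the center the two densities agree to within a few percent, but at 0.45 the density of the four-value average is only about a quarter of the normal density, and beyond 0.5 it is exactly zero while the normal distribution is not.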