The Normal Curve and Galton's Board
by Paul Trow

The normal curve models the distribution of many common variables in the physical and social sciences. The graph below shows the standard normal curve with mean 0 and standard deviation 1. 



In the nineteenth century, Sir Francis Galton, one of the pioneers of statistical theory, invented a mechanical device that illustrates how the normal distribution arises from the contributions of many independent random events.

Galton's device consists of an array of pins mounted on a vertical board, as shown by the blue markers in the diagram below.  




The diagram has 6 horizontal rows of pins, or levels. Galton's real board, which still exists in University College, London, has many more levels. When the board is operated, a sequence of balls drops onto the pin at the top of the array. When a ball hits a pin, it bounces to the left or right with equal probability, and then falls down one level, where it hits one of the two closest pins. After the ball reaches the last level, it falls into one of the bins below the bottom row.


Click here to see an animation of Galton's board. If you have Mathcad 14 installed on your computer, you can download a Mathcad 14 file, in which you can run simulations of Galton's board with more levels of pins.




Note that this is a recorded animation that doesn't change when you replay it. The last section of this article shows how to use Mathcad to create different random simulations of Galton's board.

As more and more balls stack up in the bins, the stacks start to take on a predictable shape: that of a normal curve. It is a remarkable fact that, as you increase the number of levels of pins and the number of balls that drop, the distribution of balls (suitably rescaled) approximates a normal distribution.


Here is a picture of Galton's actual board, built in 1873. The "balls" dropped in the board were actually lead shot.



There is a working model based on Galton's board on display at the Boston Museum of Science.


The Binomial Distribution


You can use Mathcad's rbinom function to simulate Galton's board. rbinom(m, n, p) returns a vector of length m whose entries are random integers between 0 and n, inclusive, having a binomial distribution. Think of each entry as the result of tossing a coin n times and recording how many heads appear, assuming that each toss has probability p of coming up heads.


In the special case p = 1/2 and n = 1 (that is, a fair coin tossed just once), each entry of rbinom(m, 1, 0.5) is 0 or 1, each with probability 1/2. If m is the number of levels in Galton's board, this vector defines a random path for a ball: the ball bounces to the left for each 0 and to the right for each 1.


Number of levels in the board



The following program generates a random path and computes the position of the ball at each level. Since the ball bounces a distance 0.5 to the left or the right, the change in the x-coordinate is cointosses - 0.5.
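The path computation described above can be sketched in Python. This is only an illustration, not the article's Mathcad program: the function name random_path and the fixed seed are assumptions, and Python's random.Random stands in for rbinom(m, 1, 0.5).

```python
import random

def random_path(levels, rng):
    """Return (cointosses, xcoord) for one ball dropped through `levels` rows of pins."""
    # One fair coin toss per level: 0 means bounce left, 1 means bounce right.
    cointosses = [rng.randint(0, 1) for _ in range(levels)]
    xcoord = [0.0]  # the ball starts at the top pin, x = 0
    for toss in cointosses:
        # toss - 0.5 is -0.5 for a left bounce and +0.5 for a right bounce.
        xcoord.append(xcoord[-1] + toss - 0.5)
    return cointosses, xcoord

rng = random.Random(1)  # fixed seed (an assumption) so the run is repeatable
cointosses, xcoord = random_path(6, rng)
print(cointosses)
print(xcoord)
```

Creating the generator with a different seed, or with no seed at all, gives a different random path, much as recalculating does in Mathcad.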





In Mathcad, you can generate different random paths by clicking the program for (xcoord cointosses) above the graph and then either selecting Tools > Calculate > Calculate Now or pressing [F9].


Suppose you want to simulate the results of dropping many balls in Galton's board. You could repeatedly generate random paths, using rbinom as above, and count the number of balls that land in each bin. But if you don't care what paths the balls take, there's a faster way to simulate the numbers of balls in the bins.


Notice that if you number the bins as shown above, a ball lands in bin k if the random vector that generates its path contains exactly k ones. So the bin number of a random ball has a binomial distribution with parameters n = levels and p = 0.5. This is exactly the distribution of the entries of the random vector

rbinom(m, levels, 0.5)

So this vector simulates the results of dropping m balls and recording the numbers of the bins in which they fall.


As an example, use this vector to simulate the results of dropping 1000 balls in a board with 10 levels.
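As a rough Python counterpart to this example, the sketch below draws one bin number per ball by counting the ones in a run of fair coin tosses. The helper name bin_numbers and the seed are assumptions; summing tosses is used here only as a stand-in for a binomial random vector with n = levels and p = 0.5.

```python
import random

def bin_numbers(m, levels, rng):
    """Return m bin numbers, each the count of right bounces in `levels` fair tosses."""
    return [sum(rng.randint(0, 1) for _ in range(levels)) for _ in range(m)]

rng = random.Random(2)  # fixed seed (an assumption) for repeatability
balls = bin_numbers(1000, 10, rng)
print(balls[:10])  # bin numbers of the first ten balls
```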


Number of balls dropped



The following Mathcad program counts the number of balls in each bin. Note that the number of bins is one more than the number of levels.
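The counting step might look like this in Python. It is a sketch under assumptions, not a transcription of the Mathcad program: count_bins is a hypothetical helper, and the bin numbers are generated by summing fair coin tosses.

```python
import random
from collections import Counter

def count_bins(bin_numbers, levels):
    """Return a list of length levels + 1; entry k is the number of balls in bin k."""
    counts = Counter(bin_numbers)
    # Bins that received no balls must still appear, with a count of zero.
    return [counts[k] for k in range(levels + 1)]

levels = 10
number_balls = 1000
rng = random.Random(3)  # fixed seed (an assumption) for repeatability
balls = [sum(rng.randint(0, 1) for _ in range(levels)) for _ in range(number_balls)]
counts = count_bins(balls, levels)
print(counts)
```

Dividing each entry of counts by number_balls gives the fraction of balls in each bin.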


Zero function


Here is a bar graph of the results.




If you divide these results by the total number of balls dropped, the resulting fractions approximate a binomial distribution, by the Law of Large Numbers. 
  



Compare this graph with the probability density function of the binomial distribution, which is computed by the Mathcad function dbinom(k, n, p), here with n = levels and p = 0.5.
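For readers without Mathcad, the binomial probabilities can be computed directly from the standard formula C(n, k) p^k (1 - p)^(n - k). The function name binomial_pmf below is an assumption chosen for this sketch.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

levels = 10
pmf = [binomial_pmf(k, levels, 0.5) for k in range(levels + 1)]
for k, prob in enumerate(pmf):
    print(k, round(prob, 4))
```

These values are what the fractions of balls in the bins should approach as the number of balls grows.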





The Normal Approximation to the Binomial Distribution

The examples above show why the balls in Galton's board have the shape of a binomial distribution. But what does this have to do with the normal curve? The answer is that, for large n, the distribution of a binomial random variable X, with parameters n and p = 0.5, is approximated by a normal distribution  having the same mean and standard deviation as X. This is a special case of a famous result called the Central Limit Theorem.

A binomial random variable X with parameters n and p = 0.5 has mean n/2 and standard deviation √n/2.
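These values follow from the general formulas for a binomial variable, mean np and variance np(1 - p): with p = 0.5 the variance is n/4, so the standard deviation is √n/2. The short Python check below, with n = 10 chosen only as an example, computes both quantities directly from the binomial probabilities.

```python
from math import comb, sqrt

n = 10  # example value; any positive integer works
# Probability of each bin number k when p = 0.5.
pmf = [comb(n, k) / 2**n for k in range(n + 1)]

mean = sum(k * q for k, q in enumerate(pmf))
variance = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
sd = sqrt(variance)

print(mean, n / 2)        # computed mean next to n/2
print(sd, sqrt(n) / 2)    # computed standard deviation next to sqrt(n)/2
```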

The Mathcad command

dnorm(x, μ, σ), with μ = levels/2 and σ = √levels/2,

computes the probability density function of a normal distribution with this mean and standard deviation, for n = levels.


The following graph shows this normal distribution, together with the corresponding binomial distribution.
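The same comparison can be made numerically. The Python sketch below tabulates the binomial probabilities next to the normal density with mean n/2 and standard deviation √n/2, and reports the largest gap between the two; the helper names normal_pdf, binomial_pmf and max_gap are assumptions made for this example.

```python
from math import comb, exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def binomial_pmf(k, n):
    """Probability of k ones in n fair coin tosses."""
    return comb(n, k) / 2 ** n

def max_gap(n):
    """Largest absolute difference between the two curves at the integers 0..n."""
    mu, sigma = n / 2, sqrt(n) / 2
    return max(abs(binomial_pmf(k, n) - normal_pdf(k, mu, sigma)) for k in range(n + 1))

levels = 10
mu, sigma = levels / 2, sqrt(levels) / 2
for k in range(levels + 1):
    print(k, round(binomial_pmf(k, levels), 4), round(normal_pdf(k, mu, sigma), 4))

print(max_gap(10), max_gap(100))  # the gap shrinks as the number of levels grows
```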



This explains why the balls in Galton's board stack up in the shape of a normal curve. The bin numbers of the balls have a binomial distribution, and this in turn is approximated by a normal distribution.

Creating Random Simulations of Galton's Board


This section explains how to create random simulations of Galton's board using Mathcad.

First, define the number of levels of pins in the board by the variable "levels." Define the number of balls that are dropped by the variable "number_balls." You can change these values to create different simulations.
 



Set up board


The following program creates the paths of the balls and records the number of balls in each bin. After recording an animation of the simulation, you can generate a new random simulation by clicking the program below and pressing [F9] to recalculate.



The following code creates functions that compute the positions of the balls and the number of balls in the bins at each time step t. FRAME is a system variable used to create the animation.
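One way such frame-by-frame functions could be organized is sketched below in Python. The timing is an assumption, not taken from the worksheet: balls are dropped one at a time, and each ball occupies levels + 1 consecutive frames (the top pin plus one frame per level). The names simulate and frame_state are hypothetical, and the loop variable t plays the role of FRAME.

```python
import random

def simulate(levels, number_balls, rng):
    """Return (xpath, bins): the x-coordinates of each ball per level, and each ball's final bin."""
    xpath, bins = [], []
    for _ in range(number_balls):
        tosses = [rng.randint(0, 1) for _ in range(levels)]
        x = [0.0]
        for toss in tosses:
            x.append(x[-1] + toss - 0.5)  # -0.5 for left, +0.5 for right
        xpath.append(x)
        bins.append(sum(tosses))  # bin number = number of right bounces
    return xpath, bins

def frame_state(t, xpath, bins, levels):
    """State at frame t, valid for 0 <= t < (levels + 1) * number of balls."""
    ball, level = divmod(t, levels + 1)  # which ball is falling, and how far down it is
    counts = [0] * (levels + 1)
    for k in bins[:ball]:  # only balls that have already landed are counted
        counts[k] += 1
    return ball, level, xpath[ball][level], counts

levels, number_balls = 6, 5
xpath, bins = simulate(levels, number_balls, random.Random(4))
total_frames = (levels + 1) * number_balls
for t in range(total_frames):
    print(t, frame_state(t, xpath, bins, levels))
```

Under this assumed timing, a complete run uses (levels + 1) times number_balls frames, numbered from 0.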
 






Recording an Animation

To record an animation of a simulation:

1.  On the Tools menu, select Animation > Record.

2.  In the Record Animation dialog, set the To: field to the number of frames you want to record. This number should be less than (levels + 1) times the number of balls. The At Frames/Sec: field controls the speed of the animation.

3.  In the worksheet, select a region containing the graphs below by clicking above and to the left of the pins, and then dragging the mouse below and to the right of the bins.

4.  Click Animate.



After you have recorded the animation, you can play it by clicking the play button in the Play Animation window. To save it as an AVI file, click Save As.

Try creating different animations by changing the values of "levels" and "number_balls" at the beginning of this section. Also, you can generate new random simulations by clicking the program for (xpath bins) and pressing [F9].



Download the Mathcad 14 file
for this article.



