THE VERY NEGLECTED SIR FRANCIS GALTON (x-y)

Galton was a cousin of Charles Darwin, which makes one wonder why Darwin didn't learn some statistics. Galton made several contributions to this field, in particular the powerful tool of regression analysis, and there is a fascinating story behind it.

In Galton's time, it had become known that food from America had apparently increased the average height of the British people.

I became aware of this when I took my small sons, Tim and Chris, on board a replica of the ship which brought Captain John Smith and other pioneers to found the ill-fated Jamestown Colony in Virginia. I noticed how small the sailors' berths were and remarked on it. Our guide, a college student on vacation, noted that the average height of the British sailor in the 17th century was about 4' 10". A tourist standing beside me asked if that was also the case with the Indians they met over here. The guide said "NO", and noted that the Chieftain of the Indians, Powhatan, was 6' 2". The difference was the food of the New World -- corn, wheat, potatoes, yams -- compared to that of the Old World.

However, a peculiar statistical effect had been noted. Although sons would often be taller than their parents, they differed less from the average (arithmetic mean) of their generation than their parents did from theirs. Was there some factor in the food whose effect diminished over the generations?

Galton settled the matter in a very convincing way. He constructed a normal distribution universe of data and drew samples from this known universe. First, he drew a sample of a given size, to compare its average (arithmetic mean) with that of the normal distribution universe from which it was drawn. Then he drew another sample and compounded it with the previous one to compose a larger sample, to compare its mean with the universe mean. Then another sample, compounded with the previous two to compose a still larger sample, to compare its mean with the universe mean. Etc.

Result? As the sampling size increased, the mean of the sample differed less and less from the mean of its universe: REGRESSION BACK TO THE MEAN.

CONCLUSION? Since the effect could be explained by a purely statistical procedure, it was foolish to waste time and money trying to find a nonstatistical explanation.
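
The compounding procedure described above is easy to imitate on a computer. The sketch below is only an illustration under stated assumptions, not a reconstruction of Galton's own working: the universe mean of 68 and standard deviation of 2.5 (think of heights in inches), the batch size of 10, and the function name compounded_sample_means are all made up for the example.

```python
import random
import statistics


def compounded_sample_means(universe_mean=68.0, universe_sd=2.5,
                            batch_size=10, n_batches=200, seed=1):
    """Draw batch after batch from a normal universe, pooling as we go.

    Returns a list of (pooled_sample_size, pooled_sample_mean) pairs,
    one pair per batch drawn. All default values are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    pooled = []
    history = []
    for _ in range(n_batches):
        # Draw a fresh sample and compound it with everything drawn so far.
        pooled.extend(rng.gauss(universe_mean, universe_sd)
                      for _ in range(batch_size))
        history.append((len(pooled), statistics.fmean(pooled)))
    return history


if __name__ == "__main__":
    # Print every 40th compounded sample to watch the gap change.
    for size, mean in compounded_sample_means()[::40]:
        print(f"n = {size:5d}   sample mean = {mean:7.3f}   "
              f"gap from universe mean = {abs(mean - 68.0):.3f}")
```

The gap does not shrink at every single step, since chance can push an individual compounded sample either way, but it tends to shrink as the pooled sample grows.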

This policy has become known to quality engineers as "engineering-by-exception". Samples are drawn from a production line and the mean number of defectives per sample is calculated. Compounded samples give rise to "universal data", and deviations from the mean are plotted as UPPER and LOWER CONTROL LINES on a chart. If a given day's report of mean defectives per sample falls WITHIN THE CONTROL LINES, the engineer does not bother to investigate "the cause", since the "control region" should cover more than 99% of the cases. Only reports above or below a CONTROL LINE are investigated -- only the EXCEPTION -- hence engineering-by-exception.
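
A minimal sketch of such a chart follows. It assumes the control lines sit three standard deviations either side of the mean, a common convention which, for roughly normal data, covers a little over 99% of cases; the essay itself does not say how wide the lines should be. The defect counts and the names control_lines and exceptions are invented for illustration.

```python
import statistics

# Hypothetical baseline: defectives found in each of 12 past daily samples.
BASELINE = [4, 6, 5, 7, 5, 4, 6, 5, 3, 5, 6, 4]

# Hypothetical new daily reports to be judged against the control lines.
REPORTS = [5, 6, 4, 11, 5, 1, 7]


def control_lines(baseline, k=3.0):
    """Return (lower, center, upper) control lines from baseline data.

    k = 3 standard deviations is an assumed convention, not a rule
    taken from the text.
    """
    center = statistics.fmean(baseline)
    spread = statistics.stdev(baseline)
    lower = max(0.0, center - k * spread)  # a count cannot go below zero
    return lower, center, center + k * spread


def exceptions(reports, lower, upper):
    """Return (day, value) for every report outside the control lines."""
    return [(day, value)
            for day, value in enumerate(reports, start=1)
            if value < lower or value > upper]


if __name__ == "__main__":
    lower, center, upper = control_lines(BASELINE)
    print(f"lower = {lower:.2f}, center = {center:.2f}, upper = {upper:.2f}")
    for day, value in exceptions(REPORTS, lower, upper):
        print(f"day {day}: {value} defectives -- investigate")
```

With these made-up numbers only days 4 and 6 fall outside the lines; the other five reports are treated as routine and left alone.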

This has inspired management-by-exception in the business world. The manager sets up a procedure which deals routinely with the routine, freeing the manager's time for the nonroutine.

Galton is also responsible for the pinball machine! He wished to acquaint the public with how binomial chance builds the normal distribution curve ("bell curve"). Here's how you can build a replica of his board.

  1. Take a board about 1 inch by 18 inches by 30 inches and "box" it on all sides by strips extending 1 inch above board surface;
  2. 8 inches from bottom of board, create a row of 1" nails, just fixed into the surface, protruding up, spaced across every inch;
  3. create similar rows of nails up toward top, stopping at least 2 inches from top of board;
  4. at bottom glue or otherwise fix slotting strips (say, 3/8" x 1" x 6") on the 3/8" sides to collect marbles that roll down the board;
  5. finally, use a board 1/2" x 1" x 18", driven in, to prop up the top of the nail-board.

You roll, one after another, small marbles or small ball bearings from the top down the board. When one strikes a nail, it goes off either to the left or to the right, and ends up in one of the slots at the bottom. (The partition slots prevent balls from sliding off to the sides, leveling the bottom output.) It can be shown that there are many more paths toward the middle of the board than to the sides, so that more marbles collect in the middle slots than in those at the sides.
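
If you would rather not drive nails, the board can be imitated in a few lines of Python. This is a rough sketch under simplifying assumptions: every nail sends the marble left or right with equal chance, each bounce is independent of the last, and the board has 12 rows of nails (the instructions above do not fix the number of rows). The function name roll_marbles is made up for the example.

```python
import random
from collections import Counter


def roll_marbles(n_marbles=2000, n_rows=12, seed=7):
    """Simulate marbles rolling down a board with n_rows rows of nails.

    Each marble bounces left or right with equal chance at every row
    (an idealization of the real board). The slot it lands in is simply
    the number of rightward bounces, from 0 to n_rows.
    Returns a list of marble counts, one entry per slot.
    """
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    slots = Counter()
    for _ in range(n_marbles):
        rights = sum(rng.random() < 0.5 for _ in range(n_rows))
        slots[rights] += 1
    return [slots[k] for k in range(n_rows + 1)]


if __name__ == "__main__":
    # Crude sideways picture of the slots: one star per 10 marbles.
    for slot, count in enumerate(roll_marbles()):
        print(f"slot {slot:2d} {count:5d} " + "*" * (count // 10))
```

Printed sideways like this, the stars pile up in the middle slots and thin out toward the edges, just as the marbles do.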

Furthermore, it can be shown that the distribution of paths matches the BINOMIAL DISTRIBUTION (familiar to algebra students from Pascal's Triangle), and the BINOMIAL DISTRIBUTION is the DIGITAL or DISCRETE FORM which PASSES OVER INTO THE ANALOGIC (continuous) FORM known as "The Normal Distribution (bell curve)".
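
The path counting can be checked directly. The sketch below builds a row of Pascal's Triangle by repeated addition, turns the path counts into binomial probabilities, and sets them beside the normal curve with the same mean (n/2) and standard deviation (half the square root of n). The choice of 12 rows and the function names pascal_row and binomial_vs_normal are assumptions for illustration only.

```python
import math


def pascal_row(n):
    """Return row n of Pascal's Triangle, e.g. n = 4 gives [1, 4, 6, 4, 1]."""
    row = [1]
    for _ in range(n):
        # Each new entry is the sum of the two entries above it.
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row


def binomial_vs_normal(n):
    """Compare binomial path probabilities with the matching normal curve.

    Returns a list of (slot, paths, binomial_probability, normal_density).
    """
    total_paths = 2 ** n                  # every marble makes n left/right choices
    mu = n / 2                            # mean of a fair binomial
    sigma = math.sqrt(n) / 2              # its standard deviation
    table = []
    for k, paths in enumerate(pascal_row(n)):
        binom = paths / total_paths
        normal = (math.exp(-((k - mu) ** 2) / (2 * sigma ** 2))
                  / (sigma * math.sqrt(2 * math.pi)))
        table.append((k, paths, binom, normal))
    return table


if __name__ == "__main__":
    print("slot  paths  binomial   normal")
    for k, paths, binom, normal in binomial_vs_normal(12):
        print(f"{k:4d}  {paths:5d}  {binom:8.4f}  {normal:7.4f}")
```

Even with only 12 rows of nails the two columns agree to within about half a percentage point in every slot, and the agreement improves as rows are added.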

Thus, you see how this "curve" arises.

Galton demonstrated his board in public, under the peculiar name, "The Quincunx". Someone got the idea of attaching a device at the bottom right for shooting marbles to the top of the board, to tumble down through the nails.

Still later, holes were put into the board, so that some marbles never reached the bottom slots. Eventually, The Quincunx became The Pinball Machine! A statistical demonstration device of Sir Francis Galton became a game-toy!