Wednesday, 20 January 2016

Central Limit Theorem



In the previous post, I introduced the concept of the normal distribution as the shape of the frequency distribution of the outcome of a number of rolls of a number of dice, as both numbers tend towards infinity.  This sounds like a very abstract concept, of little applicability to everyday life.  But in fact, if you look at the distribution of heights in a population, it’s quite accurately described by a normal distribution. So are results from intelligence tests.  Even the batting averages of test cricketers who’ve scored over 2000 runs (a relatively small group of men) seem approximately normally distributed, although the fit is less perfect.  How come?  There was a theoretical basis for the distribution of dice rolls.  But surely a batsman’s ability cannot be modelled in such a regular way. A batsman’s ability may depend on many factors: strength, hand-eye coordination, coaching, attitude, etc.; and there’s no a priori reason to expect the distribution of such factors to vary normally.

The key concept here is something called the Central Limit Theorem, which states (informally) that if one adds together independent random variables, the distribution of their sum will be closer to a normal distribution than the individual distributions were – whatever shapes those individual distributions have, subject to some mild conditions.  Thus, one doesn’t need the individual factors that contribute to a batsman’s average to be good fits to a normal distribution: provided there are many such factors contributing to the output, that output itself will be fairly normal.  There’s a theoretical proof of the theorem, but the ubiquity of approximately normal distributions in nature seems to be good empirical evidence in addition. Indeed, the distribution is named precisely because it’s normal to find it.
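The theorem is easy to see by simulation. The sketch below (my own illustration, not from the post) sums fair six-sided dice – each individually uniform, nothing like a bell curve – and checks that the sums behave as the normal approximation predicts as the number of dice grows.

```python
import random
import statistics

# Illustrative sketch: each "factor" is one uniform die roll. The Central
# Limit Theorem says the distribution of the SUM of many such factors
# approaches a normal distribution, even though one roll is flat, not bell-shaped.

random.seed(42)

def sum_of_rolls(n_dice, trials=20000):
    """Return `trials` samples, each the sum of n_dice fair six-sided dice."""
    return [sum(random.randint(1, 6) for _ in range(n_dice))
            for _ in range(trials)]

for n in (1, 2, 10):
    samples = sum_of_rolls(n)
    # Theory for n dice: mean = 3.5n, variance = 35n/12.
    print(f"{n:>2} dice: mean ≈ {statistics.mean(samples):.2f}, "
          f"stdev ≈ {statistics.stdev(samples):.2f}")
```

With 10 dice the sample mean and spread already sit very close to the theoretical values, and a histogram of the sums (not drawn here) would show the familiar bell shape emerging.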

As the outcome of a cricket series also depends on many independently varying factors, we can assume that those outcomes are normally distributed too; or, to be precise, that the distribution of the outcomes of a large number of series, each comprising a large number of games between two teams of fundamentally constant ability, will be normally distributed.  If the two teams are of equal ability, the mean of the distribution – which, as the distribution is symmetrical, is also its central point and the peak of the bell curve – will correspond to a result of 0.5 points per game (remember, we assign 1 point to the team that wins a match, and half a point to each team if there’s a draw – so the mean outcome is equality, which is what we meant when we stated the premise of “equal ability”). And if the difference between the two teams is such that the expected outcome is in fact 2/3, we might still expect a normal distribution of results – only now the peak of the graph comes at 2/3, not at 1/2.
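A quick sanity check of that claim by simulation (the specific win/draw probabilities below are my own assumption, chosen so that the expected outcome is 2/3: team A wins half the games and draws a third of them):

```python
import random
import statistics

# Hedged sketch, not the post's method: score each game 1 / 0.5 / 0 points
# for a win / draw / loss, simulate many long series between the same two
# teams, and look at A's points per game across series.

random.seed(1)

def series_points_per_game(p_win, p_draw, games=100):
    """Team A's points per game over one simulated series."""
    points = 0.0
    for _ in range(games):
        r = random.random()
        if r < p_win:
            points += 1.0            # A wins the match
        elif r < p_win + p_draw:
            points += 0.5            # draw: half a point each
    return points / games

# Expected outcome: p_win + 0.5 * p_draw = 0.5 + 0.5 * (1/3) = 2/3
outcomes = [series_points_per_game(p_win=0.5, p_draw=1/3) for _ in range(5000)]
print(f"mean outcome ≈ {statistics.mean(outcomes):.3f}")  # close to 0.667
```

The mean of the per-series outcomes lands on 2/3 as expected, and a histogram of `outcomes` would show the roughly symmetrical bell shape described above, peaking at 2/3.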

But now we have a theoretical model for the outcomes of games.  And fitting this model allows us to answer the question we had: if a ratings gap of X corresponds to an expected outcome of 2/3 in favour of the stronger side, what does a ratings gap of 2X imply in terms of outcome? What emerges from this is the value of the parameter B in the formula we introduced, where the expected outcome for the stronger team is equal to 1 / (B^(-D) + 1), where D is the ratings difference between the two teams.  As we saw, the value of B determines the answer to the question; and the value that emerges from this analysis is our little friend e, that peculiar irrational number which, among other things, is part of the formula describing the normal distribution. A pretty decent mathematical explanation of this can be found here; note also that in chess, empirical evidence has suggested that, in spite of the theory above, a normal distribution may in fact not be the best way of modelling actual results.  But B = e in the game of Go; and we’ll take that value for use in our system.
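The arithmetic behind the answer can be checked directly. With B = e, a gap of X giving an expected outcome of 2/3 means e^(-X) = 1/2, i.e. X = ln 2; a gap of 2X then gives 1 / (1/4 + 1) = 4/5. A minimal sketch (the function name is mine):

```python
import math

def expected_outcome(d, b=math.e):
    """Expected points per game for a team rated d points above its opponent,
    using the formula quoted above: 1 / (B^(-D) + 1)."""
    return 1.0 / (b ** (-d) + 1.0)

# If a gap of X yields 2/3, then e^(-X) = 1/2, so X = ln 2.
x = math.log(2)
print(expected_outcome(x))      # 2/3
print(expected_outcome(2 * x))  # 4/5 -- the answer to the question above
print(expected_outcome(0))      # 1/2: equal teams share the points
```

So doubling the ratings gap moves the expected outcome from 2/3 to 4/5, not to 4/3 or any linear extrapolation – the curve flattens as it approaches 1.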

So we can now predict the outcome of any game between two teams with a defined rating; and compare the actual outcome to that predicted.  But how much should an aberrant result change our future expectations?  That requires us to set a new parameter, which we’ll call K.  And K will be the subject of the next post.
