Saturday, 23 January 2016

Initialisation



In previous posts, we’ve established how an Elo rating system works: it makes predictions based on existing ratings, then adjusts those ratings according to the difference between the prediction and the actual outcome. We’ve also explored how to set some of the parameters needed in the Elo formula. But the basic concept is one of adjusting already existing ratings. How do we set them in the beginning and kick off the whole system?

The first thing to note is that the absolute ratings don’t matter, only the difference between the ratings of different teams. So if team A has a rating of 100 and team B has a rating of zero, that’s exactly the same as if team A has a rating of one million and one hundred and team B a rating of exactly a million. Secondly, the less accurate our initial guess of the relative ratings of two sides, the less accurate the predictions based on them will be, and thus the faster the ratings will correct. So unless one is very interested in the ability of teams in the early 1880s (England and Australia played the first ever test match in 1877), it doesn’t really matter how we initialise our system. But we do have to start somewhere, and the cleanest assumption is to favour neither one side nor the other, and to set the initial ratings of both teams to the same value. I chose zero for that value. And because all changes to ratings are reciprocal (as one team improves its rating, its rival’s rating falls by the same amount), the sum total of all ratings, and the mean rating, are thus fixed at zero thereafter.
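Both points can be checked with a minimal sketch. It assumes the standard logistic Elo expectation with a scale of 400 and a K-factor of 20; those two values, and the function names, are illustrative assumptions rather than the parameters used in this series.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Logistic Elo expectation for team A (assumed form).

    Only the gap rating_a - rating_b enters the formula.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def update(rating_a, rating_b, score_a, k=20.0):
    """Return both new ratings after one match.

    score_a is 1 for a win, 0.5 for a draw, 0 for a loss.
    Whatever A gains, B loses, so the sum of ratings is unchanged.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Same 100-point gap, very different absolute levels.
p_small = expected_score(100, 0)
p_large = expected_score(1_000_100, 1_000_000)
print(p_small, p_large)

# Two teams both starting at zero: the total stays at zero after a result.
new_a, new_b = update(0.0, 0.0, 1.0)
print(new_a, new_b, new_a + new_b)
```

The two printed probabilities are the same, and the two updated ratings are equal and opposite, which is all that the argument above relies on.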

But of course, although the first test involved these two teams only, subsequently eight other teams have entered test cricket.  Most obviously, we would introduce these other teams also with a rating of zero; but there’s a problem.  In general, teams have been assigned test status after their ability has improved to make test matches between themselves and other test-rated countries worth playing.  But at the moment of admission, they’re typically rather weaker than most existing teams.  So giving a new side an initial rating of zero, the mean rating of all the existing teams, seems over-generous.

An alternative might be to add a new team with a rating equal to the lowest existing rating (or maybe a fixed number of points lower than that). This is probably the correct thing to do in terms of accurately predicting its first results (most teams entering test cricket have lost their early matches). But this would also have the effect that each time a new team enters test cricket, the mean rating of all teams would fall. The ratings might still be appropriate at each point in time. But an ordinary rating at one point in time might be a strong rating some time later. We’d lose comparability between eras.

My compromise is to enter each team at a rating equal to the lowest current rating at the time of entry, but then to adjust the ratings of all pre-existing teams upwards in order to restore the average rating to zero. Thus, in 1888, England had a rating of 77, and Australia of -77. South Africa were about to play their first test. They get added to the system with a score of -77, and we add 38.5 to the ratings of each of England and Australia to keep the overall average rating at zero. Because of this adjustment, the new team actually ends up with a lower rating than the new rating of the previously worst-rated side. Note that if England and Australia now play, the prediction of their next game is unchanged by these revisions (the gap between the two teams remains 154 points, just as it was previously, even though both teams now have improved ratings in absolute terms, because they have each improved by the same amount).
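As a rough sketch of that entry rule, using the 1888 figures: the function name `add_team` and the dictionary layout are my own illustrative choices, not code from the actual rating system.

```python
def add_team(ratings, new_team):
    """Enter new_team at the current lowest rating, then shift every
    pre-existing team up by the same amount so the mean returns to zero.
    """
    entry = min(ratings.values())
    # Total after adding the newcomer; the existing teams share the correction.
    total = sum(ratings.values()) + entry
    shift = -total / len(ratings)
    adjusted = {team: rating + shift for team, rating in ratings.items()}
    adjusted[new_team] = entry
    return adjusted


before = {"England": 77.0, "Australia": -77.0}
after = add_team(before, "South Africa")

gap_before = before["England"] - before["Australia"]
gap_after = after["England"] - after["Australia"]

print(after)
print(gap_before, gap_after)
```

England and Australia each move up by 38.5 (to 115.5 and -38.5), South Africa enters at -77, the ratings again sum to zero, and the England-Australia gap is 154 both before and after.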

But there is a kind of contradiction here. We add the new team at a low rating on the defensible assumption that they’re probably not yet very good (from which it follows that the average quality of all test teams will have gone down as a result of the new addition). But we then adjust the ratings of all the pre-existing teams upwards because we don’t want a deflationary trend. It appears that I’m assuming that all existing teams get better just because a bad team joins them!

There are two answers to this. One is that no-one could argue that, in the long term, cricket has been weakened by the entrance of the West Indies (or indeed, most other teams to have followed England and Australia into the test arena). Test cricket itself tends to strengthen teams (which is a large part of why teams get admitted when they attain a certain level of promise but while still relatively weak). Maybe when a new team enters test cricket, the average ability of all test teams is reduced; but the effect is only temporary.

This is true, but the better answer is that cricket is a test of relative strength. From a set of cricket results, you cannot measure objectively how good a team actually is except in comparison to other teams from its own era. If team A from era X has a higher rating than team B from era Y, this only shows that team A was more dominant in its era than team B was when it was playing; it tells us nothing about who would win were both sides to be miraculously resuscitated. In 1890 the England side of W.G. Grace and George Lohmann had a rating of 99, whereas the current England side rates just 90. But few would believe that even the good doctor would flourish if suddenly left to face players with modern levels of fitness (although moderns might similarly struggle if asked to play on the kind of pitch that their predecessors had to play on). Comparing W.G. to, say, Ben Stokes is not really meaningful; nor can we directly compare their teams, except by considering how dominant (or not) they were compared to their contemporaries. By keeping the average of all ratings at zero, an individual rating is a measure of a team’s superiority or inferiority compared with their average opponent. And by this measure, adding a new, weaker opponent to the mix does indeed increase the strength of the rest. Thus the decision can be defended; but inter-era comparisons remain difficult, for reasons we will see.

That’s it for teams entering test cricket; in the next post, we’ll look at teams leaving, and, more problematically, re-entering the sport.
