Wednesday, 6 January 2016

The Eternal Greatness of Australian Cricket Culture



The aim of cricket is to win matches. Australia have won more matches than anyone else in the history of test cricket (370). But one could say the aim is to win as many matches as possible without losing. Which team has won more than they have lost by the largest margin? Again, Australia, whose wins are offset by only 208 losses, for a positive balance of +162. But teams that play more matches are going to win more, and a good team that has played a lot will have a larger win-loss difference than an equally good team that has played less, even if both win (and lose) at the same frequency. So perhaps one should award one point to any team that wins a game, half a point to each side if the match is tied or drawn, and then measure the average points won per game played. Which team has the best average on this metric? Also Australia, who average 0.63 points per game, as opposed to the 0.54 points per game of England, their nearest rivals. (To get these numbers, I used the indispensable Statsguru feature of espncricinfo.com, which makes these sorts of analyses possible.)
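For anyone who wants to check the arithmetic, here is a minimal sketch of the two measures discussed above. The function names are my own, and the second example uses a made-up record purely for illustration, since a team's points per game also depends on its number of draws and ties.

```python
def win_loss_balance(wins: int, losses: int) -> int:
    """Wins minus losses: the raw 'positive balance'."""
    return wins - losses


def points_per_game(wins: int, losses: int, draws_and_ties: int) -> float:
    """Average points per match: 1 for a win, 0.5 for a draw or tie, 0 for a loss."""
    played = wins + losses + draws_and_ties
    if played == 0:
        raise ValueError("no matches played")
    return (wins + 0.5 * draws_and_ties) / played


# Win and loss totals quoted above for Australia.
print(win_loss_balance(370, 208))  # 162

# A made-up record (NOT real data): 10 wins, 6 losses, 4 draws.
print(points_per_game(wins=10, losses=6, draws_and_ties=4))  # 0.6
```

Multiplying the made-up record by ten (100 wins, 60 losses, 40 draws) raises the win-loss balance from 4 to 40 but leaves the points per game at 0.6, which is the reason for preferring the average.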

So fairly indisputably, Australia have been the most successful team in test cricket. But even then they have not won every match. And of course, the outcomes of individual matches are unpredictable (if they weren’t, there’d be no point in playing them). Players get lucky or find form and fitness, pitches suit one side or the other, there’s a critical umpiring mistake, and so on. We can say that the outcome of each individual match is influenced by random factors. But random factors alone are not sufficient to explain the observed patterns. When I ran five simulations, assigning results to matches at random (but sticking to the overall pattern actually observed, whereby approximately one third of games are drawn and the remainder are won by one side or the other), no team averaged more than 0.58 points per game. Moreover, random fluctuations tend to cancel each other out in larger samples. For England and Australia (who’ve played the most matches), the five simulations never once generated an average higher than 0.52 points per game, or lower than 0.48.
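The kind of simulation described above can be sketched roughly as follows. This is not the code I actually ran, just a simplified illustration: it follows a single team in isolation rather than a full fixture list, assumes a draw probability of exactly one third with the rest split evenly between wins and losses, and uses arbitrary match counts (50, 200, 800) and an arbitrary seed.

```python
import random


def simulate_average(n_matches: int, rng: random.Random,
                     draw_prob: float = 1 / 3) -> float:
    """Average points per game over n_matches purely random results."""
    points = 0.0
    for _ in range(n_matches):
        u = rng.random()
        if u < draw_prob:
            points += 0.5                      # draw
        elif u < draw_prob + (1 - draw_prob) / 2:
            points += 1.0                      # win
        # otherwise a loss, worth nothing
    return points / n_matches


def spread(n_matches: int, n_runs: int, seed: int = 0) -> tuple[float, float]:
    """Lowest and highest average seen across n_runs independent simulations."""
    rng = random.Random(seed)
    averages = [simulate_average(n_matches, rng) for _ in range(n_runs)]
    return min(averages), max(averages)


# Five simulations each for a team with few, some, and many matches.
for n in (50, 200, 800):
    low, high = spread(n, n_runs=5)
    print(f"{n:4d} matches: lowest {low:.3f}, highest {high:.3f}")
```

The point to look for is that the gap between the lowest and highest average typically narrows as the number of matches grows, which is why luck alone struggles to produce a long-run average far from 0.5.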

One explanation could be that Australia mostly play weak opponents, raising their win-loss ratio above average. In fact, I suspect that the opposite is true, but for the moment, let’s just assume that this isn’t the explanation for Australia’s very high score. Instead, let’s build a model, where the match outcome (O) is determined by A – B + R, where A is a measure of the underlying strength of team A, B is the strength of their opponents (obviously, it’s the difference between the two that determines a side’s chance of winning), and R represents randomly fluctuating factors. What could be the factors determining A and B? Population size, the strength of the cricketing culture of a country (i.e. how dominant it is compared with other sports), national wealth and so on might all play a part.

And this model is good as far as it goes. Except that, besides fundamentally static qualities and random fluctuations, there are many factors which do vary, but only slowly over time. For example, great players enter, and retire from, the national side; coaches are appointed and are sacked. In fact, even the supposedly static qualities I’ve previously highlighted will themselves change, just at a slower rate (the strength of cricket in the national culture is not guaranteed in the same way as a physical constant). And when we say “Australia is a great side”, we don’t actually mean that Australia have been, on average, more successful over the course of history, even though this is true. What we actually tend to mean is that particular Australian teams have been very strong, and have won a lot of matches. But there have also been low ebbs in Australian cricket, in which for lengthy periods the team has averaged less than 0.5 points a match. We don’t have to change the structure of our model to accommodate this; but the fact that A and B are no longer assumed to be static creates some problems.

Specifically, we can’t assume that Australia are now a great team merely because they won a lot of matches in the 19th century, or even the 20th. What might inform us better is asking: how many matches have Australia won recently? Obviously, because a team’s ability level is always changing, such a backwards-looking assessment will not tell us exactly how good a team is right at this moment in time. But because we assume that A and B change slowly and continuously (and indeed, we have factored out match-to-match fluctuations into a separate variable to make this so), we can assume that recent performances are a reasonable guide to the present day. But how many to include? This will be the subject of the next post.
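To make the model O = A – B + R a little more concrete, here is a toy sketch. Almost everything in it is an assumption of mine rather than something established above: R is drawn from a normal distribution, a match counts as drawn when O falls inside an arbitrary band around zero, A drifts as a slow random walk while B is held at zero, and the 40-match window is a placeholder, not an answer to the question of how many matches to include.

```python
import random


def play_match(a: float, b: float, rng: random.Random,
               noise_sd: float = 1.0, draw_band: float = 0.4) -> float:
    """Points for team A from one match, where O = A - B + R."""
    r = rng.gauss(0.0, noise_sd)       # R: match-to-match luck (assumed normal)
    o = a - b + r                      # O: the outcome score
    if abs(o) <= draw_band:
        return 0.5                     # too close to separate: a draw (assumed rule)
    return 1.0 if o > 0 else 0.0


def simulate_series(n_matches: int, seed: int = 0, drift_sd: float = 0.05):
    """Strength A follows a slow random walk; the opponents' B is held at 0."""
    rng = random.Random(seed)
    a = 0.0
    strengths, points = [], []
    for _ in range(n_matches):
        a += rng.gauss(0.0, drift_sd)  # slow change: players retire, coaches change
        strengths.append(a)
        points.append(play_match(a, 0.0, rng))
    return strengths, points


def trailing_average(points, window: int) -> float:
    """Average points per game over the most recent `window` matches."""
    recent = points[-window:]
    return sum(recent) / len(recent)


strengths, points = simulate_series(500)
print("career average: ", round(sum(points) / len(points), 3))
print("last 40 matches:", round(trailing_average(points, 40), 3))
print("current A:      ", round(strengths[-1], 3))
```

Running this with different seeds gives a feel for the problem: the career average summarises where A has been over the whole series, while the trailing average is noisier but responds to where A has drifted lately.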

One final thought. The only true facts are match results. To demonstrate that a team’s victory was 10% due to the eternal greatness of its cricketing culture, 60% due to the semi-persistent talent of its current eleven, and 30% due to random luck, is quite possible in an appropriately constructed model; but it’s not actually a fact about the real world, just a quirk of the way we have chosen to conceive it. In reality, there are 4000 or so cricketing elevens that have started a test match, and each one has obtained a particular result. As humans, we want to fit this into a narrative, to assert “X was a great side in the 1980s”, even though dozens of different players may have played for team X in that decade and the many great victories achieved might have been interspersed with the odd truly shocking defeat. When we say “X is the best team playing now”, that implies we would back X at even odds in a game against any other side; but if it were easy to analyse the numbers and find out which way to bet for certain, someone would have made their millions doing so by now. And if a system doesn’t provide an answer that matches our subjective prejudice, who’s to say if the system is measuring the wrong things, or if we are simply refusing to believe that our instincts could be wrong? Nonetheless, next time, we’ll begin to look at some ways a system could be constructed, and some basic problems in statistics that we’ll encounter as we do so.
