Rugby: The Wisdom of Crowds and the Air New Zealand Cup

29 October 2009

In an earlier posting, I looked at how the wisdom of crowds might apply in the case of the Sky Sport Virtual Rugby game to make picks in the Super 14 rugby competition.  In that case, and looking only at outcomes, i.e. win/loss, the crowd had a success rate of 69% over the round robin stage of the competition.  I suggested that this was not a great result.  I might have to revisit that.

The Air New Zealand Cup round robin stage has now been completed, so I have looked at the figures for those games to see if there is any difference.  In the Jimungo Virtual Rugby competition, participants pick the outcome (who will win) and a score level (a 12 point margin or less; more than 12 points).  A draw can also be picked.  The published data include the percentage of participants who have picked each possible result, and it is this information, plus the actual outcome, that I have analysed.  What I was interested to find out was the level of predictive success for both outcomes and margins, whether the crowd learned anything over the course of the competition, and whether any teams were more predictable than others.

The Air New Zealand Cup round robin stage is played by 14 teams over 13 weeks.  The teams range from unions which provide a base for Super 14 teams – Auckland, Waikato, Wellington, Canterbury, Otago – to provincial unions, some of which have been struggling financially and, until this year, with declining attendances.  An added fillip this season has been the planned restructuring of the competition for next year, intended to reduce it to 10 teams, which has given the teams under threat of exclusion an incentive to succeed on the field and to boost crowds.  The perennial issue of whether and to what extent the All Blacks will be available for their provincial teams has also been aired again, while the continuing impact of the Ranfurly Shield cannot be ignored (except, it appears, by Wellington).

One hypothesis is that the Air New Zealand Cup, being a domestic competition, should mean that Virtual Rugby participants have more knowledge of the teams and players than may have been the case with the international Super 14.  Countering this may be the greater degree of parochial attachment that such a competition provides, the relatively unknown status of many players, and the extent to which the removal of key players to international duties affects team performance.  Also, what should we expect in terms of crowd wisdom?  Random picks should produce a 50% outcome success rate when taken over 91 games and 61,239 participants (I have just tossed a coin 91 times and heads came up only 45% of the time, but I’m not going to do it another 61,238 times), so crowd wisdom should do much better than that, but by how much?

So, what were the results?  Well, the average success rate in predicting the outcome was 61.45%, which was less than the Super 14 result.  For both outcome and margin, the success rate was 32.29%.  Assuming that the crowd that participates in Virtual Rugby has more information than a random generator, is this sufficiently better than 50%?  I don’t think so.
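As a quick sanity check on that question, the distance between 61.45% over 91 games and a fair coin can be put on a statistical footing with an exact binomial tail probability.  This is a hypothetical sketch, not part of the original analysis; whether clearing pure chance counts as "wise" for an informed crowd is a separate judgment.

```python
# Hypothetical check: how likely is a crowd-level success rate of 61.45%
# over 91 games if each pick were really a 50/50 coin flip?
# One-sided exact binomial tail probability, standard library only.
from math import comb

games = 91
successes = round(0.6145 * games)  # ~56 correct outcome picks

# P(X >= successes) under the fair-coin null hypothesis (p = 0.5)
p_value = sum(comb(games, k) for k in range(successes, games + 1)) / 2**games
print(f"{successes} of {games} correct; one-sided p-value = {p_value:.4f}")
```

The tail probability comes out well under the usual 5% threshold, so the crowd is clearly doing better than coin-tossing; the open question is whether it is doing enough better.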

The chart shows the extent to which the outcome (win/loss) success rate moved over the season.  The success rate got up to 90% in the penultimate round, and did improve as the season developed after an initial slump.  The earlier rounds were characterised by “upsets”: defining an upset as an outcome that fewer than 20% of participants picked, there were 2 upsets in each of the first four rounds (out of 7 games per round), as well as in rounds 7, 8 and 13.
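The upset count used above can be sketched as a small helper.  The data layout here (pairs of round number and the percentage of participants who picked the eventual winner) is my own invention for illustration, not the format published by Jimungo.

```python
# Hypothetical sketch of the upset count, assuming a list of
# (round, winning_pick_percentage) pairs -- this data layout is invented.
def count_upsets(games, threshold=20.0):
    """An upset is a game whose actual winner was picked by fewer
    than `threshold` percent of participants."""
    upsets = {}
    for rnd, winner_pick_pct in games:
        if winner_pick_pct < threshold:
            upsets[rnd] = upsets.get(rnd, 0) + 1
    return upsets

# Illustrative made-up figures: two upsets in round 1, none in round 2.
sample = [(1, 12.0), (1, 18.5), (1, 55.0), (2, 40.0)]
print(count_upsets(sample))  # → {1: 2}
```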

Successful Prediction of Outcomes

The higher success rates in the later rounds could reflect better-informed participants, who had more information on how the teams played.  Other possible explanations are that more All Blacks were playing for their provincial teams, strengthening the larger and more favoured provinces, or that the vagaries of the draw had come to pit top and bottom teams against each other, producing more predictable outcomes (I am not sure that this stands up to analysis).

The outcome and margin chart tells a similar story.  In round 3 only 10% of participants got the outcome and margin right and in only one round did more than half of participants get it right.


So what about team support?  Well, the participants in Virtual Rugby had mixed results.  They were most successful with Counties-Manukau and Canterbury, i.e. the bottom and top teams from the round robin, while Wellington and Hawke’s Bay, both of whom ended up in the top 4, were also well picked.  Participants were not so successful with the other semi-finalist, Southland, or with Bay of Plenty and Taranaki, who sprang surprises in both wins and losses.  Margin predictions were most successful for Counties-Manukau again (presumably the extent of their losses) and Wellington (win margins), and least successful for Otago.

Successful prediction of margin and outcome by union

Looking at the wisdom of the crowd in picking the final order of teams in the competition (and leaving aside the complicating factor of bonus points), the crowd didn’t do too badly.  A difference of 3 ranking places was the maximum error, and three each of the top 4 and bottom 4 teams were correctly ranked.


Five ranking levels matched.  Southland did twice as well as predicted, and Tasman and Auckland also did much better, but Waikato, Otago, Bay of Plenty and North Harbour did much worse (by at least 2 places).


In a graphical presentation, this chart compares the actual ranking with the ranking implied by the level of crowd prediction, measured by the average percentage of win predictions.
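The per-team ranking error behind that comparison can be computed with a few lines.  This is a hypothetical sketch: the team lists below are illustrative orderings I have made up, not the full 14-team table from the post.

```python
# Hypothetical sketch comparing actual finishing order with the order
# implied by average win-prediction percentage; team lists are illustrative.
def rank_errors(actual_order, predicted_order):
    """Per-team difference between predicted and actual ranking place.
    Positive means the team finished higher than the crowd predicted."""
    predicted_rank = {team: i + 1 for i, team in enumerate(predicted_order)}
    return {team: predicted_rank[team] - (i + 1)
            for i, team in enumerate(actual_order)}

actual = ["Canterbury", "Wellington", "Southland", "Hawke's Bay"]
predicted = ["Canterbury", "Wellington", "Hawke's Bay", "Southland"]
print(rank_errors(actual, predicted))
# → {'Canterbury': 0, 'Wellington': 0, 'Southland': 1, "Hawke's Bay": -1}
```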

So how did the crowd do?  Not as well as it should have, in terms of win/loss outcomes, although it did get fairly close to the final order, with the exception of Southland.  So ok on the big picture. On outcomes, there may have been a degree of home team emotion supporting some predictions, while in some cases teams just played so well or so badly that pre-game predictions became irrelevant and no amount of information or expertise would have helped.  This is probably a good thing, and is certainly what has made this year’s Air New Zealand Cup a great competition.  It’s not broken, so why fix it?

Rugby: The Wisdom of Crowds and Super 14 Picks

17 June 2009

In his book, The Wisdom of Crowds, James Surowiecki argues that the aggregation of information in groups results in better decisions than could be made by a single member of the group.  I have to admit that I haven’t read the book, yet, but I thought I would test the wisdom of the rugby followers who participated in the Sky Sport Virtual Rugby game that ran for the 2009 season.

The hypothesis would be that the aggregation of picks would tend to be pretty close to actual results.  However, we need to look at the criteria that separate wise crowds from irrational crowds, as set out in the Wikipedia entry on the book and subject:

Diversity of Opinion: each person participating will tend to have private information, which will (primarily) be their own eccentric interpretation of known facts

Independence: Not so sure about this one, since I am sure that some people (e.g. me) were influenced by the views of others as reflected in the level of support for particular results, especially where they don’t have any strong eccentric interpretation of their own

Decentralization: People do specialize and draw on local knowledge, but the other side of this coin is that they may well support an outcome favouring their local side irrespective of the facts and experience

Aggregation: Sky Sport Virtual Rugby provides the mechanism for turning private judgments into a collective decision

On this basis, the elements for a wise crowd appear to be there in Virtual Rugby.

Bad judgments can result when the crowd is:

Too homogenous: if there is not sufficient diversity within a crowd – in the case of Virtual Rugby it is likely that there is too great a focus on New Zealand teams, since the majority of the participants are (I assume) New Zealanders

Too centralized: I don’t think this is a problem since there were about 122,000 participants

Too divided: there is ample scope to share information through media reporting and commenting, and through other forms of information exchange, so people can choose what information they need

Too imitative: Choices are visible, in the aggregate, which could lead people to reflect the majority view

Too emotional: Well of course, clearly there will be biases, hearts will dominate over heads, Wellingtonians will support the Hurricanes, despite the evidence

Anyway, that’s enough about the theory; what about the analysis?

Virtual Rugby is played by making a prediction for each game in each round of the Super 14, with choices being one side or the other to win by either 12 points or less, or more than 12 points, or a draw.

For the purposes of the initial analysis, I looked only at the outcome of the game and regarded a crowd prediction as successful if the actual outcome reflected the views of the highest proportion of players.  This includes predictions of both wins and losses.
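The success criterion described above (the crowd is "right" when the actual outcome matches the most-picked result) can be sketched as follows.  The per-game record format, pick percentages keyed by result, is an assumption of mine, not the format Sky Sport publishes.

```python
# Minimal sketch of the crowd success-rate calculation described above;
# the per-game record format (pick percentages keyed by result) is assumed.
def crowd_success_rate(games):
    """Fraction of games where the most-picked result matched the actual one."""
    hits = sum(1 for picks, actual in games
               if max(picks, key=picks.get) == actual)
    return hits / len(games)

# Illustrative data: picks are percentages for home win, away win, draw.
sample = [
    ({"home": 62.0, "away": 30.0, "draw": 8.0}, "home"),  # crowd right
    ({"home": 45.0, "away": 47.0, "draw": 8.0}, "home"),  # crowd wrong
]
print(crowd_success_rate(sample))  # → 0.5
```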

On this basis, the crowd had a 69% success rate over the 14 rounds of the 2009 Super 14 season, i.e. it got slightly better than 2 out of 3 right.  Is this good?  I would have thought that it’s not so good.

A more detailed analysis by team and country is interesting:

Bar chart of success rate of crowd picks in Virtual Rugby, by team and country


The most predictable teams tended to be those that ended up in the bottom third of the competition, i.e. they were predicted to lose most of the time.  Not unrelated, given that they had three teams in the bottom third, the crowd did best with South African sides (75%).  It did worst with New Zealand sides, which could well reflect the emotional attachment to local teams.  The most unpredictable teams were the Crusaders and Waratahs, which can possibly be explained by their changing fortunes through the competition.

What about picking winners?  The analysis shows that the crowd got it right 80% or more of the time for the Hurricanes, Stormers and Blues, but less than half the time for the Reds, while crowd opinion was definitely against the Cheetahs.  It could also mean that Hurricanes supporters always pick their team to win.


When picking losses it was a bit of a mixed bag.  The crowd got it right for the Sharks, Bulls and Lions, but was nowhere near it for the Hurricanes and Waratahs.  I think that this supports the view that supporters of these teams let their emotion cloud their judgment.


Did the crowd learn anything as the season went on?  Not sure about that, although there is some evidence to support the proposition in that for the last three rounds at least the outcomes were well-predicted.


So what does this all mean, apart from a suggestion I have too much time on my hands?  What I think it means is that the crowd that does Virtual Rugby is reasonably well-informed but is not too wise. It also means that if you want to win at Virtual Rugby, don’t follow the crowd.

Further analysis comparing the crowd verdicts to the TAB betting odds would be very interesting.