With top-class snooker returning on Monday, Timeform's James Cooper uses a simulation method to predict the outcome of the 16 groups. Here he explains how it works...
"Mark Allen has been priced up very defensively by the layers in Group 11 and looks a vulnerable favourite as the model gives him less than a 50% chance compared to Betfair's 65%."
Top-class snooker will join racing among the first sports to return in the UK since the March lockdown, and fans are treated to the big guns from the off. Defending World Champion and world number one Judd Trump takes to the baize live on ITV4 in a tweaked Championship League competition.
The event features 64 players. A round-robin phase will be followed by a second iteration populated by the group winners, and the final four then play a final league among themselves to determine the outright winner. All of it will be played under strict social distancing protocols at the Marshall Arena in Milton Keynes, with the players' safety at the forefront of the organisers' minds.
Simulated group predictions
Timeform have long used a data-driven approach to objectively analyse horse racing, and a similar approach was employed when they launched Infogol. Being a massive snooker fan and punter, I share this approach for pricing matches and, in this case, groups.
Intuition and feel will always have a place when it comes to betting, but for a punter to give themselves the best possible chance of returning long-term, sustainable profits, it's vital to be able to put an accurate numerical figure next to the likelihood of any potential bet.
Compared to horse racing and team sports, snooker has comparatively few variables given its head-to-head nature, defined scoring/match boundaries and indoor playing conditions, which makes it ripe for modelling. The methodology for deriving frame and therefore match probabilities can be found later in this piece for those who are interested to learn more.
Having applied this to all 96 group matches, each group was simulated 10,000 times to obtain the likelihood of each player advancing. This produced the following results (groups in schedule order):
It's well worth mentioning that there are a couple of caveats to these probabilities.
In the event of a tie on points after all group matches are played, frame difference, the head-to-head result between the top players and highest break are the determining factors, in that order. In the simulations, however, whenever a group ends in a tie the winner is selected at random, which means the chance of the favourite coming out on top in each group is likely underestimated, albeit by a small degree.
The other caveat is that the player ratings do not consider the enforced absence. There is a school of thought suggesting favourites could be vulnerable, but equally the top players may have had access to the best facilities during lockdown. To some, this could potentially be an angle to exploit. With a lack of appropriate background data, however, that sort of conjecture is overlooked for modelling purposes.
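For anyone wanting to try the simulation idea at home, the sketch below shows roughly how one group of four could be simulated 10,000 times with a random tie-break, as described above. It is an illustration rather than the model used for this piece: the player names and ratings are invented, the Elo-style `frame_win_prob` mapping from ratings to a single-frame probability is an assumption, frames are treated as independent, and one point for a draw is assumed alongside the three for a win.

```python
import random
from itertools import combinations


def frame_win_prob(rating_a, rating_b, scale=400.0):
    """Hypothetical Elo-style mapping from two ratings to P(A wins a frame)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def play_match(p, rng):
    """Play frames until one player has three, or the score reaches 2-2."""
    a = b = 0
    while a < 3 and b < 3 and not (a == 2 and b == 2):
        if rng.random() < p:
            a += 1
        else:
            b += 1
    return a, b


def simulate_group(ratings, n_sims=10_000, seed=0):
    """Estimate each player's chance of topping a round-robin group."""
    rng = random.Random(seed)
    players = list(ratings)
    group_wins = dict.fromkeys(players, 0)
    for _ in range(n_sims):
        points = dict.fromkeys(players, 0)
        for x, y in combinations(players, 2):  # six matches in a group of four
            fx, fy = play_match(frame_win_prob(ratings[x], ratings[y]), rng)
            if fx > fy:
                points[x] += 3
            elif fy > fx:
                points[y] += 3
            else:  # 2-2 draw: one point each (assumed)
                points[x] += 1
                points[y] += 1
        top = max(points.values())
        leaders = [p for p in players if points[p] == top]
        # Ties on points are broken at random, as in the caveat above.
        group_wins[rng.choice(leaders)] += 1
    return {p: w / n_sims for p, w in group_wins.items()}


# Invented ratings purely for illustration.
ratings = {"Player A": 1700, "Player B": 1600, "Player C": 1550, "Player D": 1500}
probs = simulate_group(ratings)
for name, prob in probs.items():
    print(f"{name}: {prob:.2%}")
```

Replacing the random tie-break with frame difference would need the frame scores carried through each simulation, which `play_match` already returns.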
A trio of potential wagers
The players in bold in some of the above groups are worth a second look given the odds available with Betfair Sportsbook at the time of writing, with Jordan Brown to take Group 1 perhaps the most interesting betting proposition.
The Northern Irishman already has a best-of-9 win over short-price group favourite Stuart Bingham to his name this season (for all he did lose to Bingham in a Home Nations event) and, by no means for the first time, the market and I differ when it comes to assessing the merit of Brown. The 7/1 (12.5% chance) looks generous.
Mark Allen has been priced up very defensively by the layers in Group 11 and looks a vulnerable favourite as the model gives him less than a 50% chance compared to Betfair's 65%. Martin O'Donnell is the obvious alternative at the current odds of 7/2 (22%). He's a clear second pick at 22.83% to top the group upon simulation.
A potential third wager is Kurt Maflin in Group 16. There's no real desire to take on Neil Robertson aside from the fact he's extremely short. Maflin has probably slightly underachieved in his career relative to his talent but the disparity in the betting between himself and Ken Doherty looks out compared to the model (the market has a gap of 8% not 11%) and he represents a shade of value.
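The percentages quoted next to the prices above come from converting fractional odds into an implied probability. A minimal sketch of that arithmetic follows; the function names are mine, and the only model figure used is the 22.83% for O'Donnell quoted above. Bear in mind that a bookmaker's prices include a margin, so implied probabilities across a whole group will sum to more than 100%.

```python
def implied_probability(fractional_odds: str) -> float:
    """Convert fractional odds such as '7/2' into an implied probability."""
    numerator, denominator = (int(part) for part in fractional_odds.split("/"))
    return denominator / (numerator + denominator)


def edge(model_probability: float, fractional_odds: str) -> float:
    """Model probability minus the probability implied by the price."""
    return model_probability - implied_probability(fractional_odds)


print(f"7/1 implies {implied_probability('7/1'):.1%}")       # 12.5%
print(f"7/2 implies {implied_probability('7/2'):.1%}")       # 22.2%
print(f"O'Donnell edge at 7/2: {edge(0.2283, '7/2'):+.2%}")  # +0.61%
```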
Group favourites worth taking on?
Curiously, even allowing for a marginal underestimation of players at the head of the betting in each group, there isn't a single favourite the model views as being value compared to the current prices available on Betfair Sportsbook.
Without access to individual match odds at present, it's hard to draw any concrete conclusions at this point, but it's very possible most layers are too short about every jolly to prevail in individual group matches, with compilers perhaps underestimating the likelihood of the draw in each match.
This may be an angle to explore over the coming days and with seven of the world's top 10 on show over the two weeks, it should be essential viewing for fans of the sport - new and old.
How modelling works
For those interested to learn more, the best way to approach modelling on a sport like snooker is to break each match down into its simplest form.
The objective for all first-round competitors is quite simply to reach a designated frame score before the opponent; in a best-of-seven match, that means reaching four frames first. To price that, one must model the likelihood of each player winning one frame, which is achieved using player ratings. Compiling accurate player ratings is the hardest stage of this process and there are various ways to do it.
There are strict automated methodologies that some employ, though a hybrid approach is preferred to add nuance that some numbers may not be able to capture - very much like handicapping in horse racing.
Once you have a workable set of player ratings and single-frame probabilities, match odds, handicap, correct scores and over/under frames can all be obtained very quickly using a Beta Distribution. Ratings are the key, with it being very much a case of rubbish in, rubbish out.
As mentioned earlier, next week's Championship League tournament poses a slightly different question to the one most models are used to, given that it is a best-of-four format, so a 2-2 draw comes into the equation. The premise remains the same though: for a player to win and take the three points on offer in the round-robin group of four, they must win three frames before their opponent wins two. Anything less will result in a draw or a loss.
Traditionally, summing the correct score odds can reveal the match prices, but the only equation that matters here is reaching three frames, and an example is listed below.
The model makes Bingham around a 4/7 chance to win a single frame, with Brown the reciprocal 7/4 shot. Match prices can then be derived, with Bingham now 5/6 to cover a theoretical -1.5 handicap and take the match, with Brown chalked up at 6/1 and the tie an 85/40 chance.
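The Bingham v Brown example can be reproduced with a few lines of arithmetic. The sketch below counts the frame sequences directly rather than using the Beta Distribution mentioned earlier, and it assumes every frame is independent with the same single-frame probability; the function names are illustrative. A 3-0 win has probability p^3, a 3-1 win means the opponent takes exactly one of the first three frames (3 x p^3 x q), and a 2-2 draw is any of the six orderings of two frames apiece (6 x p^2 x q^2).

```python
from math import comb


def odds_to_prob(numerator, denominator):
    """Fractional odds numerator/denominator as an implied probability."""
    return denominator / (numerator + denominator)


def best_of_four_probs(p):
    """Win/draw/loss probabilities when a win needs three frames and 2-2 is a draw.

    p is the single-frame win probability; frames are assumed independent.
    """
    q = 1.0 - p
    win = p**3 + comb(3, 1) * p**3 * q    # 3-0, or 3-1 with one lost frame in the first three
    loss = q**3 + comb(3, 1) * q**3 * p   # mirror image for the opponent
    draw = comb(4, 2) * p**2 * q**2       # any ordering of two frames each
    return win, draw, loss


p_bingham = odds_to_prob(4, 7)  # a 4/7 chance per frame, i.e. 7/11
win, draw, loss = best_of_four_probs(p_bingham)
print(f"Bingham win: {win:.1%}")   # 53.9%
print(f"Draw:        {draw:.1%}")  # 32.1%
print(f"Brown win:   {loss:.1%}")  # 14.0%
```

Those figures sit close to the quoted prices once rounded to conventional odds: 5/6 implies 54.5%, 85/40 implies 32.0% and 6/1 implies 14.3%.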
*For those interested in learning more, you can follow me on Twitter at @databayes