As if by fate, a particularly poor "trends analysis" piece dropped through my letter box on the day I wrote last week's blog about the dangers of poor trends analysis.
In it, the author not only contradicted himself a number of times but also broke many of the common-sense rules that should underpin all forms of analysis, not just those dealing specifically with so-called "trends".
I was particularly drawn to the statement "13 of the 20 renewals (of the Champion Bumper at Cheltenham) have been won by five-year-olds too."
This is a fact, not a myth, but it is the kind of fact that is either meaningless or downright misleading without proper context.
Let us consider another fact concerning the Champion Bumper at Cheltenham: it is a race restricted to horses aged four, five and six years old. And another: five-year-olds have accounted for well over half of the total runners in the race this century.
In order to determine whether such a "stat" is meaningful, it is useful to establish how many five-year-old winners could be expected in the normal run of things.
The answer - considering just races from this century, not the last - is that roughly 6.4 five-year-old winners could be expected by chance, and there were actually 7. This, I hope we can agree, is nothing to get all that excited about.
The figures for the other two age-groups are: 1.7 expected, 1 actual, for four-year-olds; and 2.9 expected, 3 actual, for six-year-olds. Random expectation is arrived at by summing, over every runner in a given category, the probability of that horse winning its race by chance (1/N, where "N" is the size of the field it ran in).
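For anyone who wants to try the expectation calculation on their own data, here is a minimal sketch in Python. The function name, the category labels and the two tiny fields are all made up for illustration; they are not real Champion Bumper results. The only thing taken from the method described above is the rule itself: every runner contributes 1/N to its category's expected winners.

```python
from collections import defaultdict


def expected_winners(races):
    """Return expected winners per category under pure chance.

    `races` is a list of races; each race is a list holding one
    category label per runner, so len(race) is the field size N.
    Every runner adds 1/N to the total for its category.
    """
    expected = defaultdict(float)
    for race in races:
        n = len(race)
        for category in race:
            expected[category] += 1.0 / n
    return dict(expected)


# Hypothetical fields, for illustration only (NOT real race data).
races = [
    ["4yo", "5yo", "5yo", "6yo"],         # a four-runner race
    ["5yo", "5yo", "5yo", "6yo", "6yo"],  # a five-runner race
]

exp = expected_winners(races)
for category in sorted(exp):
    print(category, round(exp[category], 2))
```

A useful sanity check is that the expectations across all categories add up to the number of races considered, just as the 6.4, 1.7 and 2.9 above add up to the 11 renewals run this century.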
As previously mentioned, however, considering winners alone is a crude measure, unless you have a large sample, and a large sample is not possible if race-specific trends are to remain relevant. If a trend exists, it should affect losers as well as winners, and it should affect the degree to which winners win and losers lose.
An alternative approach is to consider the average percentage of rivals beaten. This crucially distinguishes between a loser that has finished second and one that has finished twenty-second, and between a second in a five-runner field and a second in a twenty-five-runner field. It also draws (in this instance) on information from 258 horses and not just the 11 winners.
By this measure, there is little to separate the three age groups: four-year-olds accounted for 47.1% of their rivals; five-year-olds 50.3%; and six-year-olds 50.9%.
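The % of rivals beaten measure is easy enough to sketch in Python. I am assuming here the usual form of the calculation, (N - position) / (N - 1), so that a winner scores 100% and the last horse home scores 0%. The function names are my own, the finishing positions are invented for illustration, and how to treat non-finishers is a choice this sketch does not attempt to make.

```python
def pct_rivals_beaten(position, field_size):
    """Percentage of rivals finishing behind a horse.

    Assumed formula: 100 * (N - position) / (N - 1), where N is the
    field size and position 1 is the winner.
    """
    if field_size < 2 or not 1 <= position <= field_size:
        raise ValueError("need 1 <= position <= field_size and field_size >= 2")
    return 100.0 * (field_size - position) / (field_size - 1)


def average_pct_rivals_beaten(results):
    """Average the measure over (position, field_size) pairs."""
    values = [pct_rivals_beaten(pos, n) for pos, n in results]
    return sum(values) / len(values)


# Second of five beats 3 of 4 rivals; second of 25 beats 23 of 24.
second_of_five = pct_rivals_beaten(2, 5)
second_of_twenty_five = pct_rivals_beaten(2, 25)

# Hypothetical group of runs, for illustration only (NOT real race data).
group_average = average_pct_rivals_beaten([(1, 5), (5, 5), (2, 5)])
```

A group of horses with no advantage or disadvantage should average close to 50% on this measure, which is why the figures for the three age groups above look so unremarkable.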
Observing that "13 of the 20 renewals have been won by five-year-olds", without any further context, is pointless, or worse, as is observing that the same race has been won "15 times by raiders from across the Irish Sea". But at least what is seemingly being implied by the latter has some worth.
Irish-trained horses have won 8 of the renewals this century, when 4.8 could be expected by chance (British-trained horses have won 3 when 6.2 could be expected). More to the point, Irish-trained horses have accounted for 55.4% of their rivals on average, indicating a more robust advantage.
Other interesting "trends" for the Champion Bumper involve experience and ability. Horses with 3 or 4 runs to their names accounted for 51.4% of rivals on average (6 wins compared to 3.8 expected by chance), and such horses with one defeat from those 3 or 4 runs accounted for 58.0% of rivals on average (3 wins compared to 1.2 expected).
Even more impressively, horses with a Timeform rating of 117 or higher accounted for 65.7% of rivals on average (4 wins compared to 1.4 expected).
So, if you are intent on a profile to look for in the Champion Bumper at Cheltenham, it may as well be "an experienced Irish-trained horse that has achieved a high level of form, with a sole defeat being acceptable". It should matter little whether the horse is a five-year-old or six-year-old, and even four-year-olds can be considered.
The Willie Mullins-trained Sizing Tennessee fits the bill, is reported a likely runner and is, for what it is worth, carrying some of my ante-post cash.
But an even more valuable lesson to learn is that "stats" which mention wins without any reference to expectation are essentially meaningless. Any implication otherwise represents a myth that is just begging to be busted.