AI Models Struggle to Beat Bookies Over a Premier League Season
Eight advanced AI models lost money betting on a simulated English Premier League season.

Key Facts:
- Eight advanced AI models failed to beat the bookies betting on 2023-24 Premier League results.
- The best performing AI system lost 11% of its bankroll over the season.
- Two systems went bankrupt before the season ended.
Artificial intelligence may be changing the world, but a recent study suggests the technology is far from ready to beat the bookies. In fact, following the advice of some AI systems from leading technology companies could be a fast track to financial ruin.
That is the summary of findings from a recent study by London-based start-up General Reasoning. It simulated a full English Premier League season (2023-24) and asked eight advanced AI models – including systems developed by Google, OpenAI and Anthropic – to make bet selections using historical data and evolving match information.
Each model was given a virtual bankroll and instructed to maximise returns while managing risk over time. Despite access to extensive datasets – including past results, team statistics and betting odds – none of the systems ended the season in profit. In fact, several went bankrupt during the simulation.
AI Digs Two £100,000 Holes
The AI models had been fed every piece of information that quantitative bettors normally rely on: historical match results, player and team statistics, plus additional factors such as weather, travel distance and even stadium altitude.
The AI models had two markets and five potential match outcomes to choose from: home win, away win, or a draw, plus an option to bet over or under 2.5 match goals. Odds were primarily taken from a top UK online bookmaker.
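The study measured models against bookmaker-implied probabilities. Those probabilities can be recovered from decimal odds by inverting each price and normalising away the bookmaker's margin (the overround). The sketch below is illustrative only — the odds figures are made up, and this is not the study's code:

```python
def implied_probabilities(decimal_odds):
    """Convert a market's decimal odds to implied probabilities.

    Inverting each price gives raw probabilities that sum to more
    than 1 (the bookmaker's overround); dividing by that sum
    normalises them back to a proper probability distribution.
    """
    raw = [1.0 / o for o in decimal_odds]
    overround = sum(raw)
    return [p / overround for p in raw]

# Hypothetical 1X2 market: home 2.10, draw 3.40, away 3.60
probs = implied_probabilities([2.10, 3.40, 3.60])
```

A model only has an edge on an outcome when its own probability estimate exceeds the implied probability after the overround is stripped out.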
However, despite access to extensive datasets, all of the trialled systems ended with a negative balance. Two went broke before the end of the season, losing the theoretical £100,000 they started the challenge with.
Claude Opus 4.6 'Wins the League' With an 11% Loss
The strongest-performing model was Anthropic's Claude Opus 4.6. While it was never in positive territory over the nine-month season, its closing balance was £89,035, and at no point did its bankroll drop more than 15% below its starting point.
Similarly, OpenAI's GPT-5.4 was never in profit after the start of October (approximately seven weeks into the season). However, its bankroll never dipped below £80,000 and it ultimately finished a close second, ending the season with £86,365.
Google Gemini Flash 3.1 LP got the season off to a flier with its bankroll surpassing £250,000 within the first three weeks. However, by the start of October, it entered negative territory and never recovered, eventually finishing with £41,605.
It was Arcee Trinity that suffered the most spectacular loss. After dropping almost £60,000 on its week-one bets in mid-August, it had blown through its £100,000 by the third week of September. The second model to suffer a complete loss was Grok 4.20; its decline was more gradual, with its money gone a month shy of the season's end.
Fault Lies With AI’s Failure to Execute Strategies
General Reasoning’s study, titled ‘KellyBench: Can Language Models Beat the Market?’, reports that the two strongest models, Opus 4.6 and GPT-5.4, share several traits.
Both models retrained or adjusted their strategies in response to new match data, deployed systematic staking rules, and preserved capital during periods when their strategies identified no edge.
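The benchmark's name nods to the Kelly criterion, the classic systematic staking rule that sizes bets in proportion to a bettor's estimated edge. The study does not publish its staking code; the following is a minimal sketch of fractional Kelly staking under assumed inputs, not the benchmark's implementation:

```python
def kelly_fraction(p, decimal_odds, fraction=0.5):
    """Fraction of bankroll to stake under (fractional) Kelly.

    p            -- the bettor's estimated probability of winning
    decimal_odds -- the bookmaker's decimal price on that outcome
    fraction     -- scaling below full Kelly (0.5 = 'half Kelly'),
                    commonly used to reduce bankroll volatility
    Returns 0.0 when the estimated edge is negative, i.e. the
    capital-preservation behaviour the study credits the top models with.
    """
    b = decimal_odds - 1.0                 # net odds received on a win
    f = (p * b - (1.0 - p)) / b            # full Kelly fraction
    return max(0.0, f * fraction)

# Hypothetical example: model estimates 55% on a 2.10 price,
# staking half Kelly from a £100,000 bankroll
stake = kelly_fraction(0.55, 2.10) * 100_000
```

Full Kelly maximises long-run bankroll growth in theory, but is famously volatile with noisy probability estimates, which is why practical bettors usually scale it down.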
In its conclusion, the study states: “The benchmark exposes failures not only in machine learning modelling, where models struggle to outperform bookmaker-implied probabilities, but more fundamentally in the closed-loop reasoning required for long-horizon sequential decision-making.”
“Models can write sophisticated code, diagnose their own failures, and articulate correct strategies, yet persistently fail to execute those strategies reliably, monitor their own performance, or adapt when their approach is not working.”
“As well as looking at performance, we judged strategy sophistication for each model and found existing models to have unsophisticated strategies compared to human approaches.”
“In particular, rich player-level data available in the environment was almost universally ignored in favour of simpler team-level features, suggesting that current models systematically underinvest in data and feature engineering when operating autonomously.”