Investment banks use AI algorithms to make World Cup predictions

Banks are putting their predictive analytics tools to use, applying machine learning to take a shot at calculating which nation will be crowned winner of the 2018 World Cup.

Key Takeaway

AI can also be all fun and games when a banker acts as a bookmaker. Analysts have long used external and alternative data to inform decisions ranging from deals and investments to fraud mitigation and credit reporting. More recently, predicting the football World Cup winner has become a data-driven quadrennial exercise around the world. Today, investment banks are flexing their external data analysis muscles and betting on tournament predictions based on artificial intelligence, statistical modeling, portfolio theory and economic analysis, using alternative data.


Every World Cup, analysts around the world attempt to use the latest data analysis tools to predict who will come out victorious, often using statistical modeling and insights gleaned from external data sources to tip the odds in their favor. With all the progress in data analysis over the last four years, have we finally gotten closer to accurate predictions?

Goldman Sachs: England vs. France/Brazil

Predictions by Goldman Sachs suggest that England will make it to the final of the tournament against Brazil or France, following the team’s recent win against Panama. These conclusions are based on machine learning algorithms the team created, which determine over time which variables have the biggest impact on outcomes.

A team of strategists from Goldman Sachs wrote, “We are drawn to machine learning models because they can sift through a large number of possible explanatory variables to produce more accurate forecasts than conventional alternatives.”

The bank used machine learning algorithms to run over 200,000 models, analysing data on individual teams and player attributes, in order to forecast potential scores in specific matches.

Here’s how their model made a prediction:

“We feed data on team characteristics, individual players and recent team performance into four different types of machine learning models to analyse the number of goals scored in each match. The models then learn the relationship between these characteristics and goals scored, using the scores of competitive World Cup and European Cup matches since 2005. By cycling through alternative combinations of variables, we get a sense of which characteristics matter for success and which stay on the bench. We then use the model to predict the number of goals scored in each possible encounter of the tournament and use the unrounded score to determine the winner.”
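The last step of that description, predicting an unrounded goal count for each side and letting the higher figure decide the winner, can be sketched in a few lines. This is only an illustration: the `Team` fields, the `WEIGHTS` values and the linear formula are hypothetical stand-ins, not Goldman Sachs's actual features or models, which were learned from match data.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    rating: float        # hypothetical overall team strength score
    recent_goals: float  # hypothetical average goals per recent match


# Illustrative weights only; a real model would learn these from past matches.
WEIGHTS = {"bias": 0.2, "own_rating": 0.9, "opp_rating": -0.6, "recent_goals": 0.5}


def predict_goals(team: Team, opponent: Team) -> float:
    """Return an unrounded expected goal count for `team` against `opponent`."""
    score = (
        WEIGHTS["bias"]
        + WEIGHTS["own_rating"] * team.rating
        + WEIGHTS["opp_rating"] * opponent.rating
        + WEIGHTS["recent_goals"] * team.recent_goals
    )
    return max(score, 0.0)  # a goal count cannot be negative


def pick_winner(a: Team, b: Team) -> str:
    """Compare the unrounded predictions; the higher one wins (ties go to `a`)."""
    return a.name if predict_goals(a, b) >= predict_goals(b, a) else b.name
```

Comparing unrounded values matters because two predictions that both round to one goal, say 1.4 and 0.7, would otherwise look like a draw.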

So far, the models have correctly predicted that England and Belgium would win their recent games, with Belgium topping Group G and England finishing as runner-up. They also predicted that Germany would finish second in its group, placing it in a round of 16 game against Brazil.

The model also accurately predicted that England would play Colombia (winners of Group H) in the round of 16, where it predicts an England win. Good news for football lovers in the UK.

“This tips the outcome in England’s favour, as they are projected to defeat Mexico, before overcoming Spain — just — in the semi-finals,” the report said.

If the model predicts correctly, this series of events will result in a Brazil vs. England final on July 15, with the model strongly favoring Brazil to win. However, the report also suggests that France may have a great chance of winning overall, followed by Brazil, Belgium and finally England.

ING: Spain

The Dutch bank used models that judge team success based on each team's market value and past performance, on the primary assumption that value and past success are correlated. Team worth was calculated from estimates of individual players' transfer values, and track record was taken from the FIFA world rankings.

This analysis resulted in Spain as the ultimate winner, with a team valued at €1.04 billion ($1.16 billion), and France coming in second and valued at €1.03 billion.
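The valuation side of this approach amounts to summing player transfer values per squad and ranking the totals. The sketch below shows only that step, with made-up squads and numbers; the function names are hypothetical, and how ING combined value with FIFA rankings is not described here, so that part is left out.

```python
def team_value(player_values):
    """Total squad value: the sum of individual transfer value estimates."""
    return sum(player_values)


def rank_by_value(squads):
    """Return team names ordered from most to least valuable squad."""
    return sorted(squads, key=lambda name: team_value(squads[name]), reverse=True)


# Hypothetical transfer values in millions of euros, for illustration only.
squads = {
    "Team A": [50, 30, 20],
    "Team B": [60, 45],
    "Team C": [10, 10],
}

print(rank_by_value(squads))
```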

Nomura: France vs. Spain

The Japanese investment bank used models similar to those used in investment risk analysis to make its World Cup predictions.

The team at Nomura says, “Being analysts, we have to apply some rigor to our World Cup predictions, so we’ve decided to apply portfolio theory and the efficient markets hypothesis to the World Cup. We look at the value of players in each team, the momentum of team performance and historical performance to arrive at three portfolios of teams to watch.”

The results in this case suggest that France, Spain and Brazil will most likely make it to the semi-finals, after which France will play Spain in the final.

New external data points can be used to inform all types of decisions, from casual exercises such as these World Cup predictions to major financial investments. In this case, for instance, analysis based on individual player data can be used to work out not only how a specific team will perform, but also the impact of swapping players in and out of the team.

Big data and machine learning have the capability to change the forecasting and investment landscape. As the quantity and accessibility of data grow, investors are able to evaluate how they can leverage data analysis to make increasingly informed decisions.

In any case involving machine-driven predictions, the data feeding the algorithms needs to be understood well in advance of any significant investments or strategic decisions. In the case of these World Cup predictions, it is important to remember that football is a fairly unpredictable game, and only stochastic predictions of the tournament can be calculated. Even with the use of state-of-the-art statistical techniques, predictions for any event involving human variables will never quite reach 100% accuracy – that’s where human judgment continues to play a big role.
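To make "stochastic prediction" concrete, here is a minimal sketch that turns two expected goal counts into win, draw and loss probabilities by simulating a match many times. It assumes goals follow independent Poisson distributions, a common simplification that is not taken from any of the banks' reports, and the function names and inputs are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson-distributed goal count using Knuth's method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_match(exp_goals_a, exp_goals_b, n_sims=10_000, seed=0):
    """Estimate outcome probabilities by repeatedly simulating one match."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    wins_a = draws = wins_b = 0
    for _ in range(n_sims):
        goals_a = sample_poisson(exp_goals_a, rng)
        goals_b = sample_poisson(exp_goals_b, rng)
        if goals_a > goals_b:
            wins_a += 1
        elif goals_a < goals_b:
            wins_b += 1
        else:
            draws += 1
    return {"a": wins_a / n_sims, "draw": draws / n_sims, "b": wins_b / n_sims}


# Hypothetical expected goals: side A is the stronger team.
probabilities = simulate_match(2.0, 0.8)
print(probabilities)
```

Even a clear favorite ends up with a win probability well below 100%, which is the point the paragraph above makes: the output is a distribution over outcomes, not a guaranteed result.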
