League stability analysis: evaluating how shot ability changes after promotion

It’s not a surprise to most of people, but the Dutch Eredivisie is one of the most interesting leagues in Europe for different reasons. It’s a league where you can develop talent, but it’s also a league that often acts as a feeder league to the top 5 European leagues. This often constitutes the big players and the big clubs, but today I want to talk about something entirely different from that. Let’s talk about the bottom of the table.

I am an expert in the Dutch professional tiers — yes, I can say that at this point — and I have always been fascinated with how attackers perform. Not because of the number of goals, but because of how attackers perform in the lower brackets of the league. I love to explore how attackers with clubs at the bottom of the league perform and how they might continue that after relegation. The same is true for attackers in top teams in the second tier and how they will evolve — or not — in the Eredivisie after they are promoted.

In this research, I will attempt to look at the following question:

What happens to the shot ability and quality of attackers when they move leagues after promotion?

I can’t lie, I loved looking into this and finally bringing some different level of data analysis to two of my favourite leagues. It might be a bit far-fetched at times or not extremely relevant on the surface, but knowing the playing style of leagues and clubs will benefit the overall knowledge of the football culture in the Netherlands.

Table of contents

  1. Data collection and sources
  2. Methodology
  3. Analysis — shot locations
  4. Part of the total shooting
  5. Percentile rank radars — attackers
  6. Striking score
    6.1 Eerste Divisie
    6.2 Eredivisie
  7. Final thoughts

1. Data collection and sources

The data used in this research comes from two data providers: Opta and Wyscout. Two different data approaches will be included in the research and the analysis:

  • Raw event data which we will use for the raw plotting data, creating new metrics, input for our xG model and much more.
  • Match level data is data that has been collected and provided into metrics by Opta themselves.
  • Match level data from Wyscout

With the data, we can do exactly what we need for the research. There will be an expected goals module that will give us xG values, but that’s all based on Opta data and will be explained more in the methodology part of this research.

The data was collected on Friday 12th of July 2024 and consists of the event data from Opta from the 2023–2024 Eredivisie season. The match level data comes from Wyscout, as we don’t have Opta coverage for the Eerste Divisie _ i know, it’s sad.

I will pick two different players as a means of experiment. I want to see how the attackers did in the 2022–2023 season in the Eerste Divisie and how they after promotion, they did as attackers in the 2023–2024 Eredivisie. Those players are Lennart Thy from PEC Zwolle and Emil Hansson from Heracles Almelo.

Corner Threat Score: Measuring the threat a player generates from taking inswinging and…

In football, there are many things to look at from a tactical or coaching perspective and a data perspective. One of…

marclamberts.medium.com

2. Methodology

With all that data you need to be able to make a library or a database so it’s ready to use. A vehicle to load in the data, clean it and manipulate it so it’s ready for the research to use. These are what I have used to make sure the research could have been done with the data sources available:

  • Microsoft Excel: this is where all the data that was stored in a .xlsx or .csv extension was saved in. So when I opened it to look through the data and load it in my programs, I had to use Excel
  • Python: this is used to load the data, manipulate it and make desirable visualisations. I use it to draw shotmaps, heatmaps and radars of data. It can handle data as well as visualise it — it’s what I use the most for this analysis
  • R: my xG model is made from a database of 400.000 shots in the Dutch Eredivisie and will convert the event data of shots into shots with xG values. Expected points are also collected from this model, but it’s mostly concentrated on the xG.

What’s most important is that I have the raw data and I want to visualise different possibilities and outcomes connected to shooting data. To make sure I can analyse and visualise it, I need the aforementioned tools that will help me do so.

3. Analysis: Shot locations

Shot locations will be used to explore where a player conducts his shots from. In the shot maps below you will see the maps of the players in the Eerste Divisie and the Eredivisie next to each other.

In the maps above you clearly see a difference in goals and shots, which can be expected after moving leagues. What I do think is interesting is that the average meters from goal per shot has changed from 10,95 meters to 11,33 meters — which can be an indication of shooting earlier. Having said that, however, the average xG per shot has remained even. This can mean that Thy hasn’t had that much difference in style or shooting between those different leagues.

In the maps from Emil Hansson you see a difference altogether. He has significantly fewer shots than in the 22/23 season. But, he is closer to goal and has a better xG per shot than before. This can be an indication that he tries to shoot from locations that give him a better option of scoring.

4. Part of total shooting

We have seen the shot locations in the maps above, but that only tells us a part of the story. We need to see if the effect of promotion also has the effect on the attacker’s importance on the total. You would say that in a higher league, you are resorted to more defending than before and that the role of said attacker is even greater. Can we see this?

In the pie charts above we can how big the share is of Lennart Thy in the total goals scored for PEC Zwolle in the 2022–2023 Eerste Divisie and the 2023–2024 Eredivisie season. We expected that the share would be higher and it, albeit significantly smaller than we thought before. Thy has gone from 23,2% of goals scored to 24,3% goals scored in the next season, which is a growth of +1,1%.

In the pie charts above we can how big the share is of Emil Hansson in the total goals scored for Heracles Almelo in the 2022–2023 Eerste Divisie and the 2023–2024 Eredivisie season. We expected that the share would be higher and it is. Hansson has gone from 15,5% of goals scored to 17,2% of goals scored in the next season, which is a growth of +1,7%.

When the teams are in the higher leagues they resort more to the defensive side of play and the emphasis on scoring goals is trusted with fewer players than in the season where they achieved promotion.

5. Percentile rank radars

In the percentile radars above you can see two radars that look quite identical in some regards and that give me the idea that the player does very well in both leagues. However, there are some little differences: in the shooting metrics Thy really does well, but has become less impactful in aerial wins % and progressive passes. We can make the conclusion that the focus on shooting has become more important and playmaking less important.

While Hansson has become more important for the team to lean on, his quality and being in the top percentiles, something has significantly changed with promotion to the Eredivisie. Especially in the shooting metrics Hansson has gone from the high 80s and 90s to the high 60s and 70s. The differences in leagues and their quality has had an effect on the performances of Hansson.

6. Striking score — CF

From percentile ranks we move on to z-scores. One of the key benefits of z-scores is that they allow for the standardization of data by transforming it into a common scale. This is particularly useful when dealing with variables that have different units of measurement or varying distributions.

By calculating the z-score of a data point, we can determine how many standard deviations it is away from the mean of the distribution. This standardized value provides a meaningful measure of how extreme or unusual a particular data point is within the context of the dataset.

Z-scores are also helpful in identifying outliers. Data points with z-scores that fall beyond a certain threshold, typically considered as 2 or 3 standard deviations from the mean, can be flagged as potential outliers. This allows analysts to identify and investigate observations that deviate significantly from the rest of the data, which may be due to measurement errors, data entry mistakes, or other factors.

With Z-Scores, we calculate the score 50, differently from the 50th percentile. While percentiles look at the middle value or median, Z-scores look at the average or mean of the dataset.

Ranking players: Percentile ranks, Z-Scores and Similarities

There has been a huge shift in the use of data in football in the last few years. It would be foolish for me to claim…

marclamberts.medium.com

By giving weights to z-scores we can calculate a role for attackers and in doing we can see how well they are within a league. This makes for an interesting concept to see if the fit changes with a promotion to a higher league.

In this example, I will only focus on the striker position. For Lennart Thy, we can conclude that he has a goalscoring striker score of 88,14 out of 100 for the Eerste Divisie 2022–2023. A season later in the Eredivisie his score is 85,25. It seems that he fits better as the goalscoring striker in that specific Eerste Divisie season than he did for the Eredivisie season. This is a logical conclusion as it’s harder to perform in the higher league within that specific role than it is in a tier below.

7. Final thoughts

What I wanted to achieve with this little research was to look into the effect and importance of a promotion. Roles change, dependences change and also the impact of players change. In this first part of the research we have seen how players will grow in their percentage of goals, but might have a more restricted role in relation to what they had previously. Attacking players need to be there primarily for scoring goals in a division higher because you can’t afford to not take those chances.

The next article will focus less on how stable a player engages in attacking actions but will pose to create a metric that measures what promotion does to a player’s performances. To make an adjusted score to see how well a player is doing when coming from a tier lower.

  • Using AutoRegressive Integrated Moving Average (ARIMA) to predict future shot locations for Liverpool in Premier League
  • Progressive Long Pass Score: giving meaning to a long pass from the start location
  • Throw-in success: generating shots through emphasis on throw-in routines
  • Actionable analysis: Individual Header Rating (IHR) determines choices in blockers vs runners
  • The complexity of outliers in data scouting in football
  • Four things to pay attention to when you start analysing corners