“Data tells the facts and is therefore always the truth.” I’ve lost count of how often I’ve heard this. I find it a strange way of looking at data: data is a construct, so if anything it is subjective and prone to bias. One example is how you handle the generic match-level data supplied by data providers.
Wyscout provides data for a wide range of leagues, and in general I think it’s great to have that coverage of leagues all over the world. What I have noticed, though, is that these leagues are often grouped by tier. For example, you have Serie C in Italy (3rd tier), National League North/South in England (5th tier) and Regionalliga in Germany (4th tier). Each of these is covered within a single dataset, even though it consists of several separate leagues. That’s why I’m going to look a little deeper into the data for the Regionalliga.
The idea is to see how shooting differs between the subdivisions of this tier. Not every league has the same style of play, and that is reflected in the data. If you don’t make the distinction, the pooled data is effectively skewed.
The leagues
There are 5 different leagues within the 4th tier we call the Regionalliga:
- Regionalliga Nord
- Regionalliga Nordost
- Regionalliga West
- Regionalliga Südwest
- Regionalliga Bayern
The first thing to clarify is that we need access to all the data for this to work, and that the conclusion of this article needs to be as concise as possible. Here we hit the first problem: only four of the five leagues are fully covered in the data. Regionalliga Nordost has data for just two teams, so we have to exclude it from this analysis.
That still leaves us with approximately 2000 players across four leagues, whose numbers together describe the style of each division.
Method
This section describes how I will get to the results. The aim is to see where the leagues are similar and where they differ. I want to look at two things:
- The volume of shots per league, expressed as shots per 90.
- Expected goals (xG) per 90, and how it differs between the leagues.
I will use the Wyscout data to analyse these metrics and then visualise them.
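To make the two metrics concrete, here is a minimal sketch of how league-level per-90 values could be computed from player rows. The row layout (league, minutes, shots, xG), the function name `per90_by_league` and all numbers are assumptions for illustration; they are not the Wyscout export format or real data, and pooling totals per league is only one possible way to aggregate.

```python
from collections import defaultdict

# Hypothetical player rows: (league, minutes played, shots, xG).
# All values are invented for illustration.
players = [
    ("Nord", 900, 30, 3.6),
    ("Nord", 1800, 40, 5.4),
    ("Südwest", 900, 15, 1.8),
    ("Südwest", 1800, 30, 3.6),
]

def per90_by_league(rows):
    """Pool minutes, shots and xG per league, then scale the totals to 90 minutes."""
    totals = defaultdict(lambda: [0, 0, 0.0])  # minutes, shots, xG
    for league, minutes, shots, xg in rows:
        league_totals = totals[league]
        league_totals[0] += minutes
        league_totals[1] += shots
        league_totals[2] += xg
    return {
        league: {
            "shots_per90": shots / minutes * 90,
            "xg_per90": xg / minutes * 90,
        }
        for league, (minutes, shots, xg) in totals.items()
        if minutes > 0  # skip leagues without recorded minutes
    }

league_stats = per90_by_league(players)
for league, stats in league_stats.items():
    print(league, round(stats["shots_per90"], 2), round(stats["xg_per90"], 2))
```

Pooling the totals before dividing weights every minute equally, so players with very few minutes do not distort the league figure the way an unweighted mean of individual per-90 values would.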
Shots
The bar graph above shows shots per 90 for each of the four Regionalligas we are looking at.
As we can see, the volume of shots per 90 is highest in Regionalliga Nord and lowest in Regionalliga Südwest. West and Nord are above the average, while Bayern and Südwest are below it.
It’s too early to draw a conclusion, but shot volume is higher in Nord and West, which could indicate that teams in those leagues look to shoot more.
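The above/below-average split can be reproduced in a few lines. This is only a sketch: the values below are invented so that the ordering mirrors the description, and I assume "the average" means the simple mean of the four league values.

```python
from statistics import mean

# Hypothetical shots-per-90 values per league (invented numbers;
# only the ordering follows the text).
shots_per90 = {"Nord": 2.4, "West": 2.2, "Bayern": 1.9, "Südwest": 1.7}

# Assumption: "the average" is the unweighted mean of the four league values.
average = mean(shots_per90.values())

position = {
    league: "above" if value > average else "below"
    for league, value in shots_per90.items()
}
print(position)
```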
Expected goals
The bar graph above shows xG per 90 for each of the four Regionalligas we are looking at.
It gives us the same picture, but Regionalliga Nord stands clearly apart from West and Bayern. Südwest is the outlier at the other end, and we can draw a simple conclusion: there are fewer shots and, as a consequence, a lower xG per 90 in that league.
Conclusion
I think it’s very hard to draw definitive conclusions from just a few data points, but it should make you aware that not all regional leagues are the same.
But if we want to use these metrics to assess forwards, there are two quite interesting conclusions to draw:
- A high number of shots and a high xG in Regionalliga Südwest means more than the same numbers in Regionalliga Nord.
- A low number of shots in Regionalliga Südwest is less damning than the same number in Regionalliga Nord.
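One way to put these two points into practice is to express a forward’s numbers relative to his own league’s average. The sketch below assumes hypothetical league averages and a hypothetical helper `league_relative`; the numbers are invented and only illustrate the idea.

```python
# Hypothetical league averages for shots per 90 (invented numbers,
# not taken from the data behind this article).
league_avg_shots_per90 = {"Nord": 2.0, "Südwest": 1.5}

def league_relative(value_per90, league, league_avgs):
    """Return a per-90 value as a multiple of the player's league average."""
    return value_per90 / league_avgs[league]

# Two forwards with the same raw output of 3.0 shots per 90.
nord_forward = league_relative(3.0, "Nord", league_avg_shots_per90)
suedwest_forward = league_relative(3.0, "Südwest", league_avg_shots_per90)

print(nord_forward, suedwest_forward)
```

With these made-up averages, the same raw shot volume is worth a larger multiple of the league average in Südwest than in Nord, which is exactly the adjustment the two bullet points describe.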
We can of course go deeper into this, but it’s important to link individual clubs to the league they play in. If we want to gauge whether a player could be a match for the 3. Liga, for example, it’s important to look at the league that player is currently in. Not all 4th tiers are the same.