What has always fascinated me is the way we look at football players and the value we give them. There is a pecking order, of course, because the general football audience tends to value entertainment, and entertainment is most directly tied to the players scoring the goals (which is good) and those conceding them (which is bad). This idea has been with me for a long time and I want to look a little deeper into it.
I think I grew up with that idea too: I valued goals more than anything else, because goals eventually make the difference between winning, drawing, or losing. However, as I learned more and more about the game, I realised it’s not only about the output, but also about the process of getting there. That is what I focus on more and more, and it’s why today I want to look at creating a new metric: Pass Progression Score.
Contents
- What is Pass Progression Score?
- Why do we need it?
- Data
- Methodology
- Visualisation
- Final thoughts
What is Pass Progression Score?
Pass Progression Score is a metric that combines several passing metrics to indicate how much progression a specific player provides, by looking at his or her passes and calculating their progressive value. The specifics and methodology come later in this article.
The score shows how much a player in a specific position contributes to progression and how much of his or her total passing is progressive. This gives us an idea of how progressive that player’s passing is.
Why do we need it?
Well, needing it is quite the statement, but I think it will be interesting and help gauge whether a player is a progressive passer. In recruitment processes we have to look at data on so many players, and to make our lives easier we can create scores that capture intention. Yes, that’s right: intention. The intention of a player can give us a better idea of his or her style, and if you are looking for a player with a certain progressive passing profile, this is a metric that can really help you.
Data
The data we are using for this metric comes from Wyscout. Like I’ve said before, it’s not the best quality provider out there but it has the widest coverage. The data is from the Belgian First Division 2023–2024 and we are only using players that have played at least 900 minutes, which is the equivalent of 10 full matches. The data was collected on June 9th, 2024.
From that dataset I select only a few specific metrics, which then feed into the calculation of the newly created metric. You can see that below.
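As a rough illustration of this selection step, the sketch below filters a list of player rows on minutes played and keeps only the columns needed later. The column names (`player`, `team`, `minutes` and the `_p90` metric keys) and the sample rows are made up for the example; a real Wyscout export uses its own headers.

```python
# Minimal sketch of the data selection step.
# Column names and sample rows are hypothetical, not real Wyscout headers or data.
MIN_MINUTES = 900  # minutes threshold used in this article

KEEP_COLUMNS = [
    "player", "team", "minutes",
    "passes_to_final_third_p90",
    "passes_to_penalty_area_p90",
    "key_passes_p90",
    "through_passes_p90",
    "progressive_passes_p90",
]


def select_players(rows, min_minutes=MIN_MINUTES):
    """Keep players with enough minutes and drop every unused column."""
    return [
        {col: row[col] for col in KEEP_COLUMNS}
        for row in rows
        if row["minutes"] >= min_minutes
    ]


def make_row(player, minutes):
    """Build a made-up row that includes one column we do not need (age)."""
    return {
        "player": player, "team": "Team X", "minutes": minutes, "age": 24,
        "passes_to_final_third_p90": 6.0,
        "passes_to_penalty_area_p90": 2.0,
        "key_passes_p90": 0.5,
        "through_passes_p90": 1.0,
        "progressive_passes_p90": 7.0,
    }


sample_rows = [make_row("Player A", 2100), make_row("Player B", 450),
               make_row("Player C", 900)]
selected = select_players(sample_rows)
print([row["player"] for row in selected])
```

Player B drops out because he is below the minutes threshold, and the unused `age` column is gone from the rows that remain.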
Methodology
So how am I going to make this score? I will do this in Python, and there are four steps I need to take:
- Drop all the information I don’t need. I will keep the player name, team name, minutes played, and the metrics I use.
- The metrics I’m using are: Passes to the final third, Passes to the penalty area, Key passes, Through passes and Progressive passes. All are per 90 minutes and not totals.
- I will weigh the different metrics by how much they contribute to progression: Passes to the final third (1), Passes to the penalty area (2), Key passes (1), Through passes (1), and Progressive passes (3). The key aspect here is that progression is more valuable to me when it comes closer to the opposition’s goal.
- I will convert them into z-scores, which makes it easier to create a weighted total score.
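The steps above can be sketched in plain Python as follows. The weights are the ones listed in this article; everything else is an assumption made for illustration: the metric key names, the three made-up players, the use of the population standard deviation, and the choice to combine the weighted z-scores with a simple weighted average, which is a simplification and not necessarily how the final score is combined.

```python
from statistics import mean, pstdev

# Weights from the article; the metric key names are hypothetical.
WEIGHTS = {
    "passes_to_final_third_p90": 1,
    "passes_to_penalty_area_p90": 2,
    "key_passes_p90": 1,
    "through_passes_p90": 1,
    "progressive_passes_p90": 3,
}


def z_scores(values):
    """Convert raw values to z-scores (population standard deviation assumed)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every player has the same value, so nobody deviates from the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def weighted_z_totals(players):
    """Weighted average of per-metric z-scores for each player (a simplification)."""
    totals = [0.0] * len(players)
    for metric, weight in WEIGHTS.items():
        zs = z_scores([p[metric] for p in players])
        for i, z in enumerate(zs):
            totals[i] += weight * z
    weight_sum = sum(WEIGHTS.values())
    return {p["player"]: t / weight_sum for p, t in zip(players, totals)}


# Made-up per-90 numbers, only here to show the mechanics.
players = [
    {"player": "Player A", "passes_to_final_third_p90": 8, "passes_to_penalty_area_p90": 3,
     "key_passes_p90": 1.0, "through_passes_p90": 1.5, "progressive_passes_p90": 10},
    {"player": "Player B", "passes_to_final_third_p90": 6, "passes_to_penalty_area_p90": 2,
     "key_passes_p90": 0.5, "through_passes_p90": 1.0, "progressive_passes_p90": 7},
    {"player": "Player C", "passes_to_final_third_p90": 4, "passes_to_penalty_area_p90": 1,
     "key_passes_p90": 0.0, "through_passes_p90": 0.5, "progressive_passes_p90": 4},
]

scores = weighted_z_totals(players)
for name, score in scores.items():
    print(name, round(score, 3))
```

Because z-scores are centred on the mean, a player who is average on every metric ends up at 0, and the weights only change how much each metric pulls the total up or down.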
To create a score that goes from 0–1 or 0–100, I have to make sure all the variables are on the same scale. I was looking for ways to do that and figured standard deviations would be best. Often we think about percentile ranks, but those only keep the order of players and throw away how far apart they actually are, which isn’t what we are looking for here.
I’ve taken z-scores because I think seeing how far a player sits from the mean, measured in standard deviations, tells us more about the quality of said player than a raw value does, and it puts every metric on the same numerical scale so we can calculate our score later on.
Z-scores vs other scores. Source: Wikipedia
The mean sits at 0: negative deviations are players who score below the mean and positive deviations are players who score above it. The latter are the players we are going to focus on when looking for quality. By calculating the z-scores for every metric, we have a solid base to calculate our score via means.
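For reference, this is the standard z-score definition written out, with a worked example. The symbols are my own notation for this sketch, and the numbers in the example are made up, not taken from the dataset.

```latex
% x_{i,m}: value of player i on metric m (per 90)
% \mu_m, \sigma_m: mean and standard deviation of metric m across all selected players
z_{i,m} = \frac{x_{i,m} - \mu_m}{\sigma_m}

% Made-up example: league mean of 7 progressive passes per 90, standard deviation of 2,
% and a player with 10 progressive passes per 90:
z = \frac{10 - 7}{2} = 1.5
```

A z-score of 1.5 means the player sits one and a half standard deviations above the mean for that metric; a player exactly on the mean gets 0.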
We talk about harmonic, arithmetic, and geometric means when looking to create a score, but what are they?
The difference between Arithmetic mean, Geometric mean and Harmonic Mean
As Ben describes, harmonic and arithmetic means are good ways of calculating an average for the metrics I’m using, but in my case I want something slightly different. The reason is that I want to weigh my metrics, as I think some are more important than others for progression.
So there are two options for me: either I use filters and choose the harmonic mean, as that’s the best way to do it, or I alter my complete calculation to find the mean. I am going with the harmonic mean.
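To make the difference between the three means concrete, here is a small comparison using Python’s standard `statistics` module, plus a hand-written weighted harmonic mean. The input values and the helper `weighted_harmonic_mean` are made up for illustration. One caveat worth stating: in their usual form the geometric and harmonic means are only defined for positive numbers, so z-scores (which can be zero or negative) would have to be rescaled to a positive range first. How that is handled is not covered here, so the example simply uses positive values.

```python
from statistics import mean, geometric_mean, harmonic_mean


def weighted_harmonic_mean(values, weights):
    """Weighted harmonic mean: sum(w) / sum(w / x). Needs strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs strictly positive values")
    return sum(weights) / sum(w / v for v, w in zip(values, weights))


# Made-up positive values and weights, only to show how the means differ.
values = [1.0, 2.0, 4.0]
weights = [1, 2, 3]

arithmetic = mean(values)            # (1 + 2 + 4) / 3
geometric = geometric_mean(values)   # cube root of 1 * 2 * 4
harmonic = harmonic_mean(values)     # 3 / (1/1 + 1/2 + 1/4)
weighted_harmonic = weighted_harmonic_mean(values, weights)

print(arithmetic, geometric, harmonic, weighted_harmonic)
```

For positive inputs the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean, so the harmonic mean is the one that punishes a single low value the hardest.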
Visualisation
By running the code and calculation in Python, I get a list. Now, that’s just a very boring-looking list, so I’m turning it into a visualisation. In the image below you can see the 10 best midfielders by Pass Progression Score (PPS) with at least 900 minutes played.
In the table above I have ranked the top 10 midfielders according to this new metric. They have a score from 0–100 and in that way we can see how well they are doing in this metric.
One thing we can conclude from this table is that Tresor scores significantly higher than the others on this list, meaning that he sits far above the mean and shows excellent intent in terms of progressive passing.
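How the combined score is turned into a 0–100 value is not spelled out in this article, so the sketch below uses min-max scaling as one possible assumption and then sorts the result to get a top 10. The function names and the sample scores are hypothetical.

```python
def to_0_100(scores):
    """Min-max scale a {player: score} mapping to the 0-100 range (an assumption)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All players identical: put everyone in the middle of the range.
        return {name: 50.0 for name in scores}
    return {name: 100 * (s - lo) / (hi - lo) for name, s in scores.items()}


def top_n(scores, n=10):
    """Return the n highest-scoring (player, score) pairs, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up combined scores for three players.
raw_scores = {"Player A": 1.2, "Player B": 0.0, "Player C": -1.2}

scaled = to_0_100(raw_scores)
ranking = top_n(scaled, n=10)
for name, score in ranking:
    print(f"{name}: {score:.1f}")
```

With min-max scaling the best player in the sample always gets 100 and the worst gets 0, so the scores are relative to the group being compared and not an absolute measure.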
Final thoughts
I like to play around with metrics and see how they can aid me in the process of recruitment, especially in the phase where I use data quite heavily.
Progression can be measured in different ways, and that’s also why I think there is still work to be done on this metric. If you connect OBV, xT or xPass to these metrics, we can delve even further. In combination with the value of event data, the 2.0 version of PPS will be even more meaningful.