Corner Threat Score: Measuring the threat a player generates from taking inswinging and outswinging corners

In football, there are many things to look at from a tactical or coaching perspective and from a data perspective. One of the things I love looking at is set pieces, corners in particular. There are not many corner-specific metrics out there, and the available ones mostly focus on the result of the corner in terms of expected goals (xG). In this article, I will explain the new data metric that I have created: Corner Threat Score (CTS), which is divided into Corner Threat Score Inswingers (CTSI) and Corner Threat Score Outswingers (CTSO).

What is CTS?

The Corner Threat Score is a metric that I’ve created to measure the threat of the delivery by the player taking the corner. It measures the threat of the cross/pass, the zone it arrives in, the possibility of direct contact resulting in a goal and the swing of the corner being taken.

The metric is a score which measures the threat on a scale from 0 to 100: the higher the number, the more danger you create in relation to the other corner takers in the specific database.

CTS is a combination of different data metrics and metrics that have to be created from event data to capture the threat that can be attributed to a delivery from a corner.

Why do we need CTS?

As mentioned above, I felt that we lacked a data metric that properly measures the threat a delivery brings to a corner. We mostly talk about the shots or defensive actions that are the result of the delivery, but not so often about the quality or threat of the delivery itself. So to analyse and reflect on the threat a corner taker brings to the table, I’ve created this metric.

This metric will help in making corner analysis more detailed and in showing how to tweak corners to get the most out of this particular set piece. In the professional elite game, the margins are very small and every little tweak can help in gaining an advantage.

Data provider

All of the data used to create the metric comes from Opta. It’s not from FBRef, although that is a very good source. Specifically, I use Opta’s event data (x-y data), which is the starting point for everything in this metric.

This is what the event data looks like. From this database, I will calculate the passes, the corners, the xG and the xT. I will explain why and how, below, but these metrics don’t exist at first and all need to be generated or created.

I think in general it’s good to know that yes, there are many ready-to-use metrics from sources like Wyscout, FBRef, Statsbomb and Opta, but having access to event data gives you a lot of freedom to make your own things. And that’s exactly what I have been doing.

Data explainer

So, we have the provider we need. The next step is to look at the data and select the metrics we want to use. We want to look at what danger the delivery can give us and there are a few metrics that I want to generate for this analysis:

The corner: in the data, you can filter for corners taken in the games you are looking at. There is a distinction between outswinging corners and inswinging corners.
Expected threat: what’s the expected threat of the cross/pass taken from the corner by the individual player?
The basic idea behind xT is to divide the pitch into a grid, with each cell assigned a probability of an action initiated there to result in a goal in the next N actions. This approach allows us to value not only parts of the pitch from which scoring directly is more likely, but also those from which an assist is most likely to happen. Actions that move the ball, such as passes and dribbles (also referred to as ball carries), can then be valued based solely on their start and end points, by taking the difference in xT between the start and end cell. Basically, this term tells us which option a player is most likely to choose when in a certain cell, and how valuable those options are. The latter term is the one that allows xT to credit valuable passes that enable further actions such as key passes and shots. (Soccerment)
Expected goals: measures the quality of a chance by calculating the likelihood that it will be scored from a particular position on the pitch during a particular phase of play. This value is based on several factors from before the shot was taken.
Zonal danger: measures the zonal danger of the endpoint of the corner. This only happens if there is no direct contact from the corner.
First and second phase shots: these are shots coming from corners, with a distinction between shots that are the result of direct contact and shots that are the result of a second action.

Methodology

So how do we go from raw positional data to a score? There are four important steps, and each needs careful thought, otherwise the score won’t capture what we are trying to measure.

The first step is to generate what we need from the event data to make metrics. First of all, we need to make sure that the passes we are looking at are only corners. From that, we essentially produce three different types of corner:

Inswinging corners
Outswinging corners
Straight corners
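
As a rough sketch of this filtering step, the snippet below keeps only passes that are corners and groups them by swing. The event records and the field names (`type`, `is_corner`, `swing`) are hypothetical simplifications for illustration; the real Opta event data has its own structure and codes, which are not reproduced here.

```python
from collections import defaultdict

# Hypothetical, simplified event records. Field names are assumptions made
# for this example and do not mirror the actual Opta event format.
events = [
    {"player": "A", "type": "pass", "is_corner": True, "swing": "inswinging"},
    {"player": "A", "type": "pass", "is_corner": True, "swing": "outswinging"},
    {"player": "B", "type": "pass", "is_corner": False, "swing": None},
    {"player": "B", "type": "pass", "is_corner": True, "swing": "straight"},
    {"player": "A", "type": "shot", "is_corner": False, "swing": None},
]


def split_corners(events):
    """Keep only passes taken as corners and group them by swing type."""
    groups = defaultdict(list)
    for event in events:
        if event["type"] == "pass" and event["is_corner"]:
            groups[event["swing"]].append(event)
    return dict(groups)


corners = split_corners(events)
for swing, items in corners.items():
    print(swing, len(items))
```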

After we have done that we have the basics ready because we have the corners. Next, we have to look at the expected threat. We calculate the expected threat from the passes and use a grid to determine the threat.

We use the beginning location of the corner and the end location. We calculate if the endpoint is a positive or negative value based on the grids and the starting point. For example, from a corner to the half-space outside the penalty area is a negative xT, but from a corner to the six-yard box is a positive xT.
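
To make the start-to-end comparison concrete, here is a minimal sketch of an xT difference on a toy grid. The grid values, the 5x4 grid size and the 0-100 coordinate system are all made up for illustration; a real xT grid is finer and is fitted on event data.

```python
# Toy xT grid (made-up values). Rows run across the pitch width, columns run
# towards the opponent's goal, so the last column is the final quarter.
XT_GRID = [
    [0.01, 0.01, 0.01, 0.03],  # last cell contains the corner flag at y = 0
    [0.01, 0.01, 0.02, 0.15],
    [0.01, 0.01, 0.02, 0.40],  # central channel; last cell ~ six-yard box
    [0.01, 0.01, 0.02, 0.15],
    [0.01, 0.01, 0.01, 0.03],  # last cell contains the corner flag at y = 100
]
PITCH_LENGTH, PITCH_WIDTH = 100.0, 100.0  # assumed 0-100 coordinates


def cell(x, y):
    """Map pitch coordinates to (row, col) indices of the grid."""
    n_rows, n_cols = len(XT_GRID), len(XT_GRID[0])
    col = min(int(x / PITCH_LENGTH * n_cols), n_cols - 1)
    row = min(int(y / PITCH_WIDTH * n_rows), n_rows - 1)
    return row, col


def xt_added(start, end):
    """xT of the end cell minus xT of the start cell."""
    (r0, c0), (r1, c1) = cell(*start), cell(*end)
    return XT_GRID[r1][c1] - XT_GRID[r0][c0]


corner_flag = (100.0, 0.0)
print(xt_added(corner_flag, (95.0, 50.0)))  # into the six-yard box: positive
print(xt_added(corner_flag, (72.0, 35.0)))  # half-space outside the area: negative
```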

Shots are already in our database, but we need to make a distinction between shots that are the result of direct contact with the corner and shots that come as a result of the second phase.

The expected goals are calculated from the shots in the event data. They are run through a model that is trained on 400,000 shots. This model assigns xG by looking at the shooter and the location of the shooter in combination with different variables like shot location, shot situation, game situation, body part, how the shot is assisted and the game state. This is my own model and it will be less accurate than the Statsbomb, Opta or Wyscout models, but it still gives a pretty good indication of what we can expect.

The second step is to grab the metrics that we use and put them into the same kind of variables so we can calculate a score. This is needed because every variable has its different numerical value.

To create a score that goes from 0–1 or 0–100, I have to make sure all the variables are on the same scale. I was looking for ways to do that and figured that measuring deviation from the mean would be best. Often we think about percentile ranks, but this isn’t the best in terms of what we are looking for because we don’t want outliers to have a big effect on total numbers. I’ve written about it earlier:

Ranking players: Percentile ranks, Z-Scores and Similarities

There has been a huge shift in the use of data in football in the last few years. It would be foolish for me to claim…

marclamberts.medium.com

I’ve taken z-scores because I think seeing how far a player sits from the mean, measured in standard deviations, helps us better judge the quality of said player. It also puts every data metric on the same numerical scale so we can calculate our score later on.

Z-scores vs other scores. Source: Wikipedia

We are looking at the mean, which is 0: the negative deviations are players that score under the mean and the positive deviations are players that score above the mean. The latter are the players we are going to focus on in terms of wanting to see the quality. By calculating the z-scores for every metric, we have a solid ground to calculate our score.
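
For reference, a z-score is simply (value − mean) / standard deviation. The sketch below applies that to one metric; the taker names and xT totals are invented, and the use of the population standard deviation is an assumption (the sample version would work the same way).

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in values]  # no spread: everyone sits on the mean
    return [(v - mu) / sigma for v in values]


# Invented xT totals from corners for four hypothetical takers.
xt_per_taker = {"Taker A": 1.8, "Taker B": 0.9, "Taker C": 0.3, "Taker D": 1.0}

z = dict(zip(xt_per_taker, z_scores(list(xt_per_taker.values()))))
for taker, score in z.items():
    print(f"{taker}: {score:+.2f}")
```

Takers above the mean get a positive z-score and takers below it get a negative one, and the same function can be run over every metric that goes into the score.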

The third step is to calculate the CTS. There are a few options you can use for that and you can also read more about the different means in this article by Ben Griffis:

Introducing “Passing Danger Index”: Measuring The Immediate Danger A Player’s Passes Pose to…

Article by Ben Griffis What players tend to pose the most immediate danger to the opponent’s defenders and goalkeeper…

cafetactiques.com

We talk about harmonic, arithmetic and geometric means when looking to create a score, but what are they?

The difference between Arithmetic mean, Geometric mean and Harmonic Mean
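
A quick way to see the difference between the three is to compute them on the same numbers. The values below are arbitrary and stand in for a player's scaled scores on three metrics.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Arbitrary positive example values, e.g. one player's scaled metric scores.
values = [0.2, 0.5, 0.8]

am = mean(values)            # sum of the values divided by their count
gm = geometric_mean(values)  # n-th root of the product of the values
hm = harmonic_mean(values)   # count divided by the sum of the reciprocals

print(f"arithmetic: {am:.3f}")
print(f"geometric:  {gm:.3f}")
print(f"harmonic:   {hm:.3f}")
```

For positive values that are not all equal, the harmonic mean is the smallest of the three and the arithmetic mean the largest. The harmonic mean is pulled down hardest by a single low value, so a player needs to do well on every metric to score highly.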

As Ben describes, harmonic and arithmetic means are a good way of calculating an average for the metrics I’m using, but in my case, I want to look at something slightly different. The reason for that is that I want to weight my metrics, as I think some are more important than others for the danger of the delivery.

So there are two different options for me. I either use filters and choose the harmonic mean as that’s the best way to do it, or I need to alter my complete calculation to find the mean. In this case, I’ve chosen to filter and then create the harmonic mean:

Filter out negative xT values from corners played back to a defender, the goalkeeper or deep into midfield, because I want to calculate the danger from the corner kick.
Give different weights to the xT and shots from the second phase of corners, because these have more to do with what’s happening in the penalty area and less with the delivery of the corner.

It’s important to stress that I’m looking for the most dangerous corner deliveries and not the most efficient.

Using the harmonic mean now will lead to what I want to get out of it: a score from 0 to 100 that gives the danger coming from the delivery.
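
The exact weights and rescaling behind CTS are not spelled out here, so the following is only a sketch of how a weighted harmonic mean could turn per-metric z-scores into a 0 to 100 score. Everything specific in it is an assumption: the metric names, the weights (with the second phase weighted lower), the min-max rescaling and the small positive floor that is needed because a harmonic mean only works on positive values.

```python
def rescale(z_values, floor=1e-9):
    """Min-max rescale z-scores into (0, 1]; the floor keeps values positive."""
    lo, hi = min(z_values), max(z_values)
    if hi == lo:
        return [1.0 for _ in z_values]
    return [max((z - lo) / (hi - lo), floor) for z in z_values]


def weighted_harmonic_mean(values, weights):
    """Sum of weights divided by the sum of weight / value."""
    return sum(weights) / sum(w / v for w, v in zip(weights, values))


def corner_threat_scores(z_by_metric, weights):
    """Combine per-metric z-scores into one 0-100 score per player."""
    metrics = list(z_by_metric)
    players = list(z_by_metric[metrics[0]])
    scaled = {
        m: dict(zip(players, rescale([z_by_metric[m][p] for p in players])))
        for m in metrics
    }
    w = [weights[m] for m in metrics]
    return {
        p: 100 * weighted_harmonic_mean([scaled[m][p] for m in metrics], w)
        for p in players
    }


# Invented z-scores for three hypothetical takers.
z_by_metric = {
    "xt": {"A": 1.2, "B": 0.1, "C": -1.3},
    "xg_first_phase": {"A": 0.8, "B": 0.4, "C": -1.2},
    "zonal_danger": {"A": 0.5, "B": 0.9, "C": -1.4},
    "xg_second_phase": {"A": -0.2, "B": 1.1, "C": -0.9},
}
# Hypothetical weights: second-phase output counts for less than the delivery.
WEIGHTS = {"xt": 1.0, "xg_first_phase": 1.0, "zonal_danger": 1.0, "xg_second_phase": 0.5}

scores = corner_threat_scores(z_by_metric, WEIGHTS)
for player, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{player}: {score:.2f}")
```

With this setup a player only reaches 100 by being the best in every metric, and a player who is last in any metric is dragged close to 0.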

Metric use

CTS can be used for every player taking a corner in the game, but it’s not a metric that I would advise using in isolation. It measures the threat of the delivery from the player taking the corner, and it can be used very well in three different situations:

Analysing an individual player and focusing on individual coaching/training on how to improve their delivery.
Working out which delivery is likely to be most successful for the set piece routines a team uses.
Knowing an opposing taker’s CTS, inswinging or outswinging, can help you set up your defence and prepare for a game.

As you can see above the use of the metric is largely tailored to set piece analysis, set piece consulting and set piece coaching. It’s a metric that assists the basic set piece analysis and can really give more details into the actions of specific players during corners.

Quantity-adjusting: CTS per corner

Corner Threat Score Inswinger (CTSI). Source: Marc Lamberts

In the table above you can see the CTS for inswinging corners. These are scores for all the corners taken by the individual. The total threat is listed, and you can see that some names pop up that you might expect and some that you don’t expect. 1 is the player with the highest threat from corners and 0.000000001 is the player with the lowest threat from corners.

Now this gives us an idea of the total CTS, but if we want to make it more representative we need to go one step back and calculate the CTS per inswinging corner taken. This gives us a better picture of whether a player is consistent in his threat via his delivery.
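
One way to read "going one step back" is to divide each taker's raw totals by the number of corners taken before the z-scores are calculated. The sketch below does that for a single input. The numbers, the field names and the `min_corners` cut-off are hypothetical, and the same division would apply to the other inputs.

```python
# Invented totals for three hypothetical takers of inswinging corners.
takers = [
    {"player": "A", "total_xt": 1.80, "inswingers": 60},
    {"player": "B", "total_xt": 0.90, "inswingers": 15},
    {"player": "C", "total_xt": 0.10, "inswingers": 0},
]


def per_corner(rows, metric, min_corners=1):
    """Divide a total by corners taken, skipping takers below the cut-off."""
    out = {}
    for row in rows:
        if row["inswingers"] >= min_corners:
            out[row["player"]] = row[metric] / row["inswingers"]
    return out


xt_per_corner = per_corner(takers, "total_xt")
for player, value in xt_per_corner.items():
    print(f"{player}: {value:.3f} xT per inswinging corner")
```

Here taker A has the higher total, but taker B generates more threat per corner, which is the kind of difference the per-corner view is meant to surface.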

Example I: Premier League — Inswingers

In the table above you can see the score of the players with the highest danger from inswinging corners. The score goes from 0–100 based on the players that actually take corners, with 100 being the score where the player scores highest in every metric included in the score.

As you can see, Saka from Arsenal scores highest with 70.65, followed by Ward-Prowse from West Ham United with 65.76. Then there is a drop to 49.5, where we see Trippier from Newcastle United.

These players create the highest threat and danger with the ball swinging toward the goalkeeper, and their deliveries will likely land closer to the six-yard box. Their inside-of-the-boot kicking technique will likely be very important in striking the ball.

Example II: Premier League — Outswingers

In the table above you can see the score of the players with the highest danger from outswinging corners. The score goes from 0–100 based on the players that actually take corners, with 100 being the score where the player scores highest in every metric included in the score.

As you can see, Ward-Prowse from West Ham United scores highest with 72.94, followed by Doughty from Luton Town with 61.07 and then Bruno Fernandes from Manchester United with 58.13.

These players create the highest threat and danger with the ball swinging away from the goalkeeper, and their deliveries are often attacked by runners from deep. Their inside-of-the-boot kicking technique will likely be very important, but will have more effect when the deliveries go deeper into the penalty area.

Challenges

There are quite a few challenges with designing and creating a new metric. But I would say there are a few difficult ones which can alter the legitimacy of this metric:

The danger of short corners or cutbacks is disregarded
Difficulties in establishing the phases of corners
It disregards defensive errors or solidity, which can affect the outcome of the metric. The same goes for whether the opponent defends zonally or with man-marking.
Wrongly tagged events
A lot of variables in how to generate the metrics, and the question of whether I should have used t-scores over z-scores because the sample data is relatively small.

What’s next: Free kick Threat Score (FKTS)

Inspirations:
https://bengriffis.com/
https://karun.in/blog/expected-threat.html
https://www.gettingbluefingers.com/