Liverpool Season Preview 2014/2015

Before we go into the expected line-up, signings and outgoings, let’s deal with the giant buck-toothed elephant in the room. The big questions that, I feel, have the experts under-valuing Liverpool for the forthcoming season are:

1. Can they cope with the loss of Luis Suarez, and the goals, assists and key passes that go with him?

2. Can they cope with the added fixtures that come with being back among the elite in the Champions League? (We won’t really know the answer to this question until the competition gets underway.)

First off, it’s been done to death: there is no replacing Luis Suarez, at least not directly. But there are other ways you can claw those 31 missed goals back.


For example, in defence. Simply put, you can start off by conceding fewer shots. 7 teams conceded more shots than Liverpool last season, but 4 of those teams you could consider direct rivals this season: United, Chelsea, City and Spurs. However, when you consider shot location, type of shot etc., the expected goals (non-pen, non-own-goal) Liverpool should have conceded was 38; they conceded 42. So a slight under-performance. Incidentally, Chelsea fared best with 31 XG, with City next at 32.

If you look a little deeper at shots conceded in the danger zone, Liverpool conceded only 26 more shots than Chelsea in that zone, a difference of about 4.6 expected goals. They also conceded about 20 more shots from wide left and wide right in the box, which have a conversion rate of c.4%, and lastly around 40 more shots from outside the box than Chelsea and City, shots which also have a very low probability of scoring. So all in all, Liverpool conceded a lot more shots than their rivals, but those shots tended to be very low value shots. In fact, only Chelsea conceded more shots from OPTA’s big chances metric last season than Liverpool. If you take a quick glance at the graphic below, it’s clear there is a lack of red (very high value) shots conceded.

LFC Shots Conceded 13/14 – Home on the left. Away right.

I’m not saying this was a good defensive performance by any means. In fact, it’s a worry, volume-wise, to concede so many shots, as it says to me that, tactically, Liverpool aren’t set up correctly when they lose the ball. Liverpool players blocked 132 shots last season, conceded 4 own goals and 4 penalties, and conceded a lot of shots from the zone outside the box. They were too loose in midfield, and when opposition attackers did get beyond the midfield zone, defenders were forced into mistakes.

You only have to look at the ball error numbers to see that. Only Spurs had more errors that led to a goal last season, and no team conceded more shots from errors than Liverpool. But what about opportunity? Liverpool had lots of possession, so you’d expect them to have more errors. Well, Chelsea, City and Arsenal had a lot of possession too, but made nowhere near this number of errors. And if you look at it in ‘touches per goal error’, Liverpool made a goal error every 2,611 touches. Only Spurs and Norwich made an error more often in terms of touches. The thing is, I’m not too sure whether those errors were a direct result of players just being sloppy, or whether it was a more systemic issue that permeated the team. I’d be inclined to think it was a little bit of both.

Of course we can’t talk about the defence without considering the Achilles heel: set pieces. Liverpool conceded 11 goals from headers last season (Chelsea & City conceded only 5); only WBA, Fulham and Cardiff conceded more, and 2 of those teams were relegated. Liverpool’s opponents converted 13.4% of their headed shots; only Stoke’s opponents converted a higher proportion. So that says it all really. Again, I think these are both systemic and personnel issues. But both can be improved and used as a way of pulling back some of those 31 goals lost with the departure of Suarez.

So how do Liverpool fix these issues? System changes, tweaks, tightening up the midfield, and work on the training ground go a long way to ironing out defensive issues. Change of personnel is another way. Hence, the defensive additions of Lovren, Moreno, Manquillo and to a lesser extent Emre Can, who can fill in at left back and defensive mid. Full back issues were also a big problem last season. Glen Johnson’s defensive positioning is as shaky as a drunk baby stumbling around a playpen, and for a supposed attacking full back his offensive output is poor compared to other full backs.

Full Backs 13/14

The centre backs never looked happy. Skrtel, who was actually quite poor initially, gradually played himself into some kind of form, but neither Agger nor Toure looked comfortable. Sakho, at times, looked perhaps the most comfortable, and at a 16 million outlay you’d have to think that eventually he will be first choice alongside Lovren.

But will the new defensive additions bring more solidity? Along with Cahill and Terry, Lovren was perhaps the best centre back in the league last year. In fact, his style of defending reminds me a lot of Sami Hyypiä. Positionally sound, a good reader of the game, and a commanding presence in the centre of the box (remember those set pieces Liverpool concede from). You can get a quick idea of what he might bring to Liverpool compared to the current centre backs from the chart below.

Centre Backs Compared

*Adjusted defensive metrics – I’ll write a longer piece on this soon. I’ve adjusted each defensive metric (where you see “adj” prefixed) based on the number of passes conceded by the player’s team while that player was on the pitch. I’ve only looked at games where a player has played >75 mins.
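The longer piece will have the real method, so what follows is only a minimal sketch of one way such an adjustment could work. It assumes each raw count is turned into a rate per 100 opposition passes while the player was on the pitch. The function name, the per-100 scaling, the field names and the sample numbers are all assumptions for illustration, not the actual formula behind the chart.

```python
MIN_MINUTES = 75  # from the text: only games where the player played >75 mins


def adjusted_metric(games, metric, per_passes=100):
    """Rate of a defensive action per `per_passes` opposition passes.

    Assumed formula (not confirmed by the post):
    sum(actions) / sum(passes conceded) * per_passes, over qualifying games.
    """
    qualifying = [g for g in games if g["minutes"] > MIN_MINUTES]
    actions = sum(g[metric] for g in qualifying)
    passes = sum(g["passes_conceded"] for g in qualifying)
    return actions / passes * per_passes if passes else 0.0


# Hypothetical game log for one centre back
games = [
    {"minutes": 90, "tackles": 3, "passes_conceded": 400},
    {"minutes": 90, "tackles": 1, "passes_conceded": 600},
    {"minutes": 30, "tackles": 2, "passes_conceded": 150},  # dropped: under 75 mins
]
adj_tackles = adjusted_metric(games, "tackles")  # tackles per 100 opposition passes
```

The point of scaling by passes conceded is opportunity: a defender in a team that sees little of the ball gets more chances to tackle and intercept, so raw counts flatter him.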

Manquillo will likely be eased in, but I expect Moreno will get much more game time. He and Flanagan will most likely alternate quite a bit, based on the opponents Liverpool will be facing home and away.

Having said all of this, I somehow feel the catalyst for Liverpool improving defensively is tightening up in midfield. Gerrard offers so much, but he needs runners alongside him to help out with the defensive side. Henderson provides that, and more, but the move to a 4-4-2 diamond last season to accommodate two centre forwards gave Henderson that little bit too much to do. I’d expect Rodgers to return to more of a 4-3-3 this season: Gerrard at the base of a midfield triangle with 2 runners either side in Henderson and possibly Emre Can. If that sacrifices attacking play too much, Coutinho is a possibility as the left sided midfield player in the three. He showed his battling qualities last season playing to the left of the diamond.

Taking all of this into account, can Liverpool claw back the Suarez goals in defence? Well, certainly not the full amount, but I think they have addressed their needs in the transfer market: full backs and a commanding centre back. Fix those systemic issues and I can’t see why they can’t improve their goals conceded by at least 10.


The huge conversion rates maintained last season will inevitably drop this season. I think the big question here is: by how much? Both Suarez and Sturridge hugely over-performed in expected goals. Such over-performance has practically no year on year correlation. It’s also worth noting that Liverpool may not HAVE to score that many goals to do well. The average number of goals scored by the Premier League winners in the last 10 years is 84, which is 17 fewer than Liverpool scored last season, although given the attacking talent on display at City and Chelsea I can’t see the winners scoring fewer than 90 goals. Furthermore, Man City also hugely over-performed in XGoals, so I expect their number of goals will also decrease in the coming season.

Over / Under Performing XGoals 13/14

It can’t be emphasised enough what a record-breaking season that was from Suarez. But it’s not just his goals that will be missed: his all-round play, dragging defenders out of position, his assists and his link-up play will be too. His goal involvement P90 (goals+assists per 90) was 1.31 last year. In the last 4 seasons in the Premiership only 1 player has bettered that tally, Aguero last season at 1.36, and only 7 players in the last 4 seasons have broken the 1 per 90 barrier. A huge contribution.

But there are some really positive signs in an attacking sense for Liverpool. Sturridge has grown into his role at the club. In his last 4 seasons in the EPL he’s only under-performed in XGoals once, a slight under-performance of 0.001 per 90 in 12/13. His XGoals per 90 over those 4 seasons, at 0.58, 0.385, 0.718 and last season’s 0.571, give him an average of 0.56 per 90. If he can stay fit, play the majority of games and score at the rate an average player would given the chances he gets, then he’s likely to get c.20 goals this season. And herein lies the problem. If Sturridge gets injured, who’s going to replace him? Lambert’s XGoals per 90 in the last 2 seasons was 0.287 and 0.364, so any long term absence for Sturridge may be critical to Liverpool’s goal scoring. They can solve this by dipping into the transfer market. Names such as Cavani, Benzema and Falcao have been thrown around. With the risk of a Sturridge injury in mind, I believe getting a quality striker in before the season starts may be the difference between struggling to get into the Champions League places and being relatively comfortable in them.
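The arithmetic behind that c.20 figure can be checked in a few lines. The per-90 numbers are the ones quoted above; the 35 full 90s of playing time is my assumption for "the majority of games", not a figure from the data.

```python
# XGoals per 90 quoted in the text
sturridge_xg_p90 = [0.58, 0.385, 0.718, 0.571]
lambert_xg_p90 = [0.287, 0.364]

FULL_90S = 35  # assumption: roughly a full, injury-free league season


def projected_goals(xg_per_90, full_90s=FULL_90S):
    """Average the per-90 rates and scale to a season's playing time."""
    average = sum(xg_per_90) / len(xg_per_90)
    return average * full_90s


sturridge_avg = sum(sturridge_xg_p90) / len(sturridge_xg_p90)
sturridge_goals = projected_goals(sturridge_xg_p90)
lambert_goals = projected_goals(lambert_xg_p90)

print(f"Sturridge: {sturridge_avg:.2f} per 90, about {sturridge_goals:.1f} goals")
print(f"Lambert:   about {lambert_goals:.1f} goals over the same minutes")
```

On those assumptions the gap between the two is around 8 goals over a season, which is the size of the hole a long Sturridge absence would leave.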

Markovic and Lallana have also signed, but I can’t help feeling Markovic may be a little slow getting off the ground and Lallana was signed to add depth to the squad rather than displace one of Suarez, Coutinho or Sterling in the starting eleven. My biggest worry in terms of goal scoring however can be summarised in this chart. Over/Under performance figures are marked on the labels.

Expected Goals by Position 13/14

In particular, the midfield area. Over the last 3 seasons Liverpool have under-performed in expected goals there. While not a huge problem per se, in a year when you lose your top goalscorer (and, vitally, haven’t bought another striker) and are coming off a huge over-performance in expected goals, it’s imperative you get as much from your midfield as possible. Can (no pun intended) Markovic, Lallana, Sterling and Coutinho step up their goal scoring performance? Particularly Coutinho, who only scored 5 goals last season and whose shooting was erratic to say the least. We know both Sterling and Coutinho can create; both were in the top 20 for expected assists per 90 last season in the EPL. So creating chances won’t be a problem for Liverpool, but converting them might be. Incidentally, both have looked unstoppable in pre-season games.

Key Pass Origins 13/14

In summary: the defensive personnel have been improved, systemic issues should have been addressed in pre-season, and a move back to an extra man in midfield should shore up that zone. Squad depth has been improved and creativity shouldn’t be an issue, but a striker hasn’t been purchased yet, which leaves a lot of goal-scoring responsibility on Sturridge and Lambert.

Lastly, there has been some suggestion last year’s title challenge was some sort of fluke. While no one expected it, the rise into the top 4 certainly wasn’t a fluke. Liverpool’s expected goal ratio (XGR), which I found to have a strong correlation (R2=0.78) with points earned, has risen since the 10/11 season, when they posted an XGR of just 0.538. They’ve had XGRs of 0.636, 0.637 and last season’s 0.655 since that poor 10/11 season. In fact, they are the only team with an XGR >0.60 to finish outside the top 4 (which they did twice) in the last 4 seasons. It’s almost like there was a plan in place.
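For anyone unfamiliar with the ratio, here is a minimal sketch. It assumes XGR follows the usual shape of these ratios (the same as total shots ratio): expected goals for, divided by expected goals for plus expected goals against. The post doesn't spell the formula out, and the season totals below are made-up numbers chosen only to land on 0.655.

```python
def xgr(xg_for, xg_against):
    """Expected goal ratio: a team's share of all expected goals in its games.

    Assumed definition: xG for / (xG for + xG against).
    0.5 means the team creates exactly as much as it concedes.
    """
    return xg_for / (xg_for + xg_against)


# Hypothetical season totals, for illustration only
example = xgr(xg_for=65.5, xg_against=34.5)
print(f"XGR = {example:.3f}")
```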

Prediction: 3rd.

Star Man: Raheem Sterling 




Testing Repeatability – Player Level

So yeah, this is just going to be a quick post to deal with some house-keeping. I’ve run a series of tests to check the repeatability of the various metrics I use. These are all done at player level; I plan on doing the same at team level at some stage. There will be no fancy Tableau graphics here! Just plain old Excel scatter plots. So here is a rundown of what I found. These may or may not be useful for somebody.

GPS – Goal Probability Per Shot per 90

GPS (Expected goals/non-pen Shots)

Expected Goals per 90

Expected Non-Pen Goals Per 90

Expected Goal Difference Per 90

EXPGoalDiffP90 (Actual Goals-EXPGoals) Top 4 Leagues

Expected Goals From Shot Placement per 90

EXPGoals From Shot Placement (on target shots)

Expected Goals Shot Placement Difference Per 90

XGSPDiff P90 (Actual Goals-XGSP)

Expected Goals Shot Placement per Shot per 90

XGSP_GPS (XGSP/non-pen shots)

Shot Placement Extra Goals per 90 (SPEG)
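The formulas in the chart captions above can be written out as a few one-liners. This is a sketch only: the function names are mine, the per-90 scaling by minutes played is an assumption about how the P90 versions are built, and the season totals are hypothetical.

```python
def per_90(total, minutes):
    """Scale a season total to a per-90-minutes rate."""
    return total / minutes * 90


def gps(exp_goals, non_pen_shots):
    """Goal probability per shot: expected goals / non-pen shots."""
    return exp_goals / non_pen_shots


def exp_goal_diff_p90(goals, exp_goals, minutes):
    """EXPGoalDiffP90: (actual goals - EXPGoals) per 90."""
    return per_90(goals - exp_goals, minutes)


def xgsp_diff_p90(goals, xgsp, minutes):
    """XGSPDiff P90: (actual goals - XGSP) per 90."""
    return per_90(goals - xgsp, minutes)


def xgsp_gps(xgsp, non_pen_shots):
    """XGSP_GPS: XGSP / non-pen shots."""
    return xgsp / non_pen_shots


# Hypothetical season totals for one player
season = {"minutes": 2700, "goals": 18, "exp_goals": 15.0,
          "xgsp": 16.5, "non_pen_shots": 100}

print(gps(season["exp_goals"], season["non_pen_shots"]))
print(exp_goal_diff_p90(season["goals"], season["exp_goals"], season["minutes"]))
print(xgsp_diff_p90(season["goals"], season["xgsp"], season["minutes"]))
print(xgsp_gps(season["xgsp"], season["non_pen_shots"]))
```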



Expected Goals – Shot Placement

After the Premier League season ended last year I was wondering why there aren’t more shot placement models out there. There has been some work done on it over at www.statsbomb.com but nothing of note that I could find since. I was surprised by this, because if you want to measure finishing skill, isn’t shot placement (along with technique and other variables) a rather large part of a player’s goal scoring armoury? There doesn’t seem to be ‘technique’ data available, at least not in the public domain, and I don’t even know if OPTA (or anyone else) collect data on how a player hits the ball, e.g. toe poke, instep, volley etc. It strikes me that EXPGoals is just a “quality of chance from shot location” measurement, and doesn’t directly deal with finishing skill. Indirectly, you can measure the difference between actual goals scored and expected goals (which I have done, and which gives you an EXPGoalDiff of + or -), which could indicate whether or not a player is better than the average player at converting his chances into goals. But for me that’s taking a big leap forward without understanding why one player scores more than the average. Like others, I have found no year on year correlation for over-performing in expected goals, with an R2 of just 0.002. So EXPGoalDiff can tell you what may have happened in a particular season, but has no predictive power for what might happen in the next season.
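For reference, the R2 quoted for year on year tests like this is just the squared Pearson correlation between each player's value in one season and the same player's value in the next. A minimal sketch in plain Python follows; the original tests were Excel scatter plots, and the sample values below are made up.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical EXPGoalDiff per 90 for the same four players in two seasons
season_n = [1, 2, 3, 4]
season_n1 = [1, -1, -1, 1]
print(r_squared(season_n, season_n1))  # no linear relationship at all
```

An R2 near zero, like the 0.002 above, means knowing a player's over-performance in one season tells you next to nothing about the next.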

XGDiffP90 (Actual Goals-EXPGoals)

EXPGoals deals with variables up until the moment the player touches the ball to shoot. But a lot can happen between touching the ball and it ending up in the back of the net: how the ball is hit, with bend or without, velocity, shot placement, and external factors such as weather and opposition player positions etc. Even if EXPGoal difference was repeatable, it could INDICATE finishing ability, but it wouldn’t tell us why, and I like to understand things, so the why really bugs the hell out of me.

Shot Placement

Of all these other factors I mentioned, I think shot placement is the only variable with data in the public domain, and even then only for the top 5 Leagues over the last 2 seasons. So after the EPL season finished last year I started collecting shot placement data. That was quickly put on hold during the World Cup, but since then I’ve been beavering away. I managed 4 of the top 5 Leagues. France will have to wait, I just didn’t have the staying power. Sorry France. Upon finishing I got to work on the shot placement model and connecting the data between EXPGoals and shot placement. My idea was to control for the exact same variables as the EXPGoal model. That way I could compare the same shot from both perspectives, i.e. I’d have an expected goal value just before the shot was struck, and an expected goal value after the shot was struck. I could then see the difference between the two values and, by that, know how much any individual player had increased or decreased their chances of scoring just by where they placed the ball in the goal. I’d also be controlling for a whole host of actions related to shooting and thus hopefully get some decent outputs. And as I’m writing, I tweeted about shot placement models and have just been tweeted this, which is a piece by Devin Pleuler: http://www.optasportspro.com/about/optapro-blog/posts/2014/on-the-topic-of-expected-goals-and-the-repeatability-of-finishing-skill.aspx And there I was thinking I had an original idea.

EXP Goal Zones

Obviously off target shots can’t be scored and as such have an expected goal value of zero so won’t be included. I took all on target shots, and controlled for the same inputs (location, type of shot etc) as my EXPGoal model, with the added qualifier of separating each instance into separate parts of the goal.

Goal Sections

I divided the goal up into 6 boxes, see above, and got an EXPGoal value for each location in the goal. Why these boxes specifically? I needed at least 6 to distinguish central from corners, but couldn’t go any higher than 6 as I’d run into sample size issues. Ideally you’d probably want at least 10 areas: an extra 2, top and bottom, either side of the central boxes. But like I said, sample size issues, and each box added creates a mountain of extra work. Let’s just take a quick example. One specific instance could be all non-headed on target shots taken from Zone C and placed in the top right corner of the goal, which are converted at 60% (or an XGSP value of 0.60). I did the same for each section of the shot placement area, top left, top centre and so on. The same for headed shots in Zone C, and for every other zone marked on the pitch above. This took a lot of bloody time, and I have to admit I nearly gave up on more than one occasion. Now each shot on target has an expected goal value before the shot is struck and after the shot is struck.
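The lookup described above can be sketched as a simple conversion-rate table keyed by pitch zone, shot type and goal section. This is a minimal illustration, not the actual model: the field names (`zone`, `header`, `goal_section`, `goal`) are assumptions, the real model controls for more inputs than shown here, and the five sample shots are invented to reproduce the 60% example.

```python
from collections import defaultdict


def build_xgsp_table(on_target_shots):
    """Conversion rate for every (zone, header?, goal section) combination."""
    counts = defaultdict(lambda: [0, 0])  # key -> [goals, shots]
    for shot in on_target_shots:
        key = (shot["zone"], shot["header"], shot["goal_section"])
        counts[key][0] += int(shot["goal"])
        counts[key][1] += 1
    return {key: goals / shots for key, (goals, shots) in counts.items()}


# Hypothetical sample: 5 non-headed on target shots from Zone C to the top right, 3 scored
shots = [
    {"zone": "C", "header": False, "goal_section": "top_right", "goal": scored}
    for scored in (True, True, True, False, False)
]
table = build_xgsp_table(shots)
print(table[("C", False, "top_right")])
```

Every extra box in the goal multiplies the number of keys in this table, which is where the sample size problem comes from: the same shots get spread over more and more cells.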

On to those messy acronyms. For want of a better name, I’m going to call it Expected Goals Shot Placement, or XGSP for short. Let’s first take a look at whether XGSP-P90 correlates with GoalsP90.


A pretty strong correlation at 0.771, which is what you would expect, the better your shot placement the more goals you should score.

Shot Placement Extra Goals

Now I’m going to introduce another pesky new acronym, SPEG, or Shot Placement Extra Goals, which is just the difference between expected goals (from on target shots, pre-shot, based on location, type of shot etc.) and expected goals from shot placement (post-shot, based on all of the variables in EXPGoals, with shot placement added in). I’ve leant away from using ‘finishing skill’ as a name, because for me it’s not finishing skill: I believe finishing skill incorporates a whole host of different skills, and shot placement is just one of them.

So at a basic level, over a full season, if we look at each shot a player takes, give it an EXPGoal value pre-shot, then give it an EXPGoal value post-shot based on shot placement, and that player can show that they have increased their probability of scoring just by their shot placement, doesn’t that show some skill at putting the ball in the back of the net? It should do, but we could run into the same problems as EXPGoalDiff and things like Shot Conversion %. They just aren’t very repeatable year on year. I ran two tests. The first was on the EPL alone over the last two seasons (because I needed to test before I continued collecting data for other leagues), where R2=0.47. Then I tested the top 4 leagues (EPL, La Liga, Bundesliga, Serie A), where player x had >=10 shots in year N and year N+1, and here’s what I found.
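Put as a calculation, SPEG is the post-shot value minus the pre-shot value, summed over a player's on target shots. The sketch below assumes that sign convention (positive means placement added value) and assumes the per-90 version divides by minutes played; the field names and sample shots are hypothetical.

```python
def speg_p90(shots, minutes):
    """Shot Placement Extra Goals per 90.

    For each on target shot: XGSP (post-shot) minus EXPGoals (pre-shot).
    Off target shots are skipped, as in the text.
    """
    extra = sum(s["xgsp"] - s["xg"] for s in shots if s["on_target"])
    return extra / minutes * 90


# Hypothetical shots from a single 90 minutes
shots = [
    {"xg": 0.05, "xgsp": 0.30, "on_target": True},   # long shot into the corner
    {"xg": 0.30, "xgsp": 0.10, "on_target": True},   # good chance hit at the keeper
    {"xg": 0.08, "xgsp": 0.00, "on_target": False},  # off target, not counted
]
print(round(speg_p90(shots, minutes=90), 3))
```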
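The sample for that second test can be built with a simple filter: keep only players who appear in both seasons with at least 10 shots in each, and pair up their SPEG per 90 values. A minimal sketch, with hypothetical player names, field names and numbers:

```python
def paired_sample(year_n, year_n1, min_shots=10):
    """(year N, year N+1) SPEG per 90 pairs for players qualifying in both years."""
    pairs = []
    for player, first in year_n.items():
        second = year_n1.get(player)
        if second is None:
            continue  # player has no data in year N+1
        if first["shots"] >= min_shots and second["shots"] >= min_shots:
            pairs.append((first["speg_p90"], second["speg_p90"]))
    return pairs


# Hypothetical data
year_n = {
    "Player A": {"shots": 40, "speg_p90": 0.10},
    "Player B": {"shots": 8, "speg_p90": 0.30},   # too few shots in year N
    "Player C": {"shots": 25, "speg_p90": 0.05},  # missing from year N+1
}
year_n1 = {
    "Player A": {"shots": 35, "speg_p90": 0.12},
    "Player B": {"shots": 50, "speg_p90": 0.02},
}
pairs = paired_sample(year_n, year_n1)
print(pairs)
```

The resulting pairs are what go on the two axes of the scatter plot, and the R2 is computed from them.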

Shot Placement Extra Goals

An R2 of 0.427, while probably not a good result for any other type of metric, is significant enough when it comes to conversion/goal scoring. Certainly enough to warrant more investigation. Ideally I’d like to go back at least 5 seasons to test it, but still, there is some shot placement skill evident, and these are just my initial findings, so I haven’t had much time to digest the implications. I also decided to do some further visual tests to see if things are what they seem. As a side note, the huge outlier at 0.9 is Morata. I was wondering the same myself.

Visual Tests

If you follow me on Twitter you’ll know I like to post these scatter plots, which I call dashboards. I like the fact that they can show 4 or 5 different metrics at any one time. I mostly plot them with similar types of metric that give some context to all the metrics as a whole. Here I’ve plotted EXPGoalsP90 on the vertical axis and SPEGP90 (shot placement extra goals) on the horizontal. GPS, or expected goals per shot, is shown by colour, and GoalsP90 (output is also important!) is referenced by the size of the coloured circles.


Visually, SPEGP90 looks good, the players who you’d expect to do well are doing well. It’s encouraging that the likes of Messi, Ronaldo, Suarez, Dzeko, and Sturridge all appear above 2 standard deviations in both metrics for both seasons.

Edge Case – Mertens

Ok, so that’s good. Let’s take a look at some edge cases (apologies, that’s the programmer in me coming out), or outliers, and see what we find. First of all, Dries Mertens. Colour-wise he’s in the kind of blue-green range, which means on a per shot basis he has low expected goals, and on a per 90 basis he’s also going to be low. His shots per 90 are at 3.9, so that’s quite high. So lots of shots, but low value chances of converting, which usually means shots from outside the box. But his SPEG-P90 is above 2 standard deviations, which indicates that by way of his shot placement he’s increased his expected rate of scoring somewhat. Visually, let’s see what that looks like. First his shot chart from last season. Remember, it’s heat map orientated: the hotter the shot the higher the chance of converting, and vice versa for the colder shots. Larger dots represent goals, X’s represent headers.

Mertens Non-Pen Shots

Pretty much as expected based on his GPS and XGP90 on the scatter plot. Lots of shots from outside the box, that have an obvious low scoring probability. Next let’s take a look at his shot placement.

Mertens Shot Placement

Before we even consider the numbers, visually, if you look at the sheer volume of low value chances on his shot chart above (blue shots) and then compare his shot placement, it looks quite good. Only 5 of his 36 shots on target were placed down the centre. 26 of his on target shots had an expected goal value (pre-shot) of less than 0.06, yet after the shot was taken 30 of those on target shots had a SPEG value of greater than 0.089. So yeah, in this instance you could say his SPEG numbers match what is happening visually.

Edge Case – Destro

Lastly, let’s take a quick look at another outlier: Destro in the 12/13 season. He’s in the top left of the plot above. Here’s his shot chart.

Destro Non-Pen Shots

GPS and EXPGoals both indicated high value chances, and it’s clear from his shot chart that most of his shots came from prime central positions in the danger zone. Only 2 of Destro’s shots came from outside the box, and 12 of his 22 shots on target had an expected goal value greater than 0.30. High quality chances indeed. But his SPEGP90 indicates he increased his expected probability of scoring by 0.167 per 90, whilst the average increase over the plot is 0.133. So he’s slightly above average, which is not really that good. Let’s look at his shot placement chart.

Destro Shot Placement

Again visually, it seems clear that the reason his increase from expected goals (pre-shot) to SPEG (post shot) is low, is because he hit most of his shots low centre, which is really goalkeeper territory and has a much lower chance of being scored. It’s still early days, but it’s nice to know the model is working as it should be, and that the numbers, for now, pan out visually.

Future Improvements

Well, the inputs in the model probably won’t be improved much, as I can’t sub-divide the categories any further without running into sample size issues. Not to mention the enormous amount of work it would involve to tinker with it in that way. In fact, I spent so much time on it I’m fed up looking at the numbers at this stage. For now it will interest me just to use it for the coming season and see what I can learn and what its best application is. I have no formal statistical training or background, so this is a hobby, and a very time-consuming one at that. I’ll continue collecting the data and inputting it into the model for the coming season, but if it becomes too much of a burden I’ll have to stop.

In a visual sense, I would like to connect both shot charts and shot placements in the goal to show the increase before and after the shot has been taken. The holy grail would be in some sort of 3D environment, but that would take an awful lot of coding and again I’m not sure I have the time.

What I would like to do before the season starts is look at SPEG at a team level. I’m aware though that shot placement is really an individual based skill, but I think it might be interesting to discover what it says at a more macroscopic level. In particular, SPEG conceded, and maybe SPEG total shot ratio. Though I’m not that hopeful of either being that predictive, nonetheless, it will be fun to find out. I think.

Feedback welcome, as I got so caught up with this I might have missed something that’s just plain obvious.