Down the xG Rabbit Hole

This post on Twitter the other day really sent me down a rabbit hole for a good portion of the past two days.

I think evaluating and comparing models is both interesting and valuable. Where do models differ? Where are they the same? Do the differences expose potential biases in one or both models?

However, comparing expected goal models is a bit more interesting to me because I have my own expected goals model and I developed it differently than the other two. I won’t go into my methodology here, but it is definitely different from Evolving Hockey’s and I am pretty certain it is different from SportLogiq’s. The one advantage that SportLogiq has over Evolving Hockey and me is that they have access to pre-shot information that we do not, such as whether the shot was off the rush, whether it was an odd-man rush (1 on 0, 2 on 1, 3 on 1, 3 on 2, etc.), and whether there was a pass that led to the shot. Having developed an expected goals model using SportLogiq data when I was with the Calgary Flames, I know there is a lot more information that you can include in the model. That ought to be an advantage; however, the above tweet indicates it may not be all that significant.

My first objective was to look at how my expected goals model compared to SportLogiq and Evolving Hockey. The SportLogiq data was as of Monday, April 1st, while I pulled data from Evolving Hockey and Puckalytics on Wednesday, April 3rd. The date difference should not impact results in any significant way, as we are looking at per-game data.

Here is my recreation of the chart in the tweet along with comparisons of the Puckalytics expected goal differential with SportLogiq and Evolving Hockey:

All models appear to agree fairly well; however, the greatest deviation appears to be between SportLogiq and Puckalytics.

When I do comparisons like this I like to look at the team by team comparisons to see which teams have the greatest discrepancies and whether there are certain ‘types’ of teams (i.e. volume shooting teams, defensive teams, etc.) that are leading to these discrepancies.

The above chart shows the expected goal differential per game of each of the models. Teams are sorted by average expected goal differential across the three models.

Here is another way to visualize the models by looking at each model separately.

First, a couple of obvious observations. Each of the three models has the Oilers with the highest expected goal differential and all three models also agree on the bottom 4 teams (San Jose, Anaheim, Chicago and Montreal). In general there seems to be pretty good agreement between the models.

However, the Puckalytics model tends to have a tighter distribution than the other two models. I calculated the standard deviation of each model's team values: the Puckalytics model had a standard deviation of 0.43, while SportLogiq and Evolving Hockey were both at 0.54. The SportLogiq model has the Oilers with an expected goal differential 2.39 goals per game higher than the San Jose Sharks. Evolving Hockey is at 2.31, while the Puckalytics model is much lower at 1.79.
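The two spread measures used here are a standard deviation and a max-minus-min range over the per-team expected goal differentials. Below is a minimal sketch of that calculation. The input values are made-up placeholders, not the real team data, and the sketch uses the sample standard deviation (a population standard deviation would come out slightly smaller); `spread_summary` is just an illustrative name.

```python
from statistics import stdev


def spread_summary(xg_diff_per_game: dict[str, float]) -> tuple[float, float]:
    """Return (sample standard deviation, max minus min) of per-team xG differential."""
    values = list(xg_diff_per_game.values())
    return stdev(values), max(values) - min(values)


# Placeholder values for illustration only; NOT real team data.
toy = {"Team A": 1.0, "Team B": 0.0, "Team C": -1.0}
sd, spread = spread_summary(toy)
print(f"std dev = {sd:.2f}, range = {spread:.2f}")
```

Running the same function once per model gives directly comparable numbers, which is how a tighter distribution like the Puckalytics one would show up.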

We saw above that SportLogiq and Evolving Hockey had the best agreement among the three models. Their biggest disagreements were Colorado, where SportLogiq had a +0.65 goal differential compared to +0.35 for Evolving Hockey, and Dallas, where SportLogiq was at +0.51 and Evolving Hockey was at +0.74.

So far I have only looked at goal differential because that is the SportLogiq data we have. I want to break this down a bit more and look at expected goals for and expected goals against separately; however, I can only do that for Puckalytics and Evolving Hockey. Here are the comparisons for xGF and xGA per game.

That is a reasonably good comparison, but there are some differences. However, it was when I looked at the absolute difference between the two models that some red flags were raised. The biggest difference in xGF/Game was the Edmonton Oilers, where Evolving Hockey had the Oilers with an all-situation xGF/Game of 3.97 compared to Puckalytics at 3.50. The smallest difference was the Anaheim Ducks, where Evolving Hockey was at 2.69 per game and Puckalytics was at 2.66 per game. For the Oilers, Puckalytics was 0.47 below Evolving Hockey, and for Anaheim, Puckalytics was 0.03 below Evolving Hockey.

The red flag here is that there was not one team where Puckalytics had a higher xGF/Game than Evolving Hockey. There was also not one situation where Puckalytics had a higher xGA/Game than Evolving Hockey. Between these two models, Evolving Hockey is always predicting more expected goals than Puckalytics. So there are clear differences between the two models.
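A quick way to spot this kind of one-directional gap is to look at the sign of the per-team difference rather than its size. The sketch below uses only the two xGF/Game pairs quoted above (Edmonton and Anaheim), not the full league data, and the variable names are illustrative.

```python
# xGF/Game as (Evolving Hockey, Puckalytics); only the two teams quoted in the post.
xgf_per_game = {
    "Edmonton": (3.97, 3.50),
    "Anaheim": (2.69, 2.66),
}

# A positive difference means Evolving Hockey is higher than Puckalytics.
diffs = {team: round(eh - puck, 2) for team, (eh, puck) in xgf_per_game.items()}

higher_in_eh = sum(d > 0 for d in diffs.values())
print(diffs)
print(f"Evolving Hockey higher for {higher_in_eh} of {len(diffs)} teams")
```

With every team in the dictionary, a count equal to the number of teams is the red flag: random disagreement between two models should produce a mix of signs.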

As of Wednesday there were 7364 actual goals scored in the NHL this season. As of Wednesday Evolving Hockey had a total expected goals of 7808, or 444 expected goals above actual. That is a 6% overshoot. Puckalytics, on the other hand, had a total expected goals of 7237, or 127 expected goals below actual. That is a 1.7% undershoot. I just looked at Natural Stat Trick and as of today they are at a 0.7% overshoot, so it is the best of the three models.
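The overshoot and undershoot percentages are simply the gap between total expected goals and total actual goals, divided by actual goals. A minimal sketch using the totals quoted above (the function name is illustrative):

```python
def xg_overshoot_pct(expected_goals: float, actual_goals: float) -> float:
    """Percent by which total xG exceeds (+) or falls short of (-) actual goals."""
    return 100.0 * (expected_goals - actual_goals) / actual_goals


ACTUAL_GOALS = 7364  # league-wide actual goals quoted above

totals = {"Evolving Hockey": 7808, "Puckalytics": 7237}
overshoot = {model: xg_overshoot_pct(xg, ACTUAL_GOALS) for model, xg in totals.items()}

for model, pct in overshoot.items():
    print(f"{model}: {pct:+.1f}%")
# Evolving Hockey: +6.0%
# Puckalytics: -1.7%
```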

I wanted to break this down by situation, so I pulled 5v5, 5v4 and 4v5 data from the three sites. Here is what I found in terms of which expected goals models overshoot or undershoot actual goals.

| Situation | Evolving Hockey | Puckalytics | Natural Stat Trick |
|---|---|---|---|
| 5v5 | +4.7% | -0.8% | +2.3% |
| 5v4 | +14.9% | +2.4% | +8.1% |
| 4v5 | -10.1% | -9.7% | -6.2% |

Of the three models, Evolving Hockey seems to be the poorest performing while Puckalytics is the best at 5v5 and 5v4 with Natural Stat Trick best at 4v5.
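That ranking comes from comparing the size of each miss regardless of direction. A small sketch using the percentages from the table (the dictionary layout is just one convenient way to hold them):

```python
# Percent overshoot (+) or undershoot (-) of actual goals, from the table above.
calibration = {
    "5v5": {"Evolving Hockey": 4.7, "Puckalytics": -0.8, "Natural Stat Trick": 2.3},
    "5v4": {"Evolving Hockey": 14.9, "Puckalytics": 2.4, "Natural Stat Trick": 8.1},
    "4v5": {"Evolving Hockey": -10.1, "Puckalytics": -9.7, "Natural Stat Trick": -6.2},
}

# The best-calibrated model in each situation has the smallest absolute miss.
best = {sit: min(models, key=lambda m: abs(models[m])) for sit, models in calibration.items()}
worst = {sit: max(models, key=lambda m: abs(models[m])) for sit, models in calibration.items()}

for sit in calibration:
    print(f"{sit}: best = {best[sit]}, worst = {worst[sit]}")
```

Note that all three models undershoot at 4v5, so the shorthanded ranking is about which model misses least rather than which one is well calibrated.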

Overall I am very happy with my Puckalytics xG model. I think there are a few minor tweaks I want to make, but probably nothing that will change the performance much. The Natural Stat Trick model performs fairly well too. The Evolving Hockey model seems to overestimate expected goals, but when viewed in goal differential format it performs fairly well. The overestimation of expected goals shows up most when you look at goals above expected or goals saved above expected. For example, on Evolving Hockey there are 71 goalies that have played 500 or more minutes and all but 22 of them have a positive goals saved above expected. One would expect this to be closer to half the goalies.
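To put a rough number on how unusual that goalie split is: "all but 22" of 71 goalies means 49 with a positive goals saved above expected. If a well-calibrated model gave each goalie an independent 50/50 chance of landing above zero (a simplifying assumption, since goalies face different workloads and are not truly independent), the chance of seeing 49 or more can be sketched as a binomial tail:

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


goalies = 71
positive = goalies - 22  # 49 goalies with positive goals saved above expected
p_value = binom_upper_tail(positive, goalies)
print(f"{positive} of {goalies} positive; P(X >= {positive}) = {p_value:.4f}")
```

Under those assumptions the probability comes out well below 1%, which fits the idea that the model systematically overestimates expected goals rather than the split being luck.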

I would really like to see the expected goals for and expected goals against from the SportLogiq model so I can better evaluate how the public models compare to it, but from the goal differential analysis above, the public models compare fairly well.
