Using goalies to estimate zone start impact on corsi
Eric T. over at NHL Numbers had a post last week summarizing the current state of our statistical knowledge with respect to accounting for zone start differences. If you haven’t read it, definitely go read it: it is a good read, and it concludes that the way the majority of people have been making this adjustment is wrong.
Overall, no two estimates are in direct agreement, but the analyses that are known to derive from looking directly at the outcomes immediately following a faceoff converge in the range of 0.25 to 0.4 Corsi shots per faceoff — one-third to one-half of the figure in widespread use. It is very likely that we have been overestimating the importance of faceoffs; they still represent a significant correction on shot differential, but perhaps not as large as has been previously assumed.
In the article Eric refers to my observation that eliminating the 10 seconds after a zone start effectively removes any effect that the zone start had on the game. From there he combined my zone start adjusted data found at stats.hockeyanalysis.com with zone start data from behindthenet.ca and came up with an estimate that a zone start is worth 0.35 corsi. He did this by subtracting the 10 second zone start adjusted corsi from standard 5v5 corsi and then running a regression against the extra offensive zone starts the player had. In the comments I discussed some further analysis I did on this using my own data (i.e. not the stuff on behindthenet.ca) and came up with similar, though slightly different, numbers. In any event I figured the content of that comment was worthy of its own post here.
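For readers who want to see what dropping the 10 seconds after a zone start could look like in practice, here is a minimal sketch. It is not the code behind stats.hockeyanalysis.com. It assumes event times and faceoff times are plain game-clock seconds, that only offensive and defensive zone faceoffs are passed in, and the function name `drop_post_faceoff_events` is made up for illustration.

```python
from bisect import bisect_right


def drop_post_faceoff_events(event_times, zone_faceoff_times, window=10.0):
    """Keep only events that occur at least `window` seconds after the
    most recent zone faceoff (or before any zone faceoff at all).

    All times are assumed to be seconds on a single running game clock.
    """
    faceoffs = sorted(zone_faceoff_times)
    kept = []
    for t in event_times:
        # Number of faceoffs at or before time t.
        i = bisect_right(faceoffs, t)
        if i == 0 or t - faceoffs[i - 1] >= window:
            kept.append(t)
    return kept


# Hypothetical corsi event times and zone faceoff times, in seconds.
corsi_events = [3, 12, 25, 40, 100]
zone_faceoffs = [0, 20, 95]
remaining = drop_post_faceoff_events(corsi_events, zone_faceoffs)
```

Counting corsi only from the remaining events gives the "10 second zone start adjusted" figure under these assumptions; an event exactly 10 seconds after the draw is kept, which is an arbitrary choice in this sketch.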
So, when I did the correlation between extra offensive zone starts and the difference between 5v5 corsi and 5v5 10 second zone start adjusted corsi, I got the following (using all players with >1000 minutes of ice time over the last 5 seasons):
My calculations come up with a slope of 0.3043, which is a little below Eric’s 0.35. I don’t know the exact methodology he used (for example, whether he used the complete 5 years of data or individual seasons), and that might explain the difference.
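The regression described above can be sketched as follows. This is an illustration under stated assumptions, not the exact calculation used for the numbers in this post: "extra offensive zone starts" is taken to mean offensive zone starts minus defensive zone starts, corsi is taken to be a for-minus-against differential, the fit is ordinary least squares with an intercept, and the field names and sample players are invented.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, r^2 is the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r_squared


def zone_start_points(players):
    """Build regression inputs: x = extra offensive zone starts,
    y = 5v5 corsi minus 10 second zone start adjusted 5v5 corsi."""
    x = [p["oz_starts"] - p["dz_starts"] for p in players]
    y = [p["corsi_5v5"] - p["corsi_5v5_zs_adjusted"] for p in players]
    return x, y


# Invented sample rows, only to show the shape of the input.
sample_players = [
    {"oz_starts": 300, "dz_starts": 200, "corsi_5v5": 150, "corsi_5v5_zs_adjusted": 120},
    {"oz_starts": 250, "dz_starts": 300, "corsi_5v5": 20, "corsi_5v5_zs_adjusted": 35},
    {"oz_starts": 400, "dz_starts": 400, "corsi_5v5": -10, "corsi_5v5_zs_adjusted": -10},
]
x, y = zone_start_points(sample_players)
slope, intercept, r_squared = fit_line(x, y)
```

The slope is the estimated corsi value of one extra offensive zone start. Restricting `players` to a position or an ice time range before fitting is how one would get per-group slopes.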
What is interesting is that when I explored things further, I noticed that the results varied across positions, but varied very little across talent levels. Here are some more correlations for different positions and ice time restrictions.
|Group|Slope|r^2|
|---|---|---|
|All Players >1000 min.|0.30|0.55|
|Skaters >1000 min.|0.28|0.52|
|Forwards >1000 min.|0.26|0.50|
|Defensemen >1000 min.|0.33|0.57|
|Goalies >1000 min.|0.44|0.73|
|Forwards >500 min.|0.26|0.50|
|Forwards >2500 min.|0.26|0.52|
|Forwards 500-2500 min.|0.26|0.39|
1. The slope for forwards is less than the slope for defensemen which is (quite a bit) less than the slope for goalies.
2. There is no variation in slope no matter what restrictions we put on a forward’s ice time.
There isn’t really much to say regarding the second observation except that it is nice to see consistency, but the first observation is quite interesting. Goalies, who have no impact on corsi, see the greatest zone start influence on corsi of any position. It is a little odd, but I think it addresses one of the concerns that Eric pointed out in his article:
The next step would be to remove the last vestige of sampling bias from our analysis. The approaches that focus on the period immediately after the faceoff reduce the impact of teams’ tendency to use their best forwards in the offensive zone, but certainly do not remove it altogether.
I think that is exactly what we are witnessing here, but maybe more importantly, teams put out their best defensive players and their best face off guys for defensive zone face offs. If David Steckel, who is an excellent face off guy, is getting all the defensive zone face offs, it is naturally going to suppress the corsi events immediately after the defensive zone face off because he is going to win the draw more often than not. There is probably more line matching done for zone face offs than during regular play, so the line matching suppresses some of the zone start impact. It is more difficult to line match when changing lines on the fly, so during open play a good coach can more easily get favourable matchups for his offensive players. The result is that during normal 5v5 play offensive players might see a boost to their corsi (because they can exploit good matchups), while during offensive zone face offs they see their corsi suppressed because they will almost always be facing good defensive players and top face off guys. Thus, the boost to corsi from a zone start is not as extreme as it should be for offensive players. The opposite is true for defensive players.
Defensemen are less often line matched, so we see their corsi boost due to an offensive zone face off a little higher than that of forwards, but it isn’t nearly as high as that of goalies because there are defensemen who are primarily used in offensive situations and others who are primarily used in defensive situations.
Goalies, though, tell us the real effect because they are always on the ice and are not subject to any line matching. In the table above you will notice that goalies have a significantly higher slope and an impressively high r^2. I feel I have to post the chart of the correlation because it really is a nice chart to look at.
I have looked at a lot of correlations and charts in hockey stats but very few of them are as nice with as high a correlation as the chart above.
I believe that this is telling us that an offensive zone start is worth 0.44 corsi, but only when a player is playing against opponents as defensively capable as those he would face during regular 5v5 play, which I speculate above is not necessarily (or likely) the case. The 0.44 adjustment really only applies to an idealized situation that doesn’t normally occur for any players other than goalies.
So where does that leave us? Should we use a zone start adjustment of 0.44 corsi for all players, or should we use something like 0.33 for defensemen and 0.26 for forwards? The answer isn’t so simple. One could argue that we should apply 0.44 to all players and then make some sort of QoC adjustment, and that would make some sense. But if we are not intending to apply a QoC adjustment, does that mean we should use 0.33 and 0.26? Maybe, but that is a little inconsistent because it would mean you are using a QoC adjustment only for the zone start adjustment of a player’s stats, and not for all his stats. The answer for me is what I have been doing for the past little while: not even attempt to adjust a player’s stats for zone start differences, and instead simply ignore the portion of play that is subject to being influenced by zone starts, namely the 10 seconds after a zone start face off. To me it seems like the simplest and easiest thing to do.
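To make the choice between the two adjustments concrete, here is a small sketch of how much it matters for one player. The rates are the slopes from the table above; everything else is an assumption for illustration: the function name, the idea that the adjustment is a straight subtraction of rate times extra offensive zone starts, and the sample forward.

```python
# Slopes from the table above (corsi per extra offensive zone start).
GOALIE_RATE = 0.44
POSITION_RATES = {"F": 0.26, "D": 0.33}


def zone_start_adjusted_corsi(corsi, extra_oz_starts, rate):
    """Subtract the corsi a player is assumed to have gained from his
    extra offensive zone starts (a negative count adds corsi back)."""
    return corsi - rate * extra_oz_starts


# Hypothetical forward: +120 corsi with 100 extra offensive zone starts.
corsi, extra = 120, 100
uniform = zone_start_adjusted_corsi(corsi, extra, GOALIE_RATE)
positional = zone_start_adjusted_corsi(corsi, extra, POSITION_RATES["F"])
gap = positional - uniform
```

For this made-up forward the uniform 0.44 rate leaves about 76 corsi and the forward-specific 0.26 rate leaves about 94, a gap of 18 corsi that comes entirely from which rate you believe. Excluding the 10 seconds after the face off avoids having to pick one.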