Team Zone Entry Data and Predicting Standings
I am sure many of you are aware that Corey Sznajder (@ShutdownLine) has been working on tracking zone entries and exits for every game from last season. A week and a half ago Corey was nice enough to send me the data for every team for all the games he had tracked so far (I’d estimate approximately 60% of the season) and the past few days I have been looking at it. So, ultimately everything you read from here on is thanks to the time and effort Corey has put in tracking this data.
As I have alluded to on Twitter, I have some interesting and potentially very significant findings, but before I get to them let me summarize a bit of what is being tracked with respect to zone entries.
- CarryIn% – the percentage of zone entry attempts on which the team carried the puck over the blue line into the offensive zone.
- FailedCarryIn% – the percentage of zone entry attempts on which the team tried and failed to carry the puck over the blue line into the offensive zone.
- DumpIn% – the percentage of zone entry attempts on which the team dumped the puck into the offensive zone.
The three of these should sum up to 100% (Corey’s original data treated FailedCarryIn% separately so I made this adjustment) and represent the three different outcomes if a team is attempting to enter the offensive zone – successful carry in, failed carry in, and dumped in.
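As a rough illustration of that adjustment, here is a minimal sketch that turns raw entry counts into three shares that sum to 100%. The function name `entry_shares` and the idea of working from raw counts are my assumptions for illustration; the layout of Corey's actual data may differ.

```python
def entry_shares(carry_ins: int, failed_carry_ins: int, dump_ins: int) -> dict[str, float]:
    """Express each entry outcome as a percentage of all entry attempts.

    Hypothetical helper: assumes raw counts for the three outcomes are available.
    """
    total = carry_ins + failed_carry_ins + dump_ins
    if total == 0:
        raise ValueError("no zone entry attempts recorded")
    return {
        "CarryIn%": 100.0 * carry_ins / total,
        "FailedCarryIn%": 100.0 * failed_carry_ins / total,
        "DumpIn%": 100.0 * dump_ins / total,
    }


# Made-up counts, purely for illustration.
shares = entry_shares(carry_ins=45, failed_carry_ins=5, dump_ins=50)
```

By construction, the three values in `shares` add up to 100.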
I gathered all this information, for and against, for every team and put it in a table. I’ll spare you the details of how I arrived at the idea, but here is essentially what I came up with:
- Treat successful carry ins as a positive
- Treat failed carry in attempts as a negative (probably results in a quality counter attack against)
- Dump ins are considered neutral (ignored)
So, I then came up with NetCarryIn% which is CarryIn% – FailedCarryIn% and I calculated this for each team for and against to get NetCarryIn%For and NetCarryIn%Against for each team.
I then subtracted NetCarryIn%Against from NetCarryIn%For to get NetCarryIn%Diff.
All in one formula, we have:
NetCarryIn%Diff = (CarryIn%For – FailedCarryIn%For) – (CarryIn%Against – FailedCarryIn%Against)
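The formula above can be sketched in a few lines of Python. The function and parameter names are mine, and the numbers in the example are invented, not taken from the tracked data.

```python
def net_carry_in(carry_in_pct: float, failed_carry_in_pct: float) -> float:
    """NetCarryIn% = CarryIn% - FailedCarryIn% (dump-ins are ignored)."""
    return carry_in_pct - failed_carry_in_pct


def net_carry_in_diff(
    carry_for: float, failed_for: float, carry_against: float, failed_against: float
) -> float:
    """NetCarryIn%Diff = NetCarryIn%For - NetCarryIn%Against."""
    return net_carry_in(carry_for, failed_for) - net_carry_in(carry_against, failed_against)


# Invented example: a team carries in on 48% of its attempts and fails on 6%,
# while its opponents carry in on 44% and fail on 8%.
diff = net_carry_in_diff(48.0, 6.0, 44.0, 8.0)  # (48 - 6) - (44 - 8) = 6.0
```

A positive value means the team is netting more clean carry-ins than it allows.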
Hopefully I haven’t lost you. So, with that we now get the following results.
‘Playoffs’ indicates a playoff team and RegWin% is their regulation winning percentage (based on W-L-T after regulation time).
What is so amazing about this is we have taken the first ~60% of games and done an excellent job of predicting who will make the playoffs. The top 8 teams (and 11 of the top 12) in this stat through 60% of games made the playoffs and all of the bottom 8 missed the playoffs. That’s pretty impressive as a predictor. What’s more, the r^2 with RegWin% is a solid 0.42, significantly better than the r^2 with 5v5 CF%, which is 0.31. Here is what the scatter plots look like.
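For anyone who wants to run the same kind of comparison on their own numbers, here is a minimal sketch of an r^2 calculation, taken as the squared Pearson correlation between two lists. I am assuming that is the r^2 quoted above; the function name `r_squared` is mine, and no real team data is included.

```python
def r_squared(xs: list[float], ys: list[float]) -> float:
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined when one sequence is constant")
    return (sxy * sxy) / (sxx * syy)
```

You would pass in one list holding each team's NetCarryIn%Diff (or CF%) and a second list holding the same teams' RegWin%, in the same order.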
I think what we are seeing is this: if you carry the puck into the offensive zone more successfully than your opponent, without paying for it in costly turnovers on failed carry-in attempts, you win the neutral zone, and that goes a long way towards winning the game. Recall that I have shown that shots on the rush are of higher quality than shots generated from zone play, so an important key to winning is maximizing your shots on the rush and minimizing your opponent’s shots on the rush. To an extent, this stat may actually be measuring some level of shot quality.
Of course, why stop here? If it is in fact some sort of measure of shot quality, why not combine it with shot quantity? To do this I took NetCarryIn%Diff and added to it the team’s Corsi% – 50%. This is what we get.
|Team||Playoffs?||NetCarryIn%Diff – CF% over 50%|
New Jersey still messes things up, but New Jersey is just a strange team when it comes to these stats. But think about this: if New Jersey and Ottawa had made the playoffs instead of Philadelphia and Montreal, this stat would have had a perfect record in predicting the playoff teams. It was perfect in the Western Conference.
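The combined number described above is simple to compute. Here is a minimal sketch; the name `combined_score`, the team labels, and all of the values are hypothetical placeholders, not the real results.

```python
def combined_score(net_carry_in_diff: float, corsi_pct: float) -> float:
    """NetCarryIn%Diff plus the team's Corsi% above (or below) 50%."""
    return net_carry_in_diff + (corsi_pct - 50.0)


# Hypothetical teams: (NetCarryIn%Diff, CF%). These are not real numbers.
teams = {
    "Team A": (6.0, 52.5),
    "Team B": (-3.0, 51.0),
    "Team C": (1.5, 47.0),
}

# Rank teams from best to worst by the combined number.
ranking = sorted(teams, key=lambda name: combined_score(*teams[name]), reverse=True)
```

With these placeholder values the scores are 8.5, -2.0 and -1.5, so the ranking comes out as Team A, Team C, Team B.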
Compared to Regulation Win Percentage we get:
That’s a pretty nice correlation and far better than Corsi% itself.
Now, this could all be one massive fluke and none of it may be repeatable, but I am highly doubtful that will be the case. We may be on to something here. It will be interesting to see what individual players look like with this stat, and I’ll also take a look at whether zone exits should somehow get factored into this equation. I suspect it may not be necessary, as zone exits may be measuring something similar to Corsi% (shot quantity over quality).