Sep 26 2014
 

Last night on Twitter I posted some GF%RelTM statistics, which drew a number of comments, most notably from Stephen Burtch, who argued that players cannot be blamed for GF%, that it is nothing more than a fancy +/- stat, and that players can’t be blamed or given credit for things such as save percentage.

It isn’t just Burtch who holds this sentiment. In an article on ArcticIceHockey.com, HappyCaraT writes that “+/- is a stat that is pure luck.” There has been a lot of bashing of +/-, some fair, some overblown, and the result is this kind of sentiment. To suggest that +/- or some similar stat is all luck and has no validity or usefulness is just silly. Yes, +/- is heavily team driven, but so are Corsi and nearly every other NHL statistic, so that is no reason to toss it aside. You just have to take that into consideration and look at things like ‘Rel’ stats and WOWY analysis. Yes, it is affected by luck and randomness, but with large enough sample sizes that is largely mitigated and the stat becomes predictive of future performance.

Now, to address Burtch’s specific comment about on-ice save percentage: I don’t understand why anyone believes players cannot influence it. I have written about this before, but we know players can impact save percentage because score effects are real. When players are protecting a lead they give up more shots, but those shots end up as goals at a lower rate, even though the leading team is presumably facing the opposition’s best offensive players, who definitely have better shooting percentages overall. Good luck doesn’t only happen when you are protecting a lead and bad luck doesn’t always happen when you are trailing.

Furthermore, in recent months the following have been discovered:

These two observations taken together imply that players who are better at minimizing clean zone entries against should be able to boost their goalie’s on-ice save percentage. Who was the best Leafs defenseman in terms of limiting successful zone entries against last season? Dion Phaneuf. Who on the Leafs had the best Save%RelTM last year? Gunnarsson, who played mostly with Phaneuf. Phaneuf was a close second. In fact, over the past 4 seasons Phaneuf’s Save%RelTM has been +1.3%, +1.8%, +1.6% and +2.1%. Pretty consistently good. Is it a coincidence that a defenseman who is good at limiting successful zone entries against is also good at boosting his goalie’s save percentage? I suspect not.

Now, what about Polak? Well, he has been -1.7, -2.4, -0.7, and -1.1. Not so good. Robidas has been -3.1, -3.5, -0.6, and -2.1. Wow, look at that. It’s a trend, and not a good one. Should we be predicting a tougher season for Maple Leaf goalies? Probably so.

When I get more time (I am currently working on my new website, where you’ll get access to these RelTM stats) I’ll do some more research into the connection between zone entries against and save percentage. Until then, I think there is at least some good evidence that limiting zone entries against is a big factor in being able to boost your on-ice save percentage (as well as your goalie’s save percentage).

So, can we please get past the idea that a statistic like GF% or GF%RelTM has zero merit and that all hockey analytics must be done using Corsi or Fenwick? Are there special concerns that need to be considered with these statistics? Sure, but calling them irrelevant, all luck, and not useful is the kind of thinking that is only going to limit progress in hockey analytics. Shot quality exists and it’s real. At both ends of the rink. To take hockey analytics to the next level we need to research it and understand it better, not continually minimize it.

Aug 26 2014
 

I am sure many of you are aware that Corey Sznajder (@ShutdownLine) has been working on tracking zone entries and exits for every game from last season. A week and a half ago Corey was nice enough to send me the data for every team for all the games he had tracked so far (I’d estimate approximately 60% of the season) and the past few days I have been looking at it. So, ultimately everything you read from here on is thanks to the time and effort Corey has put in tracking this data.

As I have alluded to on Twitter, I have some interesting and potentially very significant findings, but before I get to them let me summarize a bit of what is being tracked with respect to zone entries.

  • CarryIn% – the percentage of the time the team carried the puck over the blue line into the offensive zone.
  • FailedCarryIn% – the percentage of the time the team tried to carry the puck over the blue line into the offensive zone and failed.
  • DumpIn% – the percentage of the time the team dumped the puck into the offensive zone.

The three of these should sum up to 100% (Corey’s original data treated FailedCarryIn% separately so I made this adjustment) and represent the three different outcomes if a team is attempting to enter the offensive zone – successful carry in, failed carry in, and dumped in.
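As a rough sketch of that adjustment, assuming the raw tracking data comes as counts of each outcome (the function name and the counts below are hypothetical, not Corey’s actual format), the three shares can be computed so that they always sum to 100%:

```python
def entry_shares(carry_ins, failed_carry_ins, dump_ins):
    """Return (CarryIn%, FailedCarryIn%, DumpIn%) as shares of all entry attempts."""
    total = carry_ins + failed_carry_ins + dump_ins
    if total == 0:
        raise ValueError("no zone entry attempts recorded")
    return tuple(100.0 * n / total for n in (carry_ins, failed_carry_ins, dump_ins))


# Hypothetical counts, for illustration only.
carry_pct, failed_pct, dump_pct = entry_shares(450, 50, 500)
print(carry_pct, failed_pct, dump_pct)
```

Putting failed carry-ins in the denominator is what makes the three outcomes directly comparable as shares of entry attempts.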

I gathered all this information for and against for every team and put them in a table. I’ll spare you all the details as to how I arrived at this idea I had but here is what I essentially came up with:

  • Treat successful carry ins as a positive
  • Treat failed carry in attempts as a negative (probably results in a quality counter attack against)
  • Dump ins are considered neutral (ignored)

So, I then came up with NetCarryIn% which is CarryIn% – FailedCarryIn% and I calculated this for each team for and against to get NetCarryIn%For and NetCarryIn%Against for each team.

I then subtracted NetCarryIn%Against from NetCarryIn%For to get NetCarryIn%Diff.

All in one formula we have:

NetCarryIn%Diff = (CarryIn%For – FailedCarryIn%For) – (CarryIn%Against – FailedCarryIn%Against)
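For readers who prefer code, here is a minimal sketch of the same formula. The function names and the sample percentages are made up for illustration and are not taken from the tracking data:

```python
def net_carry_in_pct(carry_in_pct, failed_carry_in_pct):
    """Successful carry-ins count as a plus, failed attempts as a minus; dump-ins are ignored."""
    return carry_in_pct - failed_carry_in_pct


def net_carry_in_pct_diff(carry_for, failed_for, carry_against, failed_against):
    """NetCarryIn%For minus NetCarryIn%Against."""
    return net_carry_in_pct(carry_for, failed_for) - net_carry_in_pct(carry_against, failed_against)


# Hypothetical team: 48% carry-ins and 6% failed carry-ins for,
# 42% carry-ins and 8% failed carry-ins against.
diff = net_carry_in_pct_diff(48.0, 6.0, 42.0, 8.0)
print(diff)
```

In this made-up case the team nets 42% for and 34% against, so its NetCarryIn%Diff is 8 percentage points.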

Hopefully I haven’t lost you. So, with that we now get the following results.

Team Playoffs? NetCarryIn%Diff RegWin%
Chicago Playoffs 12.2% 61.0%
Tampa Playoffs 6.1% 53.0%
Anaheim Playoffs 5.9% 64.6%
Colorado Playoffs 5.5% 59.1%
Detroit Playoffs 4.7% 51.2%
Minnesota Playoffs 4.1% 53.0%
Pittsburgh Playoffs 4.0% 59.8%
Dallas Playoffs 3.8% 51.8%
New Jersey . 3.4% 48.2%
Los Angeles Playoffs 1.7% 53.7%
Boston Playoffs 1.3% 67.1%
St. Louis Playoffs 1.2% 60.4%
Ottawa . 0.9% 47.6%
Columbus Playoffs 0.7% 51.8%
Edmonton . 0.7% 35.4%
NY Rangers Playoffs -0.1% 54.9%
Phoenix . -1.3% 48.8%
Montreal Playoffs -1.3% 53.0%
Vancouver . -1.7% 43.9%
Philadelphia Playoffs -1.8% 53.0%
Winnipeg . -1.8% 43.3%
San Jose Playoffs -2.3% 59.1%
NY Islanders . -3.0% 40.2%
Toronto . -4.8% 42.7%
Nashville . -6.0% 50.6%
Calgary . -6.4% 38.4%
Washington . -6.4% 46.3%
Florida . -6.7% 35.4%
Carolina . -6.8% 47.0%
Buffalo . -7.7% 25.6%

‘Playoffs’ indicates a playoff team and RegWin% is their regulation winning percentage (based on W-L-T after regulation time).

What is so amazing about this is that we have taken the first ~60% of games and done an excellent job of predicting who will make the playoffs. The top 8 teams (and 11 of the top 12) in this stat through 60% of games made the playoffs and all of the bottom 8 missed the playoffs. That’s pretty impressive as a predictor. What’s more, the r^2 with RegWin% is a solid 0.42, significantly better than the r^2 with 5v5 CF%, which is 0.31. Here is what the scatter plots look like.
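The playoff counts and the r^2 can be reproduced from the table above with a few lines of Python. This is only a sketch: it assumes r^2 means the square of a plain Pearson correlation, which the post does not spell out, so the printed value may not match the quoted 0.42 exactly.

```python
# (team, made_playoffs, NetCarryIn%Diff, RegWin%) copied from the table above.
TEAMS = [
    ("Chicago", True, 12.2, 61.0), ("Tampa", True, 6.1, 53.0),
    ("Anaheim", True, 5.9, 64.6), ("Colorado", True, 5.5, 59.1),
    ("Detroit", True, 4.7, 51.2), ("Minnesota", True, 4.1, 53.0),
    ("Pittsburgh", True, 4.0, 59.8), ("Dallas", True, 3.8, 51.8),
    ("New Jersey", False, 3.4, 48.2), ("Los Angeles", True, 1.7, 53.7),
    ("Boston", True, 1.3, 67.1), ("St. Louis", True, 1.2, 60.4),
    ("Ottawa", False, 0.9, 47.6), ("Columbus", True, 0.7, 51.8),
    ("Edmonton", False, 0.7, 35.4), ("NY Rangers", True, -0.1, 54.9),
    ("Phoenix", False, -1.3, 48.8), ("Montreal", True, -1.3, 53.0),
    ("Vancouver", False, -1.7, 43.9), ("Philadelphia", True, -1.8, 53.0),
    ("Winnipeg", False, -1.8, 43.3), ("San Jose", True, -2.3, 59.1),
    ("NY Islanders", False, -3.0, 40.2), ("Toronto", False, -4.8, 42.7),
    ("Nashville", False, -6.0, 50.6), ("Calgary", False, -6.4, 38.4),
    ("Washington", False, -6.4, 46.3), ("Florida", False, -6.7, 35.4),
    ("Carolina", False, -6.8, 47.0), ("Buffalo", False, -7.7, 25.6),
]


def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Rank teams by NetCarryIn%Diff, best first, and count playoff teams at each end.
ranked = sorted(TEAMS, key=lambda t: t[2], reverse=True)
top8_playoffs = sum(t[1] for t in ranked[:8])
top12_playoffs = sum(t[1] for t in ranked[:12])
bottom8_playoffs = sum(t[1] for t in ranked[-8:])
r2 = r_squared([t[2] for t in TEAMS], [t[3] for t in TEAMS])
print(top8_playoffs, top12_playoffs, bottom8_playoffs, round(r2, 2))
```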

[Figure: NetCarryIn%Diff vs. RegWin% scatter plot]

[Figure: CF% vs. RegWin% scatter plot]

I think what we are seeing is that if you are more successful than your opponent at carrying the puck into the offensive zone, without paying for it in costly turnovers on failed carry-in attempts, you will win the neutral zone, and that goes a long way towards winning the game. Recall that I have shown that shots on the rush are of higher quality than shots generated from zone play, so an important key to winning is maximizing your shots on the rush and minimizing your opponent’s shots on the rush. To an extent this may in fact be measuring some level of shot quality.

Of course, why stop here? If it is in fact some sort of measure of shot quality, why not combine it with shot quantity? To do this I took NetCarryIn%Diff and added to it the team’s Corsi% – 50%. This is what we get.

Team Playoffs? NetCarryIn%Diff + (CF% – 50%)
Chicago Playoffs 17.7%
Los Angeles Playoffs 8.5%
New Jersey . 7.8%
Tampa Playoffs 7.1%
Detroit Playoffs 6.2%
Anaheim Playoffs 5.7%
Boston Playoffs 5.2%
St. Louis Playoffs 4.3%
Dallas Playoffs 4.3%
Ottawa . 3.3%
Minnesota Playoffs 2.7%
Pittsburgh Playoffs 2.7%
Colorado Playoffs 2.5%
NY Rangers Playoffs 2.3%
San Jose Playoffs 1.4%
Columbus Playoffs 0.6%
Vancouver . -0.4%
Phoenix . -0.8%
Winnipeg . -1.7%
Philadelphia Playoffs -1.8%
NY Islanders . -3.6%
Montreal Playoffs -4.6%
Edmonton . -5.0%
Florida . -5.7%
Carolina . -6.5%
Nashville . -7.5%
Washington . -8.7%
Calgary . -10.1%
Toronto . -11.9%
Buffalo . -14.7%

New Jersey still messes things up, but New Jersey is just a strange team when it comes to these stats. But think about this. If New Jersey and Ottawa had made the playoffs over Philadelphia and Montreal, this stat would have had a perfect record in predicting the playoff teams. It was perfect in the Western Conference.
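To make the combined number concrete, here is a small sketch of the arithmetic. The post does not list each team’s CF%, so the values printed below are backed out from the two tables (combined value minus NetCarryIn%Diff, plus 50) and should be read as derived, not quoted. The function names are my own.

```python
def combined_score(net_carry_in_diff, cf_pct):
    """NetCarryIn%Diff plus the team's Corsi% margin over 50%."""
    return net_carry_in_diff + (cf_pct - 50.0)


def implied_cf_pct(net_carry_in_diff, combined):
    """Invert combined_score to recover the CF% that the two tables imply."""
    return 50.0 + combined - net_carry_in_diff


# (NetCarryIn%Diff, combined value) pairs taken from the two tables above.
table_values = {"Chicago": (12.2, 17.7), "Toronto": (-4.8, -11.9)}
implied = {team: implied_cf_pct(nd, comb) for team, (nd, comb) in table_values.items()}
for team, cf in implied.items():
    print(team, round(cf, 1))
```

The implied CF% comes out at roughly 55.5% for Chicago and 42.9% for Toronto, which shows how a strong or weak possession game stretches the gap that NetCarryIn%Diff alone suggests.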

Compared to Regulation Win Percentage we get:

[Figure: NetCarryIn%Diff + (CF% – 50%) vs. RegWin% scatter plot]

That’s a pretty nice correlation and far better than Corsi% itself.

Now, this could all be one massive fluke with none of it repeatable, but I am highly doubtful that will be the case. We may be on to something here. It will be interesting to see what individual players look like with this stat, and I’ll also take a look at whether zone exits should somehow get factored into this equation. I suspect it may not be necessary, as zone exits may be measuring something similar to Corsi% (shot quantity over quality).

 

Aug 09 2014
 

The other day over at PensionPlanPuppets.com there was a post by Draglikepull looking at zone exits by Maple Leaf defensemen for the first half of last season. If you haven’t seen it yet, definitely go read it. I wanted to compare the zone exit data to my rush shot data, which I have calculated from play by play data as explained here. If we can find good correlations between zone entry/exit data and my rush shot data, that would be an excellent finding because the zone entry/exit data needs to be manually recorded, which is very time consuming. Thankfully this is a project being undertaken by Corey Sznajder. If we can find useful correlations with data that can be automatically calculated, we may not need to do this in the future and Corey can have a summer vacation next year.

Let’s first look at defensive zone exit percentage and how it correlates with rush and non-rush shots.

PlayerName RushCF/60 OtherCF/60 (non-rush) Exit%
MORGAN RIELLY 11.5 39.8 27.5
CARL GUNNARSSON 10.6 35.1 25.9
DION PHANEUF 10.1 37.9 25.5
JAKE GARDINER 11.2 37.7 24.8
JOHN-MICHAEL LILES 15.5 41.9 24
CODY FRANSON 10.5 36.9 23.8
PAUL RANGER 12.0 32.9 20.5
MARK FRASER 14.5 34.7 13.3

One thing to note is that my rush shot data is for the full season while the Exit% data is for the first half of last year. Also, my rush shot data is road data only, to eliminate arena bias, and the figures for Liles and Fraser also include their time with Carolina and Edmonton respectively.

Let’s look at some charts to more easily see if a correlation exists.

 

[Figure: Leafs defensemen, defensive zone Exit% vs. rush shots for]

Ok, this is very counter-intuitive. The defensemen that have the best defensive zone exit percentage have a lower rush shot rate and a higher non-rush shot rate. On the surface this doesn’t make sense. If you are better at carrying the puck out of your own zone you should be able to generate more shots from the rush but that doesn’t seem to be the case. I think what is actually happening here is that to be able to carry the puck out of the defensive zone you have to be a skilled puck handler and if you are skilled with the puck you probably get more time in the offensive zone including more offensive zone starts and more ice time with offensive type forwards. Now, if you are not a good offensive defenseman you probably don’t get many offensive zone starts and get more defensive zone starts and maybe more importantly you play less with offensive minded forwards.

It should also be noted that Fraser is a bit of an anomaly here, as his defensive zone exit percentage is well below anyone else’s and his rush shot rate is quite good. If we take Fraser out of the charts the relationship is much flatter and the correlations get weaker. We need to look at more defensemen to get more conclusive results though. I also think we will get better results for forwards, as I generally think it is forwards that drive the offense, not the defensemen.
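The Fraser effect can be checked directly from the table above. The sketch below uses a plain Pearson correlation on the eight defensemen, which is my assumption about how the chart correlations were computed; with a sample this small, any single player can move the number a lot.

```python
from math import sqrt

# (player, RushCF/60, Exit%) copied from the table above.
DEFENSEMEN = [
    ("MORGAN RIELLY", 11.5, 27.5),
    ("CARL GUNNARSSON", 10.6, 25.9),
    ("DION PHANEUF", 10.1, 25.5),
    ("JAKE GARDINER", 11.2, 24.8),
    ("JOHN-MICHAEL LILES", 15.5, 24.0),
    ("CODY FRANSON", 10.5, 23.8),
    ("PAUL RANGER", 12.0, 20.5),
    ("MARK FRASER", 14.5, 13.3),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def exit_vs_rush_r(rows):
    """Correlation between Exit% and RushCF/60 for the given players."""
    return pearson_r([row[2] for row in rows], [row[1] for row in rows])


r_all = exit_vs_rush_r(DEFENSEMEN)
r_without_fraser = exit_vs_rush_r([row for row in DEFENSEMEN if row[0] != "MARK FRASER"])
print(round(r_all, 2), round(r_without_fraser, 2))
```

The correlation is negative in both cases, and it moves noticeably closer to zero once Fraser is dropped.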

Another factor in the non-relationship between defensive zone exits and rush shots for might be that often when a team exits the defensive zone they conduct a line change and maybe in particular a change in defensemen as the forwards are taking the puck up the ice. Defensemen may be able to get the puck out of their own end and initiate a rush but are on the bench before the benefits of the zone exit and follow-up rush have materialized. This could result in the lack of positive correlation between zone exits and rush shots. I need to create an “initiator of rush shots” statistic to account for this possibility.

In the comments of the PensionPlanPuppets.com article, Corey Sznajder provided statistics on zone entries against each defenseman. Most defensemen likely have significantly more control over zone entries against than they do over creating offense, so we might find stronger correlations here.

PlayerName RushCA/60 OtherCA/60 Carry% Against Break-up %
MARK FRASER 18.9 44.2 71.4 7.1
JAKE GARDINER 14.9 42.4 67.7 6
MORGAN RIELLY 16.1 49.4 67.7 4.3
CARL GUNNARSSON 13.1 56.4 64.4 11.3
CODY FRANSON 14.8 46.6 64.1 5.7
JOHN-MICHAEL LILES 10.5 45.2 55.2 6.9
PAUL RANGER 14.8 49.2 54.7 17.4
DION PHANEUF 13.7 58.1 53.1 13.4

 

[Figure: Leafs defensemen, rush shots against vs. Carry% Against]

Now this is a little closer to what we might expect. Those defensemen who have a high percentage of zone entries against being carry-in entries rather than dump-ins give up rush shots at a higher rate while also giving up non-rush shots at a lower rate. There doesn’t appear to be any correlation between Carry In % Against and total Corsi against per 60 (r^2=0.026), so it seems only the type of shot against is being impacted. I have observed that shots on the rush are significantly more difficult shots to stop (shooting percentage on rush shots over the last 7 seasons has been 9.56% vs 7.34% on non-rush shots, making rush shots about 30% more likely to go in on average), so players who can limit the frequency of carry-in rushes against and force dump-ins instead are in fact likely to reduce the average difficulty of shots against.
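The 30% figure, and what a change in shot mix does to the expected shooting percentage against, can be worked out as follows. The two shooting percentages are the ones quoted above. The rush shares of 20% and 30% are hypothetical and only there to illustrate the effect.

```python
RUSH_SH_PCT = 9.56      # shooting percentage on rush shots, from the text above
NON_RUSH_SH_PCT = 7.34  # shooting percentage on non-rush shots, from the text above


def relative_increase(rush_pct, non_rush_pct):
    """How much more often rush shots go in, relative to non-rush shots."""
    return rush_pct / non_rush_pct - 1.0


def blended_sh_pct(rush_share):
    """Expected shooting percentage against when rush_share of shots come off the rush."""
    return rush_share * RUSH_SH_PCT + (1.0 - rush_share) * NON_RUSH_SH_PCT


premium = relative_increase(RUSH_SH_PCT, NON_RUSH_SH_PCT)
print(round(100 * premium, 1))

# Hypothetical shot mixes: 20% vs. 30% of shots against coming off the rush.
low_mix = blended_sh_pct(0.20)
high_mix = blended_sh_pct(0.30)
print(round(low_mix, 2), round(high_mix, 2))
```

Under these made-up mixes, moving from 30% to 20% rush shots against lowers the expected shooting percentage against by a little over two tenths of a percentage point, with shot volume held equal.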

The real counter-intuitive observation is that from a strategy/tactics point of view, it might be better to start your defensive defensemen (i.e. the ones that have the ability to limit rushes against) in the offensive zone (for the Leafs this would be Phaneuf and Liles/Gleason last year) and start your defensemen who are strong offensively but weak against the rush (i.e. Rielly and Gardiner in particular) in the defensive zone. This is the opposite of what the Leafs did last season and generally the opposite of what most would normally consider doing. It makes sense though. When you are in your own zone you want defensemen who can get the puck and get it out, and when you are in the opposition’s zone you want defensemen who don’t give up high quality (often odd-man) rushes against. Defense should start in the offensive zone and offense should start in the defensive zone. The focus is generating offense on the rush and limiting the other team’s ability to generate offense on the rush. It’s a bit counter-intuitive but might prove to be smart strategy.

I look forward to when the zone entry/exit tracking project gets completed and we can look at a much larger sample with more players from more teams. Between that project and the rush shot data I have calculated, we should gain significantly more insight into the game and how it is played. We might even come up with some revolutionary new on-ice strategies.