Correcting a post on shot quality and save percentage revisited, again

It is becoming a regular occurrence: someone over at Hockey Graphs has attempted to debunk a theory/stat/opinion of mine and has once again failed in their procedure for doing so. This time Garret Hohl tried to debunk Sv% RelTM as a useful statistic by looking at the persistence and predictability of Sv% RelTM over time, despite the fact that just a month ago I suggested that evaluating the past and predicting the future are two different questions. They are different because a lack of persistence might be due to players changing teams or changing roles. What Garret didn’t do in his analysis was control for players changing teams or roles.

Let me step back here and state my theory as it pertains to a player’s ability to positively (or negatively) impact his team’s save percentage. My theory is aligned with what we see in score effects. With score effects, when defending a lead (and presumably playing more defensive hockey), teams see their save percentage rise. My theory for individual players is that when players are assigned defensive roles, it is likely that their style of play will result in a boost to their team’s save percentage.

To test this theory we need to be able to define the role a player is playing. Specifically, are they playing a more defensive role or a more offensive role? I am going to propose two statistics for doing this.

  1. LTIndex. I haven’t talked about LTIndex much and I don’t yet have it available on Puckalytics.com, but it can be used for this purpose. Let’s define LeadingTOI% as the percentage of his team’s time defending a lead that the player is on the ice for (player’s TOI in 5v5 leading situations / team’s TOI in 5v5 leading situations). Essentially, this is the percentage of lead-defending ice time that his coach trusts the player to be on the ice for. TrailingTOI% is the same except for 5v5 trailing situations, or the percentage of ice time the player is given when his team is playing catch-up. LTIndex is calculated by dividing LeadingTOI% by TrailingTOI%. Any value greater than 1.00 indicates the player is given a higher percentage of his team’s ice time defending a lead than playing catch-up. In other words, players with an LTIndex greater than 1.00 are biased towards playing a defensive role while players with an LTIndex less than 1.00 are biased towards playing an offensive role.
  2. DZone%. This is calculated much like OZone% but with DZone faceoffs in the numerator. DZone% = DZ faceoffs / (DZ faceoffs + OZ faceoffs). Players with more defensive zone faceoffs can likely be considered to be given more defensive roles, while players with more offensive zone faceoffs can be considered to be given more offensive roles. I am using DZone% so that correlations (positive or negative) will mean the same for both DZone% and LTIndex, as higher numbers will imply more defensive roles for both stats.
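To make the two definitions above concrete, here is a minimal Python sketch. The function names and the sample numbers are made up purely for illustration; they are not taken from Puckalytics.com or any real player, and the inputs are assumed to be 5v5 totals in consistent units (e.g. minutes).

```python
def toi_share(player_toi, team_toi):
    """Share of the team's ice time in one game state that the player was on for."""
    if team_toi <= 0:
        raise ValueError("team TOI must be positive")
    return player_toi / team_toi


def lt_index(player_lead_toi, team_lead_toi, player_trail_toi, team_trail_toi):
    """LTIndex = LeadingTOI% / TrailingTOI%, both measured in 5v5 situations."""
    leading_pct = toi_share(player_lead_toi, team_lead_toi)
    trailing_pct = toi_share(player_trail_toi, team_trail_toi)
    if trailing_pct == 0:
        raise ValueError("player has no trailing TOI, so LTIndex is undefined")
    return leading_pct / trailing_pct


def dzone_pct(dz_faceoffs, oz_faceoffs):
    """DZone% = DZ faceoffs / (DZ faceoffs + OZ faceoffs)."""
    total = dz_faceoffs + oz_faceoffs
    if total == 0:
        raise ValueError("player has no DZ or OZ faceoffs")
    return dz_faceoffs / total


# Hypothetical player: on the ice for 500 of his team's 1000 leading minutes
# and 250 of its 1000 trailing minutes, with 300 DZ and 200 OZ faceoffs.
example_lt = lt_index(500, 1000, 250, 1000)
example_dz = dzone_pct(300, 200)
print(f"LTIndex = {example_lt:.2f}, DZone% = {example_dz:.2f}")
```

With these invented totals the player has an LTIndex of 2.00 and a DZone% of 0.60, so both metrics would flag him as being used in a defensive role.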

Next, I grabbed 8-year statistics for forwards in 5v5close situations (using close to remove most/all score effects). The stats I grabbed are CF60 RelTM, CA60 RelTM, CF% RelTM, CSh% RelTM, CSv% RelTM, CF60 Rel, CA60 Rel, CF% Rel, CSh% Rel, and CSv% Rel. I am going to look at both RelTM and Rel stats just to see how they compare (in theory Rel should give stronger correlations as it is potentially a purer comparison with players in opposite roles to them).
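The comparison step can be sketched roughly as follows, assuming a plain Pearson correlation between one role metric and each stat across players. The post does not say which correlation measure or tool was used, so that choice is an assumption. The dictionary keys and the four sample rows are invented for illustration and are not real player data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def role_correlations(players, role_key, stat_keys):
    """Correlate one role metric (e.g. LTIndex) with each listed stat."""
    roles = [p[role_key] for p in players]
    return {key: pearson(roles, [p[key] for p in players]) for key in stat_keys}


# Invented rows, one per forward; a real run would use the 8-year 5v5close data.
sample_players = [
    {"lt_index": 1.20, "dzone_pct": 0.58, "csh_rel": -0.4, "csv_rel": 0.3},
    {"lt_index": 1.05, "dzone_pct": 0.52, "csh_rel": -0.1, "csv_rel": 0.1},
    {"lt_index": 0.95, "dzone_pct": 0.47, "csh_rel": 0.2, "csv_rel": -0.2},
    {"lt_index": 0.80, "dzone_pct": 0.41, "csh_rel": 0.5, "csv_rel": -0.1},
]

for role in ("lt_index", "dzone_pct"):
    print(role, role_correlations(sample_players, role, ["csh_rel", "csv_rel"]))
```

Because both role metrics are defined so that higher means more defensive, a negative coefficient for a stat means the same thing under either metric.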

Here is a chart of correlations that summarizes the findings.

[Chart: LTIndex_DZonePct_Correlations]

To summarize, the above shows that:

  • Players in more defensive roles generate fewer shots on offense while offensive players would generate more (makes sense).
  • Players in more defensive roles have a worse CF% rating while offensive players would have better CF% ratings (makes sense as they generate more shots for).
  • Players in defensive roles have a worse Corsi shooting percentage while offensive players have better shooting percentages (makes sense, we know offensive players can drive shooting percentage).
  • Players in defensive roles have a better Corsi save percentage while players in offensive roles have a negative impact on save percentage (as predicted).
  • Aside from CA60, there is good agreement between Rel and RelTM stats, though as expected Rel stats have slightly stronger correlations.
  • The correlations for CSv% are about 60% of the correlations for CSh%, which means the ability to drive shooting percentage is stronger than the ability to drive save percentage.

So I used two different methods for identifying players who play defensive roles. I ran correlations, and the correlations with CF60, CF% and CSh% make sense for both methods of defining a defensive role, while the correlation with CSv% for both methods matches the prediction of my theory. That is strong evidence in support of the theory that players have an ability to positively influence save percentage and that Sv%Rel and/or Sv%RelTM are measures of that ability.

None of this surprises me and none of it should surprise you either. We have seen it in score effects and I have provided numerous examples of players that seem to do so (and they are typically players with defensive roles, such as Brandon Sutter). If Hohl, or anyone else, wants to debunk Sv%Rel or Sv%RelTM as a useful stat using a method built around predictability, they must first take roles into account. If a player’s Sv%Rel is not persistent because of role changes, you cannot use that as evidence to debunk Sv%Rel as a useful stat. Remember, how to evaluate the past and how to predict the future are two different problems. Don’t mix them up.

 

This article has 8 Comments

  1. You make a fine point but you just stop when you should be continuing the analysis. What you seem to be doing is relying on the coach’s ability to determine the quality suppression techniques of players. That makes sense (either technique in theory sounds fine). I’d like to note, though, that you could define DZone% better to more closely reflect a “coach’s trust”. Do it like Micah McCurdy does: true defensive zone start percentage (only count when sent out by the coach, not when he’s pushed into his own zone). Also, to be fair, you haven’t shown how much effect changing teams should have.

    To get back on track, you still haven’t shown persistence. Yes I know, the ~.3 correlation you’ve shown is nice. But it’s still descriptive. With sh%, coaches do a great job of determining shot quality but we still need to regress and be careful. We are no closer to knowing how much talent there is here. Your point sounds like a copout. Roles matter a lot/people change teams and you need to take that into account. Yes thank you……so please do so. You haven’t done that here to completion. How much stock do we place in on-ice sv%? You haven’t answered that.

    What you should do is roll up your sleeves and do a more in-depth analysis. Attempt some predictive correlations (maybe 2 years vs. 2 years or whatever you want). Place players in buckets based on your requirements. For example: players whose roles stay the same/change….etc. I’m not saying you haven’t made a point, but after reading this I’m no closer to knowing the talent levels of on-ice sv%. If a player has a good Sv% RelTM after two years…..how much of the observed is talent?

    What I imagine you’ll find (like in sh%), is that players who are deemed more “trustworthy”, will be regressed less. But how much? We don’t know until the work is actually done.

    1. Just to be clear here (I don’t think I stressed this well or clearly enough). How do I know if a player with a good on-ice sv% is lucky? How do I know if this is something I should rely on him doing in the future (and to what extent)?

      1. Actually never mind. Matt Cane, in his response to you, just did the type of analysis I had in mind. So……I guess just disregard these posts. Have a good night, I guess.

        1. If you think his one or two tweets on the subject are sufficient for that type of analysis I guess you can be satisfied. Let me tell you that they are not. I don’t know the exact details of what he did but I am aware of many pitfalls that would lead him to getting the results he did. I am certain he did an incomplete investigation and jumped to a conclusion too soon.

          “I ran one correlation, didn’t find a connection, must not be a repeatable talent, post a tweet, research done” is not really sufficient to prove/disprove anything in my mind. This has led to many mistakes in hockey analytics in the past, such as the claims that zone starts have a huge impact on a player’s statistics and that you must regress shooting percentage 80% to the mean so it is hardly a talent worth considering.

          1. “If you think his one or two tweets on the subject is sufficient for that type of analysis I guess you can be satisfied. Let me tell you that it is not. I don’t know the exact details of what he did but I am aware of many pitfalls that would lead him to getting the results he did. I am certain he did an incomplete investigation and jumped to a conclusion too soon.”

            Ok, so please do that analysis. You did some work; Matt went a little farther than you did. Maybe he’s wrong, but he’s more right than you are right now. Just saying he shouldn’t stop there is a copout, because you stopped earlier than he did.

          2. Yes, there is more to research. There always will be, so if your argument against my work is that it is incomplete, sorry, I’ll never be able to satisfy you.

            What I am saying, though, is that lack of persistence should not be sufficient to conclude lack of talent or ability. There is ample evidence to indicate defensive players can boost save percentage. Score effects, which everyone accepts, indicate that, as does the relationship I showed above with two independent metrics that indicate defensive roles. Saying “Yeah but it is not predictive so don’t worry about it” is the wrong thing to do.

            I fully grant you that the ability to boost save percentage is a far less identifiable talent than the ability to boost shooting percentage and almost certainly matters less (maybe minimally) in overall player evaluation for the majority of players. However, for me, it just seems that for those who have an unusually significant defensive role or an unusually significant sheltered offensive role it can be an important factor to consider.

      2. To assume that good on-ice save percentage is lucky you must believe that those players who the coach puts out in defensive situations are generally and consistently more lucky than those that the coach puts out in offensive situations. Generally that isn’t how luck and randomness work.

  2. Looking at your chart and reading the details about your process I was wondering how we could reconcile the unexpected results in CA60. I tend to avoid CA because it considers blocked shots to be a bad thing, but from a defensive perspective a blocked shot stays out of the net 100% of the time, so why should we consider it a bad thing? So I wanted to take a look at SA and FA and see how they compare to CA. And since I was doing that I went ahead and looked at the other derivatives (FF, FF%, FSh%, etc…)

    I don’t have access to your LTIndex, and I didn’t feel like putting in the effort to sort through hundreds of skaters to make sure the data from multiple charts aligns, but it would be interesting to see the results with that metric as well. So my rudimentary analysis was only able to look at DZone%. However I was also curious about whether or not looking at Relative DZone% could be more beneficial, so I went ahead and looked at the correlations for that as well. The numbers are all just the R² values from a simple linear correlation in Excel, so if you have a simple means of doing a different correlation perhaps the numbers would work out differently.

    #1 SF60 RelTM is a slight improvement over CF60 RelTM
    SF vs DZ = -.3023; CF vs DZ = -.2936
    Not really enough of a difference to really bother, but it is interesting to see (considering sample size we do expect SF to be as good or better than CF with this much data)

    #2 When looking at CF60 Rel there is a stronger correlation with DZone% Rel
    CF vs DZ = -.3029; CF vs DZ Rel = -.3544
    Oddly the DZone% Rel has a lower correlation with the RelTM stats (SF -.2646 and CF -.2570), so it’s not something that can be universally used.

    #3 SA60 Rel improves over CA60 Rel
    SA vs DZ = -.0113; CA vs DZ = .0083
    It is still a tiny correlation and could be largely irrelevant, but it does give us the negative trend we would expect from defensive specialists (i.e. better capable of preventing the opponent from getting pucks on net).

    #4 DZone% Rel provides improvements for both RelTM and Rel defensive possession stats
    SA RelTM vs DZ Rel = -.0157; FA RelTM vs DZ Rel = -.0094; CA RelTM vs DZ Rel = .0002
    SA Rel vs DZ Rel = -.0251; FA Rel vs DZ Rel = -.0148; CA Rel vs DZ Rel = .0029
    Both SA and FA have much better correlations than CA (although still incredibly small) and both give us the expected negative trend. I wasn’t expecting SA to be as much better than FA, but it does support my assumption that using CA for evaluating defense is lacking because it incorrectly assumes blocked shots are always a bad thing.

    #5 When looking at CF% Rel there is a slight benefit to using DZone% Rel
    CF% vs DZ = -.2571; CF% vs DZ Rel = -.2819

    #6 FSh% is a slight improvement over CSh%
    FSh RelTM vs DZ = -.1608; CSh RelTM vs DZ = -.1555
    FSh Rel vs DZ = -.2257; CSh Rel vs DZ = -.2214
    Like with SF vs CF way up in my first point the difference isn’t enough to need to bother, but it is still interesting to see that it is a little higher (possibly due to the fact that shooters are not entirely capable of avoiding defenders who jump into lanes to block shots).

    #7 Oddly we find that DZone% Rel improves on both the RelTM and Rel Sh%
    FSh RelTM vs DZ Rel = -.2129; CSh RelTM vs DZ Rel = -.2046
    FSh Rel vs DZ Rel = -.2769; CSh Rel vs DZ Rel = -.2705

    #8 When looking at CSv% Rel there is a slight benefit to using DZone% Rel
    CSv Rel vs DZ = .0723; CSv Rel vs DZ Rel = .0846
    It isn’t a big improvement, and is a rather small correlation either way, but it is interesting to see. Although we do see a slight decline in the RelTM vs DZ Rel (CSv RelTM vs DZ = .0617; CSv RelTM vs DZ Rel = .0611).

    I was tempted to look into those because in the past I had come to the conclusion that I prefer less traditional stats usage than is normally seen. As I mentioned I don’t consider blocked shots to be a bad thing, so I like to use FA for defensive shot suppression (although at this sample size it appears I am better off just using SA), but because every blocked shot does keep a puck out of the net I tend to use CSv% (which does appear to be the best in this analysis). I do just the opposite for offense, counting every attempt a player makes by using CF but since the shooter is not wholly responsible for the defender’s actions using FSh% (and both also appear to be the best options in this analysis). Despite the fact that Corsi isn’t the best for the defensive side of things, we still see that as a differential CF% comes out on top. I was also interested in DZ Rel because in looking at past player comparisons I noticed that better possession teams have more OZ starts, so their defensive specialists are going to see a higher percentage than those on poor possession teams since they spend so much more time in their own end. But I would like to see if any of these trends also appear when looking at LTIndex (or using something more complex than a linear correlation).

Comments are closed.