Persistence of Sv%RelTM and failure of statistical models in hockey analytics

Over the last several days I have tweeted several times (here, here and here) about my Sv%RelTM statistic, which can be found on Puckalytics.com. Those tweets generated some interest from my followers and drew some skeptics as well.

 

The issue I have with that statement and others like it is that it uses a simple statistical model, applies it to all players, and then draws conclusions about all players from the results without really understanding what the model is telling us, or all the inherent problems with measuring players’ ability to impact shot quality against.

The most important factor is that goals are infrequent events, and a single season’s worth of data is simply not enough to reliably measure shooting and save percentages. This means we need a larger sample size to accurately measure these effects. Unfortunately, when you expand the sample size other factors come into play, most of which we never control for in our statistical models. Playing style seems to have a significant impact on the percentages (and on possession statistics too). Over the course of multiple seasons players move up and down lineups and are given different roles, players change teams, coaches get changed and implement new systems, young players improve with age, older players decline, etc. These are all significant factors that are rarely if ever controlled for, yet they all affect the reliability of the statistical models we apply.
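To put a rough number on the sample size problem, here is a minimal sketch that treats each shot against as an independent trial and computes the binomial standard error of an observed on-ice save percentage. The 92% save percentage and the shot counts are assumptions chosen for illustration, not figures from Puckalytics.com, and real shots are neither independent nor of equal quality.

```python
import math


def sv_pct_standard_error(true_sv_pct: float, shots_against: int) -> float:
    """Binomial standard error of an observed save percentage.

    Assumes every shot is an independent trial with the same save
    probability, which is a simplification.
    """
    return math.sqrt(true_sv_pct * (1.0 - true_sv_pct) / shots_against)


# Assumption: a 92% underlying on-ice save percentage.
ASSUMED_SV_PCT = 0.92

# Hypothetical on-ice shot-against totals (illustrative, not real data).
for shots in (400, 1200, 2000):
    se_points = 100 * sv_pct_standard_error(ASSUMED_SV_PCT, shots)
    print(f"{shots:>5} shots against: +/- {se_points:.2f} percentage points")
```

Under these assumptions the standard error is about 1.4 percentage points at 400 shots and about 0.8 at 1200, so tripling the sample still leaves a lot of noise.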

Possession stats are great because they are based on shots, and shots are frequently occurring events. With as little as a single season’s worth of data (or less) we can identify persistence in possession statistics. What is interesting, though, is that I have shown that over longer periods of time persistence in possession statistics starts to fall off (especially in shots-for statistics). All the other factors I mentioned above affect possession statistics too, but we don’t worry about them because we can identify possession talent using sample sizes small enough that they aren’t a factor. (In actual fact they are a factor, in that possession stats may be measuring playing style more than player talent.)

The reality is that to claim that players have zero talent to influence their goalie’s save percentage is an extraordinary claim that shouldn’t be made lightly. There are hundreds of players with differing skill sets, assigned different roles and playing different styles, and to conclude that none of those differences has an impact on goalie save percentage would be quite astonishing. In fact, this is something that Kyle Dubas recently talked about at the Sloan Conference.

“You read everything and I agree with it and it’s sound where Player X can’t influence the goalie’s save percentage in a repeatable manner or for a different goalie and so on and so forth, and I can never bring myself, even though I agree with all of the data information is correct, I can’t bring myself to fully admit and accept that a defenseman can’t or a forward in a defensive posture can’t alter the course of the game defensively.” –Kyle Dubas (about the 39:00 mark in video)

So I really wonder whether grouping all players together and running some simple statistical models is sufficient to claim that no player can impact goalie save percentage. I don’t believe it is, for a number of reasons. One is that we know save percentage varies with the score, because players play differently depending on whether they are protecting a lead or playing catch-up. If score effects impact save percentage, there should be no doubt that different players with different roles can do so as well, and thus we should be able to see this in the data.

The list below shows the 2010-2013 (3yr) leaders (top 20 skaters) in Sv%RelTM among players who had at least 1500 minutes of 5v5close ice time over that 3-year span and at least 1000 minutes of 5v5close ice time over the following 2-year span (2013-2015), which we will compare the 3-year data to.

Player               2010-13 Sv%RelTM   2013-15 Sv%RelTM
DAVID BACKES         3.1                0.8
JASON GARRISON       2.6                0.7
DUSTIN BROWN         2.5                0.3
LOGAN COUTURE        2.4                2.0
MATT NISKANEN        2.2                0.9
STEVE OTT            2.1                1.5
TYLER SEGUIN         1.9                0.0
ANDREW MACDONALD     1.9                0.4
ALEXANDER SEMIN      1.8                -0.2
BRIAN CAMPBELL       1.8                0.1
MARIAN HOSSA         1.8                1.4
WILLIE MITCHELL      1.7                0.4
BRANDON SUTTER       1.7                0.8
BOBBY RYAN           1.7                0.4
FRANS NIELSEN        1.6                -0.2
DAN BOYLE            1.6                1.7
SERGEI GONCHAR       1.5                1.4
MATT MOULSON         1.4                0.3
DION PHANEUF         1.4                1.7
KEVIN SHATTENKIRK    1.3                -0.5
Average              1.9                0.695

Looking at the above table there should be no doubt that there is some level of persistence there. Of the top 20 skaters in Sv%RelTM for the 3-year period, only 3 went on to post a negative Sv%RelTM in the following 2-year period, and none of those were significantly negative.
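The summary figures for the table above can be reproduced with a short script. This is only a sketch of the arithmetic: the values are copied from the table, and the variable names are my own choices for illustration.

```python
# (2010-13 Sv%RelTM, 2013-15 Sv%RelTM) for the 20 skaters in the table above.
SV_REL_TM = {
    "DAVID BACKES": (3.1, 0.8), "JASON GARRISON": (2.6, 0.7),
    "DUSTIN BROWN": (2.5, 0.3), "LOGAN COUTURE": (2.4, 2.0),
    "MATT NISKANEN": (2.2, 0.9), "STEVE OTT": (2.1, 1.5),
    "TYLER SEGUIN": (1.9, 0.0), "ANDREW MACDONALD": (1.9, 0.4),
    "ALEXANDER SEMIN": (1.8, -0.2), "BRIAN CAMPBELL": (1.8, 0.1),
    "MARIAN HOSSA": (1.8, 1.4), "WILLIE MITCHELL": (1.7, 0.4),
    "BRANDON SUTTER": (1.7, 0.8), "BOBBY RYAN": (1.7, 0.4),
    "FRANS NIELSEN": (1.6, -0.2), "DAN BOYLE": (1.6, 1.7),
    "SERGEI GONCHAR": (1.5, 1.4), "MATT MOULSON": (1.4, 0.3),
    "DION PHANEUF": (1.4, 1.7), "KEVIN SHATTENKIRK": (1.3, -0.5),
}

first_period = [first for first, _ in SV_REL_TM.values()]
second_period = [second for _, second in SV_REL_TM.values()]

# Group averages for each period.
mean_first = sum(first_period) / len(first_period)
mean_second = sum(second_period) / len(second_period)

# Skaters who fell below zero in the follow-up period.
negatives = [name for name, (_, second) in SV_REL_TM.items() if second < 0]

# Share of the group's 2010-13 edge that remained in 2013-15.
retained_share = mean_second / mean_first

print(f"2010-13 average: {mean_first:.1f}")
print(f"2013-15 average: {mean_second:.3f}")
print(f"Negative in 2013-15: {len(negatives)} of {len(SV_REL_TM)}")
print(f"Share of edge retained: {retained_share:.0%}")
```

On these 20 skaters the group average falls from 1.9 to 0.695, so roughly 37% of the first-period edge carries over, and 3 of the 20 end up below zero.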

Here are the top 15 players in 5v5close Sv%RelTM over the past 5 seasons combined.

Player               Sv%RelTM
SHAWN HORCOFF        2.5
BRYCE SALVADOR       2.5
DAVID BACKES         2.3
ERIK CONDRA          2.1
DANNY DEKEYSER       2.1
DALE WEISE           2.0
IAN COLE             2.0
LOGAN COUTURE        2.0
STEVE OTT            1.9
PAUL GAUSTAD         1.9
JASON GARRISON       1.8
SLAVA VOYNOV         1.7
TREVOR LEWIS         1.7
DANIEL WINNIK        1.7
CAM ATKINSON         1.7

 

What you will notice is that the majority of them are considered defensive specialists or at the very least quality 2-way players. Is this really evidence of randomness? No. There appears to be a certain type of player that rises to the top of this list. Randomness doesn’t produce this.

The ability of players to impact a goalie’s save percentage is real. It is difficult to detect reliably over small sample sizes, and over larger sample sizes it often gets washed out by other factors such as roster changes, coaching changes and changes in role. There may in fact not be many players who exhibit this talent to a significant degree, but that doesn’t mean it doesn’t exist. It most certainly does, and failure to recognize it will lead to failures in properly evaluating players. It is also important to take the time to understand what a statistical model is, and is not, telling us, and not to draw conclusions prematurely.

——

I have written on this topic several times previously. Here are a couple of posts that are worth reading. Eventually I hope I’ll be able to stop writing articles on this topic.

Defenders effect on save percentage

Why can’t players boost a goalies save percentage?