## Modern hockey thought and all-encompassing player evaluation metrics

Yesterday a paper about using regularized logistic regression to estimate player contribution in hockey came across my Twitter feed. I skimmed the article, not closely enough to fully understand it, but found some of the conclusions at least mildly interesting. This post is neither in support of nor against the paper; rather, it is a rebuttal to a rebuttal from Eric T at NHLNumbers.com.

To summarize the paper: the authors conducted a goal based analysis to estimate player contribution. To summarize Eric T’s rebuttal: he applauded the effort but suggested a shot based analysis would be more appropriate because that is where ‘modern hockey thought’ currently stands.

I think my biggest concern is that by focusing exclusively on goals, you allow for shooting percentage variance to have a significant impact on a player’s calculated value. Even with four years of data, variance plays a large role in the shooting and save percentages with a given player on the ice.

…

This is why much of modern hockey analysis starts with shot-based metrics; the shooting percentages introduce a lot of variance which must be accounted for to get a reasonable assessment of talent. If you used shots for your model, I suspect you’d easily identify more than a mere 60 players who have significantly non-zero talent levels — and the model could be further refined from there (e.g. give each shot a weight based on the shooter’s career shooting percentage).

That is in essence Eric T’s argument. Shooting percentages are unreliable so it is better to use a shot based approach (though I find it a little ironic that he then suggests incorporating shooting percentage again).

The statement that “even with four years of data, variance plays a large role in shooting and save percentages with a given player on the ice” is the one I have the biggest problem with. I have shown many times that goal scoring rates are a better predictor of future goal scoring than shot rates are when dealing with multiple seasons of data. Furthermore, any study that uses sufficient amounts of data (either by using multiple seasons of data or by grouping similar players and using their aggregate shooting percentage) has concluded that shot quality (the ability to sustain an elevated shooting percentage) exists and is significant. For example, we know that players who get a significant amount of ice time have significantly higher shooting percentages (see here and here and here), and just by looking at a list of players sorted by their long-term on-ice shooting percentages we see that good offensive players rise to the top and poor offensive players fall to the bottom (in no way can anyone conclude that that list is random in nature). There is ample evidence to suggest that with 4 years of data goal based metrics should be the preferred tool over shot/possession based metrics.

Eric T brought up Dwayne Roloson, Kent Huskins, Sean O’Donnell, and others as examples of where he feels the evaluation system failed, but pointing out a few counterexamples is not enough to toss the analysis out completely. There will always be exceptions and outliers when attempting to build an all-encompassing evaluation metric. For the methodology in the paper maybe it is Roloson and Huskins, but I can assure you that for any shot based metric it will be Tyler Kennedy and Scott Gomez.

The standard against which an all-encompassing metric should be tested is not “is it perfect,” with anything that fails that test tossed aside and ignored forever. These metrics will never be perfect and should never be used as the final say on a player’s value. In truth, they should be used to spark conversation, discussion and further investigation, not end it. When we see strange results, just as we shouldn’t assume they are true, we shouldn’t assume the whole methodology is worthless.

Furthermore, making any argument against a new methodology because it doesn’t conform to “modern hockey thought” and suggesting they revise it to make it conform more to “modern hockey thought” is plainly the worst thing one can do. The best discoveries in the history of humanity typically arise when people don’t conform to current thought processes but rather do something different. You are free to make an argument against something but make sure that argument is something deeper than “it doesn’t conform to modern hockey thought.”

Finally, my biggest beef with many in the pro corsi/possession/shot differential crowd is the way in which they immediately and abjectly dismiss anything that strays from a corsi/possession/shot differential analysis. This is as fundamentally misguided as claiming that corsi/possession/shot differential is meaningless and goals are the only tool one should use in player evaluation. The truth is, both methods provide value. The possession method primarily provides value when dealing with small sample sizes, as it reduces small sample size and random variance issues. Shot differential metrics are inherently flawed, though, because shot differential isn’t the end goal of the player (goal differential is what matters in the win/loss column) and because shot quality and the ability to drive/suppress shooting percentages exist and are real. There is nothing wrong with using possession metrics as an evaluation tool so long as we are aware of this limitation, just as there is nothing wrong with using goal based metrics as an evaluation tool so long as we are aware of their sample size, randomness and uncertainty limitations. Neither is perfect, both have their uses, both have their limitations, and in reality both should be considered in any player evaluation.

(Note: Just to be clear, because apparently Tyler Dellow has a poor ability to interpret words properly, my critique of Eric T’s critique of the goal based all-encompassing player evaluation metric does not in any way mean that I believe Dwayne Roloson helps his team score goals. To be completely honest, I seriously question how the authors of the paper incorporate goalies into the methodology, and this is supported by the fact that in my own all-encompassing player evaluation metrics, goal or shot based, I assume goalies have no influence on a team’s offensive production. Hope this clears the issue up for Tyler.)

I’m not sure why you keep trying to paint me as using exclusively shots. I don’t. This is a straw man, as I’ve pointed out elsewhere. Even in the text you quoted here, I said *start* with shots.

The whole point is that goals alone leaves variance as a larger factor than it needs to be, and that you need to look at shots to reduce that variance. I find it easier and more accurate to start with shots and then adjust for shooting percentages than to start with goals and calculate shooting percentages and sample sizes and regress the data accordingly, but either could be done, in principle.

You keep comparing goal rate to shot rate as a predictor. I keep telling you to do neither of the two. And you keep saying that I use exclusively shot rate and that goal rate is better. Hell, this is at least the third time you’ve quoted me as saying that both shot rate and shooting percentage are factors — and yet each time, you continue to paint me as someone who sorts by shot differential and reads the names at the top of the list.

It’s getting pretty frustrating. If you’re going to try to argue with me, I wish you’d take the time to understand what I do. Read some of my posts from the last year. Almost every single one includes a factor of shooting talent in some way.

If you believe we should start with a shot based analysis and then consider the percentages then maybe we aren’t that far apart but I just don’t understand why you are so against a goal based analysis and then considering that there may be some uncertainty in the results.

In a perfect world we’d have an infinite amount of data and in that world a goal based analysis would be the best approach because goals are what really matter in hockey and thus are what we really should be evaluating. We don’t live in a perfect world so we have to live with uncertainty. We can either live with the uncertainty of a goal based analysis or the uncertainty of a corsi based analysis (uncertainty because it isn’t as predictive of future goal scoring). So maybe what I don’t understand about you in particular is why do you believe that conducting a shot based analysis followed by a consideration of the percentages results in a more precise player evaluation than a goal based analysis.

The way I look at it is this: on the one hand, you are against conducting a goal based analysis, looking at the results and at Huskins’ shooting percentage, and concluding that Huskins is overvalued by the goal based analysis. On the other hand, you are for conducting a shot based analysis, looking at the results and at Gomez’s shooting percentage, and concluding that Gomez is overvalued by the shot based analysis.

I don’t understand why you are so abjectly against one and for the other?

I don’t understand why you keep insisting I’m so against it. In fact, it’s exactly what I’ve suggested you should do.

I’ll return you to the conversation we had just three weeks ago, where I said almost exactly what you’re saying here (and wondering why I don’t agree with it):

Regression to mean is worthless if everyone’s “true talent” is different. It may make some players’ evaluations worse. Crosby’s “true talent” on-ice shooting percentage is something north of 10% while Gomez’s is something south of 7%. If Gomez’s on-ice shooting percentage next season is 8% it probably needs to be adjusted down, whereas if Crosby’s is 10.0% it probably needs to be adjusted up. They have completely different confidence intervals and that is what we need to be considering: what confidence do we have in the observations?

Nobody’s saying that you should use the same on-ice sh% for everyone.

But since more extreme results are observed in small samples, you’ll be more accurate if you regress them a fraction of the way, as a function of sample size. Which is calculable and possible, but harder than just starting with shot differential and adjusting for shooting percentage, which is why I do the latter.

However, we’re now — again — into the details of exactly how to blend the two. Does that mean you’ve moved past claiming that I only use shots, or should I expect to see this happen again in a few weeks?

Yes, but the article in question was dealing with 4 years of data, hardly a small sample. It’s arguable that for many players, if you use more than that, you may get into career progression issues. We aren’t talking about 40 games where a player might have a 15% on-ice shooting percentage. With 4 years of data goals are easily more predictive of future goals than shots are, which is why I suggest it is better to use them as a starting point.

Fine, you use more than shots. I apologize and I’ll be sure to never suggest it again.

The greater point and the main point I was trying to make is that there isn’t anything inherently wrong with using a goal based analysis and it is a perfectly reasonable (and I will argue the better) method for doing player analysis when using 4 years of data.

But it is. That’s the point. It’s not so small that we have no idea what his talent is, but there’s a sizable variance component even after four years.

At that sample size, guys have been on the ice for 300 goals (some more, some less, of course). Standard error from simple variance over a sample that size is a half a percent for on-ice shooting percentage, so there’ll be lots of guys who are 1% away from their talent and a handful who are 1.5% away. And guys who’ve been on the ice for fewer minutes (like Huskins) can be farther still.
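The half-a-percent figure can be checked with a quick sketch. It assumes a simple binomial model and a league-average on-ice shooting percentage of 8% (an assumed value, not stated in the comment), so that roughly 300 goals corresponds to about 3750 shots. The function name is made up for illustration.

```python
import math


def binomial_standard_error(rate: float, shots: int) -> float:
    """Standard error of an observed proportion under a simple binomial model."""
    return math.sqrt(rate * (1.0 - rate) / shots)


# Assumption (not from the comment): an on-ice shooting percentage of 8%,
# so about 300 goals for implies 300 / 0.08 shots for.
assumed_rate = 0.08
goals_on_ice = 300
shots_on_ice = round(goals_on_ice / assumed_rate)

se = binomial_standard_error(assumed_rate, shots_on_ice)
print(f"shots: {shots_on_ice}, standard error: {se * 100:.2f} percentage points")
```

Under these assumptions the standard error comes out a little under half a percentage point, so a player sitting one point away from his talent would be roughly a two standard error result.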

Think of it this way: goalies get ~3x more ice time than skaters; how confident are you in your ability to tell the difference between a 92.5% goalie and a 91.5% goalie on one season’s data?

If you have a guy who’s at a 94% on-ice save percentage after 1500 minutes over four years (like Huskins), the best guess of his future on-ice save percentage is a bit above average, but nowhere near 94%. If you don’t make those regressions, you’ll consistently be giving a guy too much credit for the luck component of his results to date.

I mean, this whole discussion grew out of an article where a purely goal-based analysis loved Roloson because his team shot 0.7% better with him in net — over a period where he had something like 9000 minutes of even-strength ice time. How can you use that article as a jumping-off point for a discussion where your perspective seems to be that a skater’s 2000-4000 minute sample is large enough that we know their true talent well and variance isn’t a significant factor?

You’ve shown that we know the talent well enough that including shooting percentages improves the predictions over ignoring them altogether. But that’s not the same as showing that the shooting percentages are precisely known or that the predictions wouldn’t be improved further by making the kind of regressions I’m suggesting.

Just quickly pulling data from your own site, 129 guys played 2500+ minutes in both the ’07-10 and ’10-13 three-year chunks. The correlation between on-ice sh% in the first chunk and the second chunk is 0.33, so we’d expect the average player’s numbers to regress 67% of the way to the mean. The correlation between on-ice sv% in the first chunk and the second chunk is 0.058, so we’d expect it to regress 94.2% of the way to the mean. If you rate Kent Huskins as if he’s going to continue to stop 94% of the shots from going into his net, you’re going to really badly overrate him.
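The arithmetic behind those percentages is the usual rule of thumb that a group regresses by 1 - r. Here is a minimal sketch using the two correlations quoted above. The 92% league-average on-ice save percentage is an assumed placeholder, and applying the 2500+ minute correlation to Huskins' smaller sample is also an assumption (his own regression would, if anything, be larger).

```python
def regression_fraction(correlation: float) -> float:
    """Fraction of the distance to the mean a group is expected to regress."""
    return 1.0 - correlation


def regress_to_mean(observed: float, mean: float, correlation: float) -> float:
    """Shrink an observed rate toward the mean, keeping `correlation` of the gap."""
    return mean + correlation * (observed - mean)


# Correlations quoted in the comment above.
sh_fraction = regression_fraction(0.33)   # on-ice shooting percentage
sv_fraction = regression_fraction(0.058)  # on-ice save percentage

# Assumption: league-average on-ice save percentage of 92% (placeholder value).
assumed_mean_sv = 0.92
huskins_estimate = regress_to_mean(0.94, assumed_mean_sv, 0.058)

print(f"sh% regresses {sh_fraction:.1%}, sv% regresses {sv_fraction:.1%}")
print(f"regressed on-ice sv% estimate: {huskins_estimate:.4f}")
```

With these inputs a 94% observed on-ice save percentage shrinks to just above the assumed mean, which is the "a bit above average, but nowhere near 94%" point made earlier.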

Yes, we agree that the system failed for Kent Huskins (though that probably has less to do with his on-ice save % than his on-ice shooting %), but I am still not convinced that regressing to the mean inherently makes the system better. For some players such as Kent Huskins it may very well; for others it may very well not. If we did apply a regression I am certain there would still be players that are very poorly evaluated. I am pretty certain that if you regressed Crosby 94.2% to the mean his evaluation would be much worse off, because I am pretty certain that Crosby’s on-ice shooting percentage is something north of 10%, not something north of 10% regressed 94.2% of the way to the mean. Would such a regression methodology make the system better overall? Maybe. I haven’t seen that proven one way or another. Feel free to develop such a model and test your theory that regressing shooting percentage 94.2% to the mean gives us something that is more predictive of future goal scoring, though. I look forward to seeing the results.

It’s really tough arguing with someone who has so much trouble following simple sentences.

1) The regression for shooting percentage is 67%, not 94.2%

2) That regression is based on 2500+ minutes; we have almost 4500 minutes for Crosby and would therefore regress it significantly less than 67%

3) Crosby’s 4500+ minute on-ice sh% is 11.66%, so by drawing the line at 10%, you are regressing it almost 50% of the way to the mean yourself
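The "almost 50%" in the third point can be reproduced with a few lines. This sketch assumes a league-mean on-ice shooting percentage of 8.0%, a figure that comes up elsewhere in this thread but is not stated in the list itself.

```python
def implied_regression(observed: float, estimate: float, mean: float) -> float:
    """Fraction of the gap between observed and mean that an estimate gives up."""
    return (observed - estimate) / (observed - mean)


# Assumption: league-mean on-ice shooting percentage of 8.0%.
assumed_mean = 0.080
crosby_observed = 0.1166  # 4500+ minute on-ice sh% quoted above
crosby_line = 0.10        # the "north of 10%" line

fraction = implied_regression(crosby_observed, crosby_line, assumed_mean)
print(f"implied regression toward the mean: {fraction:.0%}")
```

Under that assumed mean, treating Crosby as a 10% on-ice shooter gives up a bit under half of the gap between his observed rate and the league mean.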

So once again, you’ve listened to me lay out a position, painted my position as far more extreme than it really is, and ridiculed that straw man argument.

Please please please stop doing that. Take the time to read what I write and understand it before you criticize it.

You are nitpicking the details just like you nitpicked the details in the original paper. There are scenarios where regressing to a mean will make a player look worse than a straight non-regressed goal based analysis would. If a player with an on-ice shooting percentage true talent of 9% (certainly possible) had an observed on-ice shooting percentage of 8.5% (again certainly possible) and you regressed it any amount at all towards the mean of 8.0%, your evaluation of that player gets worse. By regressing everyone to a league-wide mean you certainly will make some players’ evaluations worse, and thus I will be able to pick out a group of players from that evaluation system and say “Ha, it doesn’t work well for players X, Y and Z.” Now, you may be right and as a whole you might have a better evaluation system, but you haven’t proven that at all. Feel free to develop your own all-encompassing metric, publish the results in your own paper, and show us all how to do it better according to “modern hockey thought.” I look forward to critiquing it and pointing out all the instances where it failed.

Incidentally, this is true, of course. Regression to the mean does not mean that everyone inexorably moves toward the mean.

But it’s a well-established statistical fact (first identified in 1886) that while individuals do all sorts of crazy things, the group as a whole regresses by 1-r. Rather than demand I prove it to you, how about you read a few primers on basic use of simple statistics that describe the phenomenon.
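Both halves of this exchange (some individuals get worse, the group as a whole gets better) can be seen in a toy simulation. Everything in it is an assumption made for illustration: a made-up population of 400 players, talent spread around 8% with a 0.7-point standard deviation, 2000 shots per sample, and pure binomial luck. It uses no real NHL data and says nothing about line changes, injuries or score effects.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def observe(talent, shots, rng):
    """Observed shooting percentage over `shots` independent binomial trials."""
    goals = sum(1 for _ in range(shots) if rng.random() < talent)
    return goals / shots


def mse(predicted, actual):
    """Mean squared error between two equal-length lists."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: true talent centred on 8% with a 0.7-point spread.
talents = [rng.gauss(0.08, 0.007) for _ in range(400)]
shots = 2000
first = [observe(t, shots, rng) for t in talents]
second = [observe(t, shots, rng) for t in talents]

# Regress each first-sample result toward the group mean by 1 - r.
r = pearson(first, second)
mean_first = sum(first) / len(first)
regressed = [mean_first + r * (x - mean_first) for x in first]

raw_error = mse(first, second)
regressed_error = mse(regressed, second)

# Players whose regressed estimate is further from true talent than the raw one.
worse = sum(
    1 for x, g, t in zip(first, regressed, talents) if abs(g - t) > abs(x - t)
)

print(f"r = {r:.2f}")
print(f"raw MSE = {raw_error:.2e}, regressed MSE = {regressed_error:.2e}")
print(f"players made worse by regressing: {worse} of {len(talents)}")
```

In this toy setup a number of individual players do end up with a worse estimate after regressing, while the error across the whole group goes down. Whether real hockey data behaves the same way is exactly the open question in this thread.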

Yes, I understand the theory and in theory you could be right. But let’s not dismiss a published paper with results based solely on theory. Hockey is not, I believe anyway, a nice and simple game where traditional statistical theory can be applied and everything will necessarily work out as expected. It may, but I am not convinced we can assume it will. There is just too much going on (players changing lines, players changing teams, players changing roles, nagging injuries, coaching strategy, score effects, etc.) to assume regressing shooting percentage to the mean will necessarily produce a better all-encompassing player evaluation metric. I just think it is far more complex than that, but I am certainly willing to be proven wrong.

So just to be clear, here’s what happened there:

1) I say that players with 2500+ minutes will see their shooting percentage regress 67% to the mean, on average.

2) You give an estimate of Crosby’s shooting percentage that is regressed nearly 50% to the mean and say I’d be wrong to regress it 94.2% of the way to the mean.

3) I point out that Crosby’s 4500 minutes mean that I’d regress his less than 67%, not 94.2% — and that your own estimate is close to mine.

4) You say I’m nitpicking details.

If you think regressing to the mean is wrong, then argue that Crosby should be treated as an 11.6% on-ice shooter — or better, since he’s improved over the course of that six-year span. But if your own estimate of his shooting percentage is regressed by an amount similar to the amount I’m proposing, then why are you arguing with me about it? Why do I have to prove to you that it’s the right thing to do if you do it yourself?

As for the bit about how maybe well-established statistical principles don’t apply for hockey…I can’t say it’s impossible, but I think the burden of proof would be on you to show that hockey is different from medicine and sociology and psychology and economics and basketball and every other social science that deals with complex human interactions.

Also, if you are wondering why I might think you are against a goal based analysis from the link you supplied you wrote:

“This is a fundamental difference between us — you like using goal rates as a rating system, while I don’t. ”

Can’t get more explicit than that.

Eric, I admire a lot of the work you do. The zone entry stuff is really informative and in my opinion very useful, and I wish there was more data to do some serious analysis on it. You are also a little more open to shooting percentage as a talent than many, and maybe there is some misunderstanding of your viewpoint on my part, or we are just coming at things from different angles and perspectives and can’t see each other’s viewpoint. But I do find your attack on the paper because it focuses exclusively on goals, and your somewhat implied assumption that “modern hockey thought” means you must come at a problem from a shot based analysis first and foremost, unfortunate. Not just because I disagree with the sentiment (I think a goal based analysis is a perfectly good starting point if you use a large enough sample size, and there is no evidence that 4 years is not large enough, and go into the analysis understanding the limitations) but because I think it is wrong to dismiss the methodology as invalid because of that and because there are some arguably glaring flaws in the results. Any all-inclusive player evaluation will have glaring flaws in the results simply due to the nature of the problem, and there is no way around it.

Boy, that’s a pretty slick truncation of that quote. What are the very next two sentences?

Gee, it’s almost like I’m saying that the problem with using goal rate is that it doesn’t account for random variance component of shooting percentage, and that if you did that it’d be almost exactly what I did.

How you equate that with someone believing that people should only look at shots and not shooting percentage, I’ll never understand.

As for the last bit, my attack on the paper is because it was based *exclusively* on goals, which means there’s absolutely no way to separate variance out. My point is that modern hockey analysis must account for the variance of shooting percentage. I suggested starting with shots and then factoring in sh% talent as an easy way to do that, but it’s not the only way.

You might even notice that I said “*much* of modern hockey analysis starts with shot based metrics” rather than “all modern hockey analysis.” This was because I was explicitly thinking of your work when I wrote that sentence, and did not want to imply that my way was the only way. But I think you’d agree that modern analysis includes an understanding that shooting percentage is variable, and that focusing exclusively on goals leaves an analyst prone to coming up with dramatically incorrect results.

You keep misreading what I say as an extremist position that I don’t hold, and then criticizing me for that extremist position. It’s really frustrating.

David, I’m confused why you’re making your stand based upon this paper. It’s clear, based upon the authors’ writing (its treatment of +/- as the major metric, its claims that players like Colton Orr are super valuable and that players like Crosby and Malkin are much less valuable, the use of their +/- replacement on GOALIES) that the authors did not understand hockey as a sport and just sought to improve +/- by using advanced math without thinking about the results. In short, they blindly applied a tool to hockey without thinking about whether doing so in this fashion made sense.

This is not the article to defend in the argument about the value of goal based metrics.

I don’t agree with some of the stuff I read when I scanned through the paper, but just because I don’t agree with the paper doesn’t mean I have to agree with every critique of it. There are some valid reasons to be critical of the paper, as you point out and Eric did too in the second half of his critique letter. I simply object to the idea that the main reason to critique it is that it is goal based which does not conform to “modern hockey thought” as if we know so much about the game of hockey and hockey analytics that we are at a point where there is a ‘right way’ to view things and everything else can be considered an ‘invalid way’ especially when, to my knowledge, there hasn’t been one attempt to validate the so called ‘right way’ to build an all-encompassing player evaluation metric using 4 seasons of data.

You dropped the word “exclusively”. The critique I gave was that it was EXCLUSIVELY goal-based. And that modern hockey analysis requires acknowledging that there is sizable variance in goal-scoring and accounting for that in some way. As you do yourself, when it suits you.

Fair enough, and when you develop such a hybrid method I look forward to reviewing your results.

More on regression to the mean and on-ice shooting percentage, should anyone be interested.

nice work eric

I should create a site where I knowingly post incorrect analysis and then tell other people to prove me wrong.

The ironic thing is this whole debate is how to best deal with the existence of shot quality.

Don’t delete that comment. Sack up.