Oct 27, 2011
 

There has been a fair bit of discussion going on regarding shot quality the past few weeks among the hockey stats nuts.  It started with this article about defense independent goalie rating (DIGR) in the Wall Street Journal, and several others have chimed in on the discussion, so it is my turn.

Gabe Desjardins has a post today talking about his hatred of shot quality and how it really isn’t a significant factor and is dominated by luck and randomness.  Now, generally speaking, when others use the term shot quality they are mostly talking about things like shot distance/location, shot type, whether it was on a rebound, etc., because that is all data that is relatively easily available or easily calculated.  When I talk shot quality I mean the overall difficulty of the shot, including factors that aren’t measurable such as the circumstances (e.g. 2 on 1, one timer on a cross ice pass, goalie getting screened, etc.).  Unfortunately my definition means that shot quality isn’t easily calculated, but more on that later.

In Gabe’s hatred post he dismisses pretty much everything related to shot quality in one get-to-the-point paragraph.

 

Alan’s initial observation – the likelihood of a shot going in vs a shooter’s distance from the net – is a good one.  As are adjustments for shot type and rebounds.  But it turned out there wasn’t much else there.  Why?  The indispensable JLikens explained why – he put an upper bound on what we could hope to learn from “shot quality” and showed that save percentage was dominated by luck.  The similarly indispensable Vic Ferrari coined the stat “PDO” – simply the sum of shooting percentage and save percentage – and showed that it was almost entirely luck.  Vic also showed that individual shooting percentage also regressed very heavily toward a player’s career averages.  An exhaustive search of players whose shooting percentage vastly exceeded their expected shooting percentage given where they shot from turned up one winner: Ilya Kovalchuk…Who proceeded to shoot horribly for the worst-shooting team in recent memory last season.

So, what Gabe is suggesting is that players have little or no ability to generate goals aside from their ability to generate shots.  Those who follow me know that I disagree.  The problem with a lot of shot quality and shooting percentage studies is that sample sizes aren’t sufficient to draw conclusions at a high confidence level.  Ilya Kovalchuk may be the only one that we can say is a better shooter than the average NHLer with a high degree of confidence, but it doesn’t mean he is the only one who is an above average shooter.  It’s just that we can’t say that about the others at a statistically significant degree of confidence.

Part of the problem is that goals are very rare events.  A 30 goal scorer is a pretty good player but 30 events is an extremely small sample size to draw any conclusions over.  Making matters worse, of the hundreds of players in the NHL only a small portion of them reach the 30 goal plateau.  The majority would be in the 10-30 goal range and I don’t care how you do your study, you won’t be able to say much of anything at a high confidence level about a 15 goal scorer.
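To put a rough number on how little a single season tells us, here is a minimal sketch using a Wilson score interval for a shooting percentage. The shot totals are hypothetical (30 goals on 250 shots is an assumed figure for illustration, not a real player's line), and the Wilson interval is just one reasonable choice of method, not the one used in any of the studies discussed here.

```python
import math

def wilson_interval(goals, shots, z=1.96):
    """Approximate 95% Wilson score interval for goals / shots."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    p = goals / shots
    denom = 1 + z ** 2 / shots
    centre = (p + z ** 2 / (2 * shots)) / denom
    half = z * math.sqrt(p * (1 - p) / shots + z ** 2 / (4 * shots ** 2)) / denom
    return centre - half, centre + half

# Hypothetical shooter: same 12% shooting percentage, different sample sizes.
one_season = wilson_interval(30, 250)      # one season (assumed totals)
four_seasons = wilson_interval(120, 1000)  # four seasons at the same rate

for label, (low, high) in (("1 season", one_season), ("4 seasons", four_seasons)):
    print(f"{label}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

With these assumed totals, the four season interval comes out roughly half as wide as the single season one, which is the whole argument for using more data before declaring that a talent doesn't exist.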

The thing is though, just because you cannot say something at a high confidence level doesn’t mean it doesn’t exist.  What we need to do is find ways of increasing the sample size to increase our confidence levels.  One way I have done that is to use 4 years of data, and instead of using individual shooting percentage I use on-ice shooting percentage (this is useful in identifying players who might be good passers and have the ability to improve their linemates’ shooting percentage).  Just take the list of forwards sorted by on-ice 5v5 shooting percentage over the past 4 seasons.  The top of that list is dominated by players we know to be good offensive players and the bottom of the list is dominated by third line defensive role players.  If shooting percentage were indeed random we would expect some Moen and Pahlsson types to be intermingled with the Sedins and Crosbys, but generally speaking they are not.

A year ago Tom Awad did a series of posts at Hockey Prospectus on “What Makes Good Players Good.”  In the first post of that series he grouped forwards according to their even strength ice time.  Coaches are going to play the good players more than the not so good players, so this seems like a pretty legitimate way of stratifying the players.  Tom came up with four tiers, with the first tier of players being identified as the good players.  The first tier contained 83 players.  It will be much easier to draw conclusions at a high confidence level about a group of 83 players than about single players.  Tom’s conclusions are the following:

The unmistakable conclusions from this table? Outshooting, out-qualitying and out-finishing all contribute to why Good Players dominate their opponents. Shot Quality only represents a small fraction of this advantage; outshooting and outfinishing are the largest contributors to good players’ +/-. This means that judging players uniquely by Corsi or Delta will be flawed: some good players are good puck controllers but poor finishers (Ryan Clowe, Scott Gomez), while others are good finishers but poor puck controllers (Ilya Kovalchuk, Nathan Horton). Needless to say, some will excel at both (Alexander Ovechkin, Daniel Sedin, Corey Perry). This is not to bash Corsi and Delta: puck possession remains a fundamental skill for winning hockey games. It’s just not the only skill.

In that paragraph “shot quality” and “out-qualitying” are used to reference a shot quality model that incorporates things like shot location, out-finishing is essentially shooting percentage, and outshooting is self-explanatory.  Tom’s conclusion is that the ability to generate shots from more difficult locations is a minor factor in being a better player, but both being able to take more shots and being able to capitalize on those shots are of far greater importance.

In the final table in his post he identifies the variation in +/- due to the three factors.  This is a very telling table because it gives us an indication of how much each factors into scoring goals.  The following is the difference in +/- between the top tier of players and the bottom tier of players:

  • +/- due to Finishing:  0.42
  • +/- due to shot quality:  0.08
  • +/- due to out shooting:  0.30

In percentages, finishing ability accounted for 52.5% of the difference, out shooting 37.5% of the difference and shot quality 10% of the difference.  Just because we can’t identify individual player shooting ability at a high confidence level doesn’t mean it doesn’t exist.

If we use the above as a guide, it is fair to suggest that scoring goals is ~40% shot generation and ~60% the ability to capitalize on those shots (either through shot location or better shooting percentages from those locations).  Shooting percentage matters and matters a lot.  It’s just a talent that is difficult to identify.
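For anyone who wants to check the arithmetic, here is a small sketch that turns the three +/- differences quoted above into percentage shares. The variable names are my own; the three numbers are the ones from the final table in Tom's post as listed above.

```python
# Difference in +/- between top and bottom tier, by factor (from the list above).
contributions = {
    "finishing": 0.42,
    "shot quality": 0.08,
    "out shooting": 0.30,
}

total = sum(contributions.values())

# Share of the total difference attributable to each factor, in percent.
shares = {name: round(100 * value / total, 1) for name, value in contributions.items()}

# Shot generation vs. capitalizing on shots (location plus finishing).
shot_generation = shares["out shooting"]
capitalizing = round(shares["finishing"] + shares["shot quality"], 1)

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"shot generation: {shot_generation}%, capitalizing: {capitalizing}%")
```

The split works out to 37.5% for shot generation and 62.5% for capitalizing on shots, which is where the rough 40/60 figure comes from.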

A while back I showed that goal rates are better than corsi rates in evaluating players.  In that study I showed that with just 1 season of data, goals for rates predict future goals for rates just as well as fenwick for rates do, and with 2 years of data goals for rates significantly surpass fenwick for rates in terms of predictive power.  I also showed that defensively, fenwick against rates are very poor predictors of future goals against rates (to the point of uselessness) while goals against rates were far better predictors of future goals against rates, even at the single season level.

The Conclusion:  There simply is no reliable way of evaluating a player statistically at even a marginally high confidence level using just a single year of data.  Our choices are either performing a Corsi analysis and doing a good job at predicting 40% of the game or performing a goal based analysis and doing a poor job at predicting 100% of the game.  Either way we end up with a fairly unreliable player evaluation.  Using more data won’t improve a corsi based analysis because sample sizes aren’t the problem, but using more data can significantly improve a goal based analysis.  This is why I cringe when I see people performing a corsi based evaluation of players.  It’s just not, and never will be, a good way of evaluating players.

 

  21 Responses to “Some Thoughts on Shot Quality”

  1.  

    Some creative excerpting on your part. Don’t forget to include this line that I also penned in the same piece:

    “Now I’m not rejecting finishing and defensive/goaltending talent – they all exist.”

    ba dum bum…

    Ok, now you can go back to distorting what I wrote.

    •  

      Ahh, there you go. Covering your butt with one sentence after 10 saying pretty much the opposite.

      My beef isn’t that you aren’t willing to admit they exist, but that you keep insisting that it is so small that it is nearly irrelevant and any analysis of it is not “going to return much for all of the effort”. 60% is not irrelevant in my opinion.

  2.  

    Do you have any thoughts on Brendan Morrison? Particularly on the failure of your article Brendan Morrison and the failure of Corsi.

    It seems Corsi got it right. Your method fails here. This was the example you cherrypicked to try to make your case and it’s a complete failure so far this season.

    •  

      So I show 4 years of evidence supporting my claim and you want to throw that out the window based on 8 games played? Talk about drawing conclusions based on small sample size. 8 games doesn’t tell us squat. If I drew conclusions based on 8 games Kessel would score 92 goals this year and the Blue Jackets wouldn’t win a game all season.

      I have been on vacation the past few weeks so haven’t looked at stats at all but a quick look I see that Morrison is one of the few Flames forwards who doesn’t have a negative +/-. Not all that bad for a guy who came back too early from off season knee surgery. He might be out for a couple of months now so we may never know if your 8 game theory holds up, but if it makes you happy, you can stick to your 8 game theory.

  3.  

    You are missing the point when you try to hide behind sample size.

    If I wrote that Phil Kessel wasn’t very good then my analysis would have been flawed.

    You wrote that Brendan Morrison was a good signing by the Flames. He isn’t. The honest reaction would be to admit that and look into where you failed in your analysis. Should a good signing really have no points in his first eight games?

    •  

      I don’t evaluate anyone based on 8 games while the player is playing injured. That is just wrong. I stand by my previous evaluation of Brendan Morrison. If you want to argue otherwise you better have something better than ‘he has zero points in his 8 games so far so therefore your analysis is all wrong and he is indeed a bad signing.’

  4.  

    I have plenty of times criticized your methodology. You know that and so do I.

    You cherrypicked Brendan Morrison as an example of a player who would show your case to be right and the more mainstream Corsi analysis wrong. It was your selection. So far it looks like one of the most horrible picks you could make. So much for your methodology.

    •  

      I didn’t cherry pick him, I used him as an example. I have used several others too.

      Once again. Using 8 games to counter 4 years of data just shows how little you know about statistical analysis. But if it makes you happy, good on you.

    •  

      If you are so confident that my player evaluation method is completely wrong, why not write your best, most thorough counter argument and none of this “Brendan Morrison hasn’t scored a point in 8 games, your system sucks” nonsense.

  5.  

    David, I have told you why you are wrong repeatedly. It’s in your own comments – read them for more detail. You make a systematic error treating goals scored and shooting percentage as independent variables when they are clearly linked.

    You pick players who have had high shooting percentages in the past. Very few players are significantly enough different from average in shooting percentage to drive their scoring from a high shooting percentage unless they get a lot of shots. It is far more repeatable to look at the number of shots (Corsi). When you pick out a player like Morrison who has had a high shooting percentage and a low number of shots he is most likely in an unsustainable position. That is why the examples you cherrypick are more likely to fail when they disagree with the mainstream Corsi position. Morrison is not likely to not score all season or anything, but he is clearly a player who was a good pick to drop off offensively because he isn’t getting enough shots.

    When you pick a player who has a high shooting percentage and a low Corsi, you pick a player who will see an offensive drop. You being a contrarian pick these players as examples to prove your analysis and as we are seeing by the example that you chose to write about, it fails.

    •  

      That’s not proof, that is a theory. Now prove it.

      I have shown that shooting percentage is a repeatable talent, if you have enough data to identify it.

      I have shown that goal rates are a better predictor of future goal rates than corsi rates are. The more data you get (beyond 1 year), the better predictor goal rates are. The more corsi data you get (beyond one year), corsi doesn’t become a better predictor, and in some cases becomes worse.

      I have shown that shooting percentage is better correlated with goals scored than corsi rates are.

      Take a look at the top corsi for guys and take a look at the top shooting percentage guys over the past 3 or 4 years and tell me that the corsi list is more representative of the top offensive players.

      Don’t believe me? Go read Tom Awad’s articles on what makes good players good. He identifies shooting percentage as being a critical component of outscoring your opposition. At least as important as corsi, maybe more important.

      The problem with so many anti-shooting percentage studies is they use far too small a sample size to be able to identify shooting percentage as a talent. But if you increase the sample size either by using more years as I have done or by grouping similar players as Tom Awad has done you begin to see that shooting percentage is indeed a talent. It isn’t luck that Gaborik has an on-ice shooting percentage about double that of Travis Moen over the past 4 seasons.

  6.  

    It isn’t luck that Gaborik has an on-ice shooting percentage about double that of Travis Moen over the past 4 seasons.

    While we agree on this point, it IS luck that Brendan Morrison had a better than average shooting percentage.

    A shooting percentage is very dependent upon context. A player playing in offensive situations will get better shots than one in defensive situations. That is why Gaborik has a better shooting percentage than Moen. And that is why Morrison has no business being called a good signing.

    •  

      So you agree, shooting percentage matters and players can drive shooting percentage (that’s a step forward at least). Just not for players who you don’t think it matters for, such as Morrison. Can you provide me with the complete list of players it doesn’t matter for so I will be sure not to offend you with future articles?

  7.  

    David, that response was ridiculous. The vast majority of players have very little ability to drive things repeatedly based on shooting percentage. Much of what you see when you chart shooting percentage over time is a proxy for how the player in question is used. A defensive forward like Moen will not get into the same offensive situations Gaborik will. That is most of the signal you are “measuring” with your methodology.

    •  

      “The vast majority of players have very little ability to drive things repeatedly based on shooting percentage.”

      I am not convinced that that is any more true for shooting percentage than it is for corsi. The degree of variation seen in shooting percentage is greater than the degree of variation in fenwick for rates. The standard deviation on shooting percentage is 13% of the mean while the standard deviation on fenwick for per 20 minutes is 8% of the mean. The top shooting percentage guy is significantly more than double the worst shooting percentage guy while the top FF20 guy isn’t even 60% higher than the worst FF20 guy.

      But regardless, what you are telling me is that you are OK with an evaluation system (corsi) that significantly under values the (offensive) contribution of guys like Gaborik and significantly over values the (offensive) contribution of guys like Moen. You may be OK with that, I am not.

  8.  

    If you look at the context of when Gaborik or Moen are playing then nothing is overrated or underrated using a Corsi based method. Since you don’t when you look at shooting percentage. You think it is largely talent and not the context of when they play that determines their shooting percentage difference. As a result you get things wrong. Like your Brendan Morrison prediction.

    •  

      Please do a complete corsi analysis for Gaborik for me and give me an indication of where he ranks as an offensive player in the NHL. Now do the same for Mikael Samuelsson and compare the two. Consider whatever context you like but be clear with what you are doing so I, or anyone else, could apply the same methods to another player. I am just curious how you would do such an analysis/comparison for these two players because I haven’t seen you do any real player vs player comparison and would like to know how you would do this and what your results would be. Then we can compare notes to how I would do the analysis/comparison and we can let our readers decide which is the best methodology.

      Oh, and I do consider context. Sorry you missed all the teammate and opposition and zone start stats on my stats site. They aren’t just there for show, I do look at them.

  9.  

    You think you can give me busy work to somehow prove your case. I interpret that as you are too proud to admit you are wrong. Frankly it’s pathetic and immature as an argument method.

    You are not taking into account the context of Brendan Morrison’s shots. If you did you would see that his shooting percentage is not something he can replicate in all likelihood. He would need significant power play time to have any chance – and realistically Calgary won’t play him as their number one power play forward because he clearly isn’t that. The easiest way to take that into context is to assume that, with the exception of players who play significant power play time with no penalty kill time or penalty killers who play no power play time, all players will have, within random error, the same shooting percentage. It’s a more accurate assumption than any you use. In the cases of lots of power play time you might get a higher shooting percentage and lots of penalty kill time you will get a lower shooting percentage. Note that penalty kill and power play are proxies for defensive players and offensive players and you might find these players with other tests.

    The expectation value for the shooting percentage of almost all forwards within experimental error is the same. It is a simpler and better assumption to view them as being all the same with the caveats above. You can dig deeper with the awareness that it will be a small correction in almost all cases. Your problem is you think the small correction is important and have wasted tons of time and effort looking into the small corrections while neglecting or downplaying the much more important signal from the shots the player takes.

    Whenever you attempt to cherrypick a case to “prove” your case it will most likely finish like Brendan Morrison has worked. You are neglecting the dominant signal (shots produced) for a small correction (shooting percentage). When you pick a player who looks good by his small correction but not by the dominant signal, the player will probably fail and it will make you look stupid. Just like Brendan Morrison is doing to you. Remember you chose Morrison as the perfect example to show your case. It is not working out for you, is it?

    •  

      Clearly you:

      1. Haven’t read everything I have written and what Tom Awad has written on the subject. The evidence goes far beyond Brendan Morrison.

      2. Are infatuated with your 8 game sample from a single injured player as your prime evidence to overthrow my theory.

      3. Aren’t willing to do any work to present your player evaluation theory to the rest of the world.

      “You are neglecting the dominant signal (shots produced) for a small correction (shooting percentage).”

      You are ignoring the fact that variance in shooting percentage among NHLers is greater than variance in corsi rates. Honestly, if you don’t believe me, go read Tom Awad: http://www.puckprospectus.com/article.php?articleid=625

      He wrote:
      “The unmistakable conclusions from this table? Outshooting, out-qualitying and out-finishing all contribute to why Good Players dominate their opponents. Shot Quality only represents a small fraction of this advantage; outshooting and outfinishing are the largest contributors to good players +/-. This means that judging players uniquely by Corsi or Delta will be flawed: some good players are good puck controllers but poor finishers (Ryan Clowe, Scott Gomez), while others are good finishers but poor puck controllers (Ilya Kovalchuk, Nathan Horton). Needless to say, some will excel at both (Alexander Ovechkin, Daniel Sedin, Corey Perry). This is not to bash Corsi and Delta: puck possession remains a fundamental skill for winning hockey games. It’s just not the only skill.”

      Shot quality = shot location
      Out shooting = getting shots/corsi
      Out finishing = shooting percentage

      And if you look at the final table in the article you will see that the range between good players and bad players is greater for out finishing than it is for out shooting, indicating shooting percentage is a more important skill and confirming my observations. But hey, if you want to ignore ample statistical evidence and instead go with your 8 game sample size go right ahead. You’d be a fool, but that’s your right.

  10.  

    Honestly David. You are quote mining from people who DISAGREE with you. And you think you have a point because you can find a quote to take out of context as though it agrees with you.

    Tom Awad is talking about the usually small corrections to a Corsi based analysis. You throw out the Corsi analysis and start with the corrections. That is why you are wrong. Awad who you quote would disagree with you.
