Apr 01 2014
 

Last week Tyler Dellow had a post titled “Two Graphs and 480 Words That Will Convince You On Corsi%” which, you could say, left me less than convinced (read the comments). This post is my rebuttal, in which I will attempt to convince you of the importance of Sh% in player evaluation.

The problem with shooting percentage is that it suffers from small sample size issues. Over small sample sizes it often gets dominated by randomness (I prefer the term randomness to luck) but the question I have always had is, if we remove randomness from the equation, how important a skill is shooting percentage? To attempt to answer this I will look at the variance in on-ice shooting percentages among forwards as we increase the sample size from a single season (minimum 500 minutes of ice time) to 6 seasons (minimum 3000 minutes of ice time). As the sample size increases we would expect the variance due to randomness to decrease. This means that when the observed variance stops decreasing (or its rate of decrease slows significantly) as sample size increases, we know we are approaching the point where the remaining variance is variance in true talent and not small sample size randomness. So, without further ado, here is my first chart: on-ice shooting percentages for forwards in 5v5 situations.

 

[Chart: ShPctVarianceBySampleSize]

The decline in variance pretty much stops by the time you reach 5 years (2500+ minutes) of data, and after 3 years (1500+ minutes) the rate of decline has already slowed significantly. It is also worth noting that some of the drop off over longer periods of time is due to age progression/regression and not due to a reduction in randomness.

What is the significance of all of this? Well, at 5 years a 90th percentile player would have 45% more goals for, given an equal number of shots, than a 10th percentile player. A player one standard deviation above average will have 33% more goals for, given an equal number of shots, than a player one standard deviation below average.
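The idea that observed spread shrinks toward the true-talent spread as the sample grows can be illustrated with a small simulation. This is only a sketch under assumed numbers: the mean Sh%, the talent spread, and the shot counts below are made up for illustration and are not the data behind the chart.

```python
import random
import statistics

def simulate_observed_sd(n_players, shots_per_player, mean_sh=0.08,
                         talent_sd=0.008, seed=1):
    """Standard deviation of observed Sh% across simulated players."""
    rng = random.Random(seed)
    observed = []
    for _ in range(n_players):
        # Hypothetical true-talent Sh%, clipped so it stays a valid probability.
        true_sh = min(max(rng.gauss(mean_sh, talent_sd), 0.0), 1.0)
        goals = sum(1 for _ in range(shots_per_player) if rng.random() < true_sh)
        observed.append(goals / shots_per_player)
    return statistics.pstdev(observed)

# Shot counts loosely standing in for small, medium, and large samples.
for shots in (300, 900, 1800):
    print(shots, round(simulate_observed_sd(200, shots), 4))
```

In a run like this the observed standard deviation falls as shots increase but levels off above zero, because the part that remains is the assumed talent spread rather than randomness.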

Now, let’s compare this to the same chart for CF/20 to get an idea of how shot generation varies across players.

[Chart: CF20VarianceBySampleSize]

It’s a little interesting that the top players show no regression over time but the bottom line players do. This may be because terrible shot-generating players don’t stick around long enough. More important, though, is the magnitude of the difference between the top players and the bottom players. A 90th percentile CF20 player produces about 25% more shot attempts than a 10th percentile player, and a player one standard deviation above average in CF20 produces about 18.5% more than a player one standard deviation below average (over 5 years). Both of these are well below (almost half of) the 45% and 33% we saw for shooting percentage.

I hear a lot of ‘I told you so’ from the pro-Corsi crowd in regards to the Leafs and their losing streak and yes, their percentages have regressed this season, but I think it is worth noting that the Leafs are still an example of a team where CF% is not a good indicator of performance. The Leafs’ 5v5close CF% is 42.5% but their 5v5close GF% is 47.6%. The idea that CF% and GF% are “tightly intertwined”, as Tyler Dellow wrote, is not supported by the Maple Leafs this season, despite the fact that the Maple Leafs are the “pro-Corsi” crowd’s latest favourite “I told you so” team.

There is also some evidence that the Leafs have been “unlucky” this year. Their 5v5close shooting percentages over the past 3 seasons have been 8.82 (2nd), 8.59 (4th) and 10.54 (1st), while this year it has dropped to 8.17 (8th). The question is how much of that is luck and how much is the loss of Grabovski and MacArthur and the addition of Clarkson (who is a generally poor on-ice Sh% player). Either way, the Leafs’ Sh% is well below that of the past few seasons and some of that may be bad luck (and notably, not “regression” from years of “good luck”).

In summary, generating shots matters, but capitalizing on them matters as much or more.

 

Aug 02 2013
 

In his book Hockey Abstract, Rob Vollman talks about persistence and its importance in determining whether a particular statistic has value in hockey analytics.

For something to qualify as the key to winning, two things are required: (1) a close statistical correlation with winning percentage and (2) statistical persistence from one season to another.

More generally, persistence is a prerequisite for being able to call something a talent or a skill and how close it correlates with winning or some other positive outcome (such as scoring goals) tells us how much value that skill has.

Let’s look at persistence first. The easiest way to measure persistence is to look at the correlation of a statistic over some chunk of time vs some future chunk of time. For example, how well does a stat from last season correlate with the same stat this season (i.e. year over year correlation)? For some statistics, such as shooting percentages, it may even be necessary to go with larger sample sizes, such as 3 year shooting percentage vs future 3 year shooting percentage.

One mistake that many people make when doing this is to conclude that a lack of correlation, and thus a lack of persistence, means that the statistic is not a repeatable skill and thus is essentially random. The thing is, the method we use to measure persistence can be a major factor in how well we can measure persistence and how well we can measure true randomness. Let’s take two methods for measuring persistence:

  1.  Three year vs three year correlation, or more precisely the correlation between 2007-10 and 2010-13.
  2.  Even vs odd seconds over the course of 6 seasons, or the statistic during every even second vs the statistic during every odd second.

Both methods split the data roughly in half so we are doing a half the data vs half the data comparison and I am going to do this for offensive statistics for forwards with at least 1000 minutes of 5v5 ice time in each half. I am using 6 years of data so we get large sample sizes for shooting percentage calculations. Here are the correlations we get.

Comparison      0710 vs 1013   Even vs Odd   Difference
GF20 vs GF20    0.61           0.89          0.28
FF20 vs FF20    0.62           0.97          0.35
FSh% vs FSh%    0.51           0.73          0.22

GF20 is Goals for per 20 minutes of ice time. FF20 is fenwick for (shots + missed shots) per 20 minutes of ice time. FSh% is Fenwick Shooting Percentage or goals/fenwick.
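To make the even vs odd method concrete, here is a minimal sketch of how such a split-half correlation could be computed. The data layout (a list of event timestamps in seconds per player), the helper names, and the simulated players are all assumptions for illustration; this is not the code that produced the table.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def even_odd_rates(event_seconds, total_seconds):
    """Events per 20 minutes, computed separately for even and odd seconds."""
    even = sum(1 for s in event_seconds if s % 2 == 0)
    odd = len(event_seconds) - even
    half_minutes = total_seconds / 2 / 60  # each half covers half the ice time
    return 20 * even / half_minutes, 20 * odd / half_minutes

rng = random.Random(0)
total_seconds = 2000 * 60  # hypothetical 2000 minutes of 5v5 ice time per player
even_rates, odd_rates = [], []
for _ in range(100):
    true_ff20 = rng.gauss(14.0, 1.5)  # made-up spread in shot-generation talent
    n_events = max(0, round(true_ff20 * total_seconds / 1200))
    seconds = rng.sample(range(total_seconds), n_events)
    even, odd = even_odd_rates(seconds, total_seconds)
    even_rates.append(even)
    odd_rates.append(odd)

split_half_r = pearson(even_rates, odd_rates)
print(round(split_half_r, 2))
```

Because both halves come from the same games, linemates, and opponents, anything that lowers this correlation below 1.00 in the simulation is pure sampling noise.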

We can see that the level of persistence we identify is much greater when looking at even vs odd minute correlation than when looking at 3 year vs 3 year correlation. A different test of persistence gives us significantly different results. The reason for this is that there are a lot of other factors that come into play when looking at 3 year vs 3 year correlations than even vs odd correlations. In the even vs odd correlations factors such as quality of team mates, quality of competition, zone starts, coaching tactics, etc. are non-factors because they should be almost exactly the same in the even minutes as the odd minutes. This is not true for the 3 year vs 3 year correlation. The difference between the two methods is roughly the amount of the correlation that can be attributed to those other factors. True randomness, and thus true lack of persistence, is essentially the difference between 1.00 and the even vs odd correlation. This equates to 0.11 for GF20, 0.03 for FF20 and 0.27 for FSh%.
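The split described above can be written out as simple arithmetic on the table values. The sketch below only restates the numbers already given; the labels "persistent", "context" and "random" are names chosen here for illustration.

```python
# (three_year_r, even_odd_r) pairs copied from the table above.
correlations = {
    "GF20": (0.61, 0.89),
    "FF20": (0.62, 0.97),
    "FSh%": (0.51, 0.73),
}

def decompose(three_year_r, even_odd_r):
    """Split 1.00 into persistent, context-driven, and random shares."""
    return {
        "persistent": three_year_r,
        "context": round(even_odd_r - three_year_r, 2),  # QoT, QoC, zone starts, coaching
        "random": round(1.0 - even_odd_r, 2),            # true lack of persistence
    }

breakdown = {stat: decompose(*pair) for stat, pair in correlations.items()}
for stat, parts in breakdown.items():
    print(stat, parts)
```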

Now, let’s look at how well they correlate with a positive outcome, scoring goals. But instead of just looking at that, let’s combine it with persistence by looking at how well they predict ‘other half’ goal scoring.

Comparison      0710 vs 1013   Even vs Odd   Difference
FF20 vs GF20    0.54           0.86          0.33
GF20 vs FF20    0.44           0.86          0.42
FSh% vs GF20    0.48           0.76          0.28
GF20 vs FSh%    0.57           0.77          0.20

As you can see, both FF20 and FSh% are very highly correlated with GF20, but this is far more evident when looking at even vs odd than when looking at 3 year vs 3 year correlations. FF20 is more predictive of ‘other half’ GF20, but not significantly so, and this is likely solely due to the greater randomness of FSh% (due to sample size constraints), since within the same half FSh% is more correlated with GF20 than FF20 is: the correlation between even FF20 and even GF20 is 0.75 while the correlation between even FSh% and even GF20 is 0.90.

What is also interesting to note is that even vs odd provides greater benefit for identifying FF20 value and persistence than for FSh%. What this tells us is that the skills related to FF20 are not as persistent over time as the skills related to FSh%. I have seen this before. I think what this means is that GMs are valuing shooting percentage players more than fenwick players and thus are more likely to maintain a core of shooting percentage players on their team while letting fenwick players walk. Eric T. found that teams reward players for high shooting percentage more than high corsi so this is likely the reason we are seeing this.

Now, let’s take a look at how well FF20 correlates with FSh%.

Comparison      0710 vs 1013   Even vs Odd   Difference
FF20 vs FSh%    0.38           0.66          0.28
FSh% vs FF20    0.22           0.63          0.42

It is interesting to note that fenwick rates are highly correlated with shooting percentages especially when looking at the even vs odd data. What this tells us is that the skills that a player needs to generate a lot of scoring chances are a similar set of skills required to generate high quality scoring chances. Skills like good passing, puck control, quickness can lead to better puck possession and thus more shots but those same skills can also result in scoring at a higher rate on those chances. We know that this isn’t true for all players (see Scott Gomez) but generally speaking players that are good at controlling the puck are good at putting the puck in the net too.

Finally, let’s look at one more set of correlations. When looking at the above correlations for players with >1000 minutes in each ‘half’ of the data, there are a lot of players that have significantly more than 1000 minutes and thus their ‘stats’ are more reliable. In any given year a top line forward will get 1000+ minutes of 5v5 ice time (there were 125 such players in 2011-12) but generally less than 1300 minutes (only 5 players had more than 1300 minutes in 2010-11). So, I took all the players that had more than 1000 even and odd minutes over the course of the past 6 seasons but only those that had fewer than 2600 minutes in total. In essence, I took all the players that have between 1000 and 1300 even and odd minutes over the past 6 seasons. From this group of forwards I calculated the same correlations as above and the results should tell us approximately how reliable (predictive) one season’s worth of data is for a front line forward, assuming they played in exactly the same situation the following season.

Comparison      Even vs Odd
GF20 vs GF20    0.82
FF20 vs FF20    0.93
FSh% vs FSh%    0.63
FF20 vs GF20    0.74
GF20 vs FF20    0.77
FSh% vs GF20    0.65
GF20 vs FSh%    0.66
FF20 vs FSh%    0.45
FSh% vs FF20    0.40

It should be noted that because of the way in which I selected the players to be included in this calculation (limited ice time over the past 6 seasons) there is an abundance of 3rd liners, with a few players that reached retirement (e.g. Sundin) and young players (e.g. Henrique, Landeskog) mixed in. It would have been better to take the first 2600 minutes of each player and do even/odd on that, but I am too lazy to try and calculate that data so the above is the best we have. There is far less diversity in the list of players used than in the NHL in general, so it is likely that for any particular player with between 1000 and 1300 minutes of ice time the correlations are stronger.

So, what does the above tell us? Once you factor out year over year changes in QoT, QoC, zone starts, coaching tactics, etc., GF20, FF20 and FSh% are all pretty highly persistent with just one year’s worth of data for a top line player. I think this is far more persistent, especially for FSh%, than most assume. The challenge is being able to isolate and properly account for changes in QoT, QoC, zone starts, coaching tactics, etc. This, in my opinion, is where the greatest challenge in hockey analytics lies. We need better methods for isolating individual contribution and adjusting for QoT, QoC, usage, etc. Whether that comes from better statistics or better analytical techniques or some combination of the two only time will tell, but in theory at least there should be a lot more reliable information within a single year’s worth of data than we are currently able to make use of.

 

Jun 11 2013
 

Nathan Horton has been one of the stars of these NHL playoffs and will be an integral component of the Stanley Cup finals if the Bruins are going to beat the Chicago Blackhawks. Nathan Horton is also set to become an unrestricted free agent this summer, so his strong playoff performance is well timed. One of the things I have noticed about Horton while looking through the statistics is that he has one of the highest on-ice 5v5 shooting percentages over the past 6 seasons of any NHL forward (ranks 16th among forwards with >300 minutes of ice time).

Part of the reason for this is that he is a fairly good shooter himself (ranks 30th with a 5v5 shooting percentage of 12.25%) but this is by no means the main reason. Let’s take a look at what Horton’s line mates’ shooting percentages have been over the past 6 seasons when playing with Horton and when not playing with Horton.

Linemate   Sh% w/o Horton   Sh% w/ Horton   Difference
Weiss      11.28%           12.84%          1.56%
Lucic      13.03%           16.98%          3.95%
Krejci     11.41%           12.10%          0.68%
Booth      8.44%            11.26%          2.82%
Frolik     6.58%            10.84%          4.26%
Stillman   10.03%           15.38%          5.35%
Zednik     8.81%            13.56%          4.75%
Average    9.94%            13.28%          3.34%

Included are all forwards Horton has played at least 400 minutes of 5v5 ice time with over the past 6 seasons, along with their individual shooting percentage when with Horton and when not with Horton. Every single one of them has a higher individual shooting percentage with Horton than without him and, generally speaking, significantly higher. I have previously looked at how much players can influence their line mates’ shooting percentages and found that Horton was among the league leaders, so the above table agrees with that assessment.

It is still possible that Horton is just really lucky, but that argument starts to lose steam when it seems he is getting lucky each and every year over the past 6 years (he has never had a 5v5 on-ice shooting percentage at or below league average). Whatever Horton is doing while on the ice seems to be allowing his line mates to boost their own individual shooting percentages, and the result of this is that he has the 9th highest on-ice goals for rate over the past 6 seasons. He is a massively underrated player and is this summer’s Alexander Semin of the UFA market.
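As a quick check on the with/without comparison, the differences and averages can be recomputed from the rounded percentages in the table. This is a sketch that uses only the table values; the variable names are chosen here for illustration.

```python
# (Sh% without Horton, Sh% with Horton) for each linemate, from the table above.
linemates = {
    "Weiss": (11.28, 12.84),
    "Lucic": (13.03, 16.98),
    "Krejci": (11.41, 12.10),
    "Booth": (8.44, 11.26),
    "Frolik": (6.58, 10.84),
    "Stillman": (10.03, 15.38),
    "Zednik": (8.81, 13.56),
}

# Percentage-point change in each linemate's individual Sh% when with Horton.
diffs = {name: with_h - without_h for name, (without_h, with_h) in linemates.items()}

avg_without = sum(without_h for without_h, _ in linemates.values()) / len(linemates)
avg_with = sum(with_h for _, with_h in linemates.values()) / len(linemates)

for name, diff in diffs.items():
    print(f"{name}: {diff:+.2f}")
print(f"average: {avg_without:.2f} -> {avg_with:.2f}")
```

Recomputing from the rounded values gives 0.69 for Krejci rather than the 0.68 in the table, presumably because the table was built from unrounded percentages.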

 

Apr 23 2013
 

With the win over the Ottawa Senators on Saturday night the Leafs have made the playoffs for the first time since the 2003-04 season and they are doing it largely on the back of an elevated shooting percentage which currently sits at a lofty 10.52% (5v5 only). Here are all the teams with a 5v5 shooting percentage above 9.00% since the 2007-08 season and how they have done in the playoffs.

Season    Team          5v5 Sh%   Playoff Result
2012-13   Maple Leafs   10.52     Made playoffs
2012-13   Stars         10.04     Fighting for playoff spot (10th)
2011-12   Lightning     9.73      Missed Playoffs
2009-10   Capitals      10.39     Lost in first round
2009-10   Canucks       9.14      Lost in second round
2008-09   Penguins      9.76      Won Stanley Cup
2008-09   Canucks       9.23      Lost in second round
2008-09   Bruins        9.15      Lost in second round
2008-09   Thrashers     9.02      Missed Playoffs
2007-08   Senators      9.03      Lost in first round

Prior to this season there have been 8 teams with a shooting percentage above 9.00%: 2 missed the playoffs, 2 lost in the first round, 3 lost in the second round and one won the Stanley Cup. That isn’t very much success at all, which is not a good sign for Leaf fans (myself included) hoping their team can go on a playoff run.

 

Apr 12 2013
 

The Toronto Maple Leafs’ shooting percentage has been predicted to fall for a couple of months now but it has held steady. I know that about 5-6 weeks ago the Leafs’ 5v5 shooting percentage was at 10.4% and I predicted it was sure to fall, but as of this morning their 5v5 shooting percentage is even higher at 10.59%. Here is a graph of their 5v5 shooting percentage throughout the season.

[Chart: Toronto Maple Leafs 2012-13 Shooting % (shots across x-axis)]

League average 5v5 shooting percentage is normally just shy of 8% and the Leafs are about 33% higher than that, which is incredibly high. Over the previous 5 seasons only one team has maintained a 5v5 shooting percentage above 10% over the course of an 82 game season, and that was the Washington Capitals in 2009-10 when they shot at a 10.39% clip. Only a handful of teams have managed to post a 5v5 shooting percentage above 9%. What the Leafs are doing is quite extraordinary even if it is a shortened season. Only 13.4% of the running 50 shot shooting percentage data points in the above graph fall below the typical league average of 8%, so about 86.6% of the time they are at or above average in shooting percentage.

The only other team with a 5v5 shooting percentage above 10% this season is the Tampa Bay Lightning, but they have been falling back a bit lately and are in danger of falling below the 10% line as they currently sit at 10.01%.
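For readers who want to reproduce this kind of running 50 shot figure, here is a minimal sketch. The shot sequence is made up (a perfectly regular one goal every ten shots), and the function names are assumptions rather than anything used to build the graph.

```python
from collections import deque

def running_sh_pct(shot_results, window=50):
    """Shooting percentage (in percent) over each trailing `window` shots."""
    recent = deque(maxlen=window)
    goals = 0
    values = []
    for is_goal in shot_results:
        if len(recent) == window:
            goals -= recent[0]  # the oldest shot is about to drop out of the window
        recent.append(int(is_goal))
        goals += int(is_goal)
        if len(recent) == window:
            values.append(100.0 * goals / window)
    return values

def share_below(values, threshold=8.0):
    """Fraction of running data points that fall below `threshold` percent."""
    return sum(v < threshold for v in values) / len(values)

# Hypothetical shot log: 1 = goal, 0 = no goal, in the order the shots were taken.
shots = [1 if i % 10 == 0 else 0 for i in range(500)]
running = running_sh_pct(shots)
print(len(running), share_below(running))
```

With a real shot log in place of the made-up one, `share_below(running, 8.0)` is the quantity quoted as 13.4% above.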

Barring a collapse the Leafs should almost certainly end the season with a shooting percentage above 10% but it is difficult to know how much of it is luck/circumstance/randomness and how much is truly skill.

 

Feb 27 2013
 

The last several days I have been playing around a fair bit with team data and analyzing various metrics for their usefulness in predicting future outcomes and I have come across some interesting observations. Specifically, with more years of data, fenwick becomes significantly less important/valuable while goals and the percentages become more important/valuable. Let me explain.

Let’s first look at the year over year correlations in the various stats themselves.

Stat     Y1 vs Y2   Y12 vs Y34   Y123 vs Y45
FF%      0.3334     0.2447       0.1937
FF60     0.2414     0.1635       0.0976
FA60     0.3714     0.2743       0.3224
GF%      0.1891     0.2494       0.3514
GF60     0.0409     0.1468       0.1854
GA60     0.1953     0.3669       0.4476
Sh%      0.0002     0.0117       0.0047
Sv%      0.1278     0.2954       0.3350
PDO      0.0551     0.0564       0.1127
RegPts   0.2664     0.3890       0.3744

The above table shows the r^2 between past events and future events. The Y1 vs Y2 column is the r^2 between subsequent years (i.e. 0708 vs 0809, 0809 vs 0910, 0910 vs 1011, 1011 vs 1112). The Y12 vs Y34 column is a 2 year vs 2 year r^2 (i.e. 07-09 vs 09-11 and 08-10 vs 10-12) and the Y123 vs Y45 column is the 3 year vs 2 year comparison (i.e. 07-10 vs 10-12). RegPts is points earned during regulation play (using a win-loss-tie point system).

As you can see, with increased sample size, the fenwick stats’ ability to predict future fenwick stats diminishes, particularly for fenwick for and fenwick %. All the other stats generally get better with increased sample size, except for shooting percentage which has no predictive power of future shooting percentage.
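A Y1 vs Y2 style r^2 can be sketched as follows: pool every consecutive pair of seasons for every team, then square the Pearson correlation between the earlier and later values. The team values below are invented for illustration and the function names are assumptions.

```python
import math

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return (cov / math.sqrt(var_x * var_y)) ** 2

def consecutive_pairs(team_seasons):
    """Pool (earlier season, following season) values across all teams."""
    past, future = [], []
    for values in team_seasons.values():
        for earlier, later in zip(values, values[1:]):
            past.append(earlier)
            future.append(later)
    return past, future

# Hypothetical FF% by season for three made-up teams.
ff_pct = {
    "Team A": [52.1, 51.4, 50.2, 49.8, 50.9],
    "Team B": [47.5, 48.9, 49.3, 50.1, 48.2],
    "Team C": [50.0, 52.3, 53.1, 51.7, 50.4],
}
past, future = consecutive_pairs(ff_pct)
print(round(r_squared(past, future), 4))
```

The multi-year columns would follow the same pattern, with each value replaced by the stat computed over the pooled 2 or 3 seasons.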

The increased predictive nature of the goal and percentage stats with increased sample size makes perfect sense as the increased sample size will decrease the random variability of these stats but I have no definitive explanation as to why the fenwick stats can’t maintain their predictive ability with increased sample sizes.

Let’s take a look at how well each statistic correlates with regulation points using various sample sizes.

Stat   1 year   2 year   3 year   4 year   5 year
FF%    0.3030   0.4360   0.5383   0.5541   0.5461
GF%    0.7022   0.7919   0.8354   0.8525   0.8685
Sh%    0.0672   0.0662   0.0477   0.0435   0.0529
Sv%    0.2179   0.2482   0.2515   0.2958   0.3221
PDO    0.2956   0.2913   0.2948   0.3393   0.3937
GF60   0.2505   0.3411   0.3404   0.3302   0.3226
GA60   0.4575   0.5831   0.6418   0.6721   0.6794
FF60   0.1954   0.3058   0.3655   0.4026   0.3951
FA60   0.1788   0.2638   0.3531   0.3480   0.3357

Again, the values are r^2 with regulation points.  Nothing too surprising there except maybe that team shooting percentage is so poorly correlated with winning because at the individual level it is clear that shooting percentages are highly correlated with goal scoring. It seems apparent from the table above that team save percentage is a significant factor in winning (or as my fellow Leaf fans can attest to, lack of save percentage is a significant factor in losing).

The final table I want to look at is how well a few of the stats are at predicting future regulation time point totals.

Stat     Y1 vs Y2   Y12 vs Y34   Y123 vs Y45
FF%      0.2500     0.2257       0.1622
GF%      0.2214     0.3187       0.3429
PDO      0.0256     0.0534       0.1212
RegPts   0.2664     0.3890       0.3744

The values are r^2 with future regulation point totals. Regardless of time frame used, past regulation time point totals are the best predictor of future regulation time point totals. Single season FF% is slightly better than single season GF% at predicting following season regulation point totals, but with 2 or more years of data GF% becomes a significantly better predictor as the predictive ability of GF% improves and that of FF% declines. This makes sense, as we earlier observed that increasing sample size improves the ability of GF% to predict future GF% while FF% gets worse, and that GF% is more highly correlated with regulation point totals than FF%.

One thing that is clear from the above tables is that defense has been far more important to winning than offense. Regardless of whether we look at GF60, FF60, or Sh%, their level of importance trails their defensive counterpart (GA60, FA60 and Sv%), usually significantly. The defensive stats more highly correlate with winning and are more consistent from year to year. Defense and goaltending win in the NHL.

What is interesting though is that this largely differs from what we see at the individual level. At the individual level there is much more variation in the offensive stats, indicating individual players have more control over the offensive side of the game. This might suggest that team philosophies drive the defensive side of the game (i.e. how defensive minded the team is, the playing style, etc.) but the offensive side of the game is dominated more by the offensive skill level of the individual players. At the very least it is something worthy of further investigation.

The last takeaway from this analysis is the declining predictive value of fenwick/corsi with increased sample size. I am not quite sure what to make of this. If anyone has any theories I’d be interested in hearing them. One theory I have is that fenwick rates are not a part of the average GM’s player personnel decisions and thus over time, as players come and go, fenwick rates will begin to vary. If this is the case, then this may represent an area of value that a GM could exploit.

 

Jan 30 2013
 

For those familiar with my history, I have been a big proponent that there is more to the game of hockey than corsi and that players can certainly drive on-ice shooting percentage. I have not done much work at the team level, but now that I have team stats up at stats.hockeyanalysis.com I figured I’d take a look.

Since shooting percentages can vary significantly over small sample sizes, my goal was to use the largest sample size possible. As such, I used 5 years of team data (2007-08 through 2011-12) and looked at each team’s shooting and save percentages over that time. During those 5 years Vancouver led all teams in 5v5 ZS adjusted shooting percentage at 10.69% while Columbus trailed all teams with an 8.61% shooting percentage. What’s interesting to note is that the top 6 teams are Vancouver, Washington, Chicago, Philadelphia, Boston and Pittsburgh, all teams we would consider to have the best offensive talent in the league. Meanwhile, the bottom 5 teams are Columbus, Los Angeles, Phoenix, Carolina, and Minnesota, all teams (except maybe Carolina) more associated with defensive play and a defense-first system.

As far as save percentage goes, Phoenix led the league with a 91.83% save percentage while the NY Islanders trailed with an 89.04% save percentage. The top 5 teams were Phoenix, Boston, Anaheim, Nashville, and Montreal. The bottom 5 teams were NY Islanders, Tampa, Toronto, Chicago and Ottawa. No surprises there.

As far as sample size goes, teams on average had 7,627 shots for (or against) over the course of the 5 years, which gives us a reasonably large sample size to work with.

Now, in order to not use an extreme situation, I decided to compare the 5th best team to the 5th worst team in each category and then determine the probability that their deviations from each other are solely due to randomness.  This meant I was comparing Boston to Minnesota for shooting percentage and Montreal to Ottawa for save percentage.

[Chart: TeamShootingPercentageComp]

As you can see, there isn’t a lot of overlap, meaning there isn’t a large probability that luck is the reason for the difference between these two teams’ 5 year shooting percentages. In fact, the intersecting area under the two curves amounts to just a 6.2% chance that the differences are luck driven. That’s pretty small, and the differences between the teams above Boston and below Minnesota would be greater. I think we can be fairly certain that there are statistically significant differences between teams’ 5 year shooting percentages, and considering how much player movement and coaching change there is over the span of 5 years it makes it that much more impressive. Single season differences could in theory be (and probably are) more significant.

[Chart: TeamSavePercentageComp]

The save percentage chart provides even stronger evidence that there are non-luck factors at play. The intersecting area under the curves equates to a 2.15% chance that the differences are due to luck alone. There is easily a statistically significant difference between Ottawa’s and Montreal’s 5 year save percentages. Long-term team save percentages are not luck driven!

So, the next question is, how much does it matter? Well, the average team takes approximately 1500 5v5 ZS adjusted shots each season. The difference in shooting percentage between the 5th best team and the 5th worst team is 1.27%, so that would equate to a difference of 19 goals per year during 5v5 ZS adjusted situations. The difference between the 5th best and 5th worst team in save percentage is 1.5%, which equates to a 22.5 goal difference. These are not insignificant goal totals and they are likely driven solely by the percentages.
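One way to compute an "intersecting area under the two curves" is to approximate each team's shooting percentage with a normal distribution and measure the overlap of the two densities. The sketch below does this with Python's `statistics.NormalDist`. The post does not say exactly how the curves were built, and the two shooting percentages used here are hypothetical values chosen only to be 1.27 points apart, so this is an illustration of the idea and not a reproduction of the 6.2% figure.

```python
from math import sqrt
from statistics import NormalDist

def sh_pct_distribution(sh_pct, shots):
    """Normal approximation to the sampling distribution of a shooting percentage."""
    p = sh_pct / 100.0
    return NormalDist(mu=p, sigma=sqrt(p * (1.0 - p) / shots))

# Hypothetical inputs: two teams 1.27 points apart, each with the
# average 7,627 shots mentioned above. The actual team figures are not given.
team_a = sh_pct_distribution(10.20, 7627)
team_b = sh_pct_distribution(8.93, 7627)

# Overlapping coefficient: area shared by the two probability density curves.
overlap = team_a.overlap(team_b)
print(f"overlap = {overlap:.3f}")
```

A small overlap means the two observed percentages are unlikely to come from teams with the same underlying shooting talent.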
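The goal totals in this paragraph are just shots multiplied by the percentage gap, which a short sketch makes explicit. The function name is chosen here for illustration; the inputs are the figures quoted above.

```python
SHOTS_PER_SEASON = 1500  # approximate 5v5 ZS adjusted shots per team, per the text

def goal_swing(pct_point_gap, shots=SHOTS_PER_SEASON):
    """Goals per season implied by a gap in shooting or save percentage."""
    return shots * pct_point_gap / 100.0

shooting_swing = goal_swing(1.27)  # 5th best vs 5th worst shooting percentage
save_swing = goal_swing(1.5)       # 5th best vs 5th worst save percentage
print(round(shooting_swing, 2), round(save_swing, 2))
```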

Now, how does this equate to differences in shot rates? If we take the team with the 5th highest shot rate and apply a league average shooting percentage and then compare it to the team with the 5th lowest shot rate we would find a difference of 17.5 goals over the course of a single season. This is slightly lower than what we saw for shooting and save percentages.

What is interesting is that this (the percentages being more important than the shot rates) is not inconsistent with what we have seen at the individual level. In Tom Awad’s “What makes Good Players Good, Part I” post he identified 3 skills in which good players differed from bad players. He identified the variation in +/- as being 0.42 for finishing (shooting percentage), 0.08 for shot quality (shot location) and 0.30 for out shooting, which would equate to out shooting being just 37.5% of the overall difference. I also showed that fenwick shooting percentage is more important than fenwick rates by a fairly significant margin.
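The 37.5% figure follows directly from the three components quoted from Awad's post. As a small worked example (the dictionary keys are labels chosen here):

```python
# Variation in +/- attributed to each skill, as quoted above.
components = {"finishing": 0.42, "shot quality": 0.08, "out shooting": 0.30}

total = sum(components.values())
shares = {skill: 100.0 * value / total for skill, value in components.items()}

for skill, share in shares.items():
    print(f"{skill}: {share:.1f}%")
```

The two percentage-driven components, finishing and shot quality, make up the remaining 62.5%.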

Any player or team evaluation that doesn’t take into account the percentages or assumes the percentages are all luck driven is an evaluation that is not telling you the complete story.

 

Sep 03 2012
 

A month and a half ago Eric T at NHLNumbers.com had a good post on quantifying the impact on teammate shooting percentage.  I wanted to take a second look at the relative importance the impact on teammate shooting percentage can have because I disagreed somewhat with Eric’s conclusions.

For a very small number of elite playmakers, the ability to drive shooting percentage can be a major component of their value. For the vast majority of the league, driving possession is a more significant and more reproducible path to success.

It is my belief that it is important to consider impact on shooting percentage for more than a “very small number of elite playmakers” and I’ll attempt to show that now.

The method that Eric used to identify a player’s impact on shooting percentage is to compare that player’s line mates’ shooting percentages with him to their overall shooting percentage. As noted in the comments, the one flaw with this is that their overall shooting percentage is impacted by the player we are trying to evaluate, which will end up underestimating the impact. In the comments Eric re-did the analysis using a true “without you” shooting percentage and the impact of driving teammate shooting percentages was greater than initially estimated, but he concluded that the conclusions didn’t change significantly.

Overall average for the top ten is a 1.2% boost (up from 0.9% in story) and 5 goals per year (up from 4.5). I don’t think this changes the conclusions appreciably.

In the minutes that a player is on the ice with one of the very best playmakers in the league, his shooting percentage will be about 1% better. For a player who gets ~150-200 shots per year and plays ~40-60% of his ice time with that top-tier playmaker, that’s less than a one-goal boost. It’s just not that big of a factor.

He also suggested that using the “without you” shooting percentage instead of the “overall shooting percentage” would probably result in “more accurate but less precise” analysis.  This is because a guy like Daniel Sedin would get very few shots when playing apart from Henrik Sedin because they rarely play apart and this small “apart” sample size might be subject to significant small sample size errors.


Apr 19 2012
 

Prior to the season Gabe Desjardins and I had a conversation over at MC79hockey.com where I predicted several players would combine for a 5v5 on-ice shooting percentage above 10.0% while league average is just shy of 8.0%.  I documented this in a post prior to the season.  In short, I predicted the following:

  • Crosby, Gaborik, Ryan, St. Louis, H. Sedin, Toews, Heatley, Tanguay, Datsyuk, and Nathan Horton will have a combined on-ice shooting percentage above 10.0%
  • Only two of those 10 players will have an on-ice shooting percentage below 9.5%

So, how did my prediction fare? The following table tells all.

Player             GF    SF     SH%
SIDNEY CROSBY      31    198    15.66%
MARTIN ST. LOUIS   74    601    12.31%
ALEX TANGUAY       43    371    11.59%
MARIAN GABORIK     57    582    9.79%
JONATHAN TOEWS     51    525    9.71%
NATHAN HORTON      34    359    9.47%
HENRIK SEDIN       62    655    9.47%
BOBBY RYAN         52    552    9.42%
PAVEL DATSYUK      50    573    8.73%
DANY HEATLEY       42    611    6.87%
Totals             496   5027   9.87%

Well, technically neither of my predictions came true.  Only 5 players had on-ice shooting percentages above 9.5% and as a group they did not maintain a shooting percentage above 10.0%.  That said, my prediction wasn’t all that far off.  8 of the 10 players had an on-ice shooting percentage above 9.42% and as a group they had an on-ice shooting percentage of 9.87%.  If Crosby was healthy for most of the season or the Minnesota Wild didn’t suck so bad the group would have reached the 10.0% mark.  So, when all is said and done, while technically my predictions didn’t come perfectly true, the intent of the prediction did.  Shooting percentage is a talent, is maintainable, and can be used as a predictor of future performance.
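The group figure and the count of players above 9.5% can be checked directly from the GF and SF columns. This sketch uses only the table values; the variable names are chosen here for illustration.

```python
# (on-ice goals for, on-ice shots for) for each player, from the table above.
on_ice = {
    "Crosby": (31, 198),
    "St. Louis": (74, 601),
    "Tanguay": (43, 371),
    "Gaborik": (57, 582),
    "Toews": (51, 525),
    "Horton": (34, 359),
    "H. Sedin": (62, 655),
    "Ryan": (52, 552),
    "Datsyuk": (50, 573),
    "Heatley": (42, 611),
}

total_gf = sum(gf for gf, _ in on_ice.values())
total_sf = sum(sf for _, sf in on_ice.values())
group_sh_pct = 100.0 * total_gf / total_sf  # pooled, so weighted by shots

above_9_5 = [name for name, (gf, sf) in on_ice.items() if 100.0 * gf / sf > 9.5]

print(total_gf, total_sf, round(group_sh_pct, 2))
print(len(above_9_5), above_9_5)
```

Note that the pooled figure weights each player by shots, which is why Crosby's 15.66% on only 198 shots moves the group total so little.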

I now have 5 years of on-ice data on stats.hockeyanalysis.com so I thought I would take a look at how sustainable shooting percentage is using that data.  To do this I took all forwards with 350 minutes of 5v5 zone start adjusted ice time in each of the past 5 years and took the first 3 years of the data (2007-08 through 2009-10) to predict the final 2 years of data (2010-11 and 2011-12).  This means we used at least 1050 minutes of data over 3 seasons to predict at least 700 minutes of data over 2 seasons.  The following chart shows the results for on-ice shooting percentage.

Clearly there is some persistence in on-ice shooting percentage.  How does this compare to something like fenwick for rates (using FF20 – Fenwick For per 20 minutes).

Ok, so FF20 seems to be more persistent, but that doesn’t take away from the fact that shooting percentage is persistent and a reasonable predictor of future shooting percentage.  (FYI, the guy out on his own in the upper left is Kyle Wellwood)

The real question is, are either of them any good at predicting future goal scoring rates (GF20 – goals for per 20 minutes) because really, goals are ultimately what matters in hockey.

Ok, so both on-ice shooting percentage and on-ice fenwick for rates are somewhat reasonable predictors of future on-ice goal for rates with a slight advantage to on-ice shooting percentage (sorry, just had to point that out).  This is not inconsistent with what I  found a year ago when I used 4 years of data to calculate 2 year vs 2 year correlations.

Of course, I would never suggest we use shooting percentage as a player evaluation tool, just as I don’t suggest we use fenwick as a player evaluation tool.  Both are sustainable, both can be used as predictors of future success, and both are true player skills, but the best predictor of future goal scoring is past goal scoring, as evidenced by the following chart.

That is pretty clear evidence that goal rates are the best predictor of future goal rates and thus, in my opinion anyway, the best player evaluation tool. Yes, there are still sample size issues with using goal rates for less than a full season’s worth of data, but for all those players for whom we have multiple seasons’ worth of data (or at least one full season with >~750 minutes of ice time), using anything other than goals as your player evaluation tool will potentially lead to less reliable and less accurate player evaluations.

As for the defensive side of the game, I have not found a single reasonably good predictor of future goals against rates, regardless of whether I look at corsi, fenwick, goals, shooting percentage or anything else.  This isn’t to suggest that players can’t influence defense, because I believe they can, but rather that there are too many other factors that I haven’t figured out how to isolate and remove from the equation.  Most important is the goalie and I feel the most difficult question to answer in hockey statistics is how to separate the goalie from the defenders. Plus, I believe there are far fewer players that truly focus on defense and thus goals against is largely driven by the opposition.

Note:  I won’t make any promises but my intention is to make this my last post on the subject of sustainability of on-ice shooting percentage and the benefit of using a goal based player analysis over a corsi/fenwick based analysis.  For all those who still fail to realize goals matter more than shots or shot attempts there is nothing more I can say.  All the evidence is above or in numerous other posts here at hockeyanalysis.com.  On-ice shooting percentage is a true player talent that is both sustainable and a viable predictor of future performance at least on par with fenwick rates.  If you choose to ignore reality from this point forward, it is at your own peril.

 

Feb 22 2012
 

Looking at this chart, I think only Lightning fans can sympathize with the torture that Leaf fans have suffered through with regards to their goaltending, but at least the Lightning have made the playoffs a few times and even had some success.

Update: For interest’s sake, here are the post lockout shooting percentages and PDO (shooting percentage + save percentage).