May 11, 2014
 

I often feel that I am the sole defender of goal based hockey analytics in a world dominated by shot attempt (corsi) based analytics. In recent weeks I have often heard the pro-corsi crowd cite example after example of where corsi-based analytics “got it right” or “predicted something fairly well”. While it is always good to be able to cite examples where you got things right, a fair and honest evaluation looks at the complete picture, not just the good outcomes. Otherwise it is analytics by anecdote, which is an oxymoron if there ever was one.

For example, Kent Wilson of FlamesNation.ca recently wrote about the “Dawning of the Age of Fancy Stats”, in which he cited several instances of where hockey analytics got it right or did well in predicting outcomes.

The big test case which seems to have moved the needle in favour of the nerds is, of course, the Toronto Maple Leafs. Toronto came into the season with inflated expectations after an outburst of percentages during the lock-out shortened year saw them break into the post-season. Their awful underlying numbers caused the stats oriented amongst us to be far more circumspect about their chances, of course.

Toronto is the recent example that the hockey analytics crowd likes to bring up in support of their case, but it is just one example. We don’t hear much about how many predicted the Ottawa Senators would make the playoffs, with some even having them challenging for the top spot in the Eastern Conference. We don’t hear much about how the New Jersey Devils missed the playoffs yet again despite having the 5th best 5v5close Fenwick% in the league, the year after missing the playoffs with the 3rd best 5v5close Fenwick% in the league. If we are truly interested in hockey analytics we need a complete and unbiased assessment of all outcomes, not just the ones that support our underlying belief.

In the same article Kent Wilson quoted a tweet from Dimitri Filipovic about the success of Corsi in predicting outcomes of playoff series.

Relevant #fact: since ’08 playoffs, teams that were 5+ % better than their opponent in 5v5 fenwick close during the regular season are 25-7.

While interesting, it really doesn’t tell us a whole lot more than “when one team is significantly better at outshooting their opponents they more often than not win”. Well, that really isn’t saying a whole lot. It is more or less saying that when a dominant team plays a mediocre team, the dominant team usually wins. Not really that interesting when you think of it that way.

Here is another fact that puts that into perspective. Since the 2008 playoffs, the team with the better 5v5close Fenwick% has a 53-35-2 record in playoff series (there were 2 cases where teams had identical fenwick% to 1 decimal place). That actually makes it sound like 5v5close Fenwick% is predictive overall, not just in cases where one team is significantly better than another. Of course, if we look at goals we find that the team with the better 5v5close goal% has a 54-35-1 record. In other words, 5v5close possession stats did no better at predicting playoff outcomes than 5v5close goal stats. It is easy to throw out stats that support a point of view, but it is far more important to look at the complete picture. That is what analytics is about.
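For anyone who wants to run this kind of check themselves, here is a minimal sketch of the tally; the series rows below are hypothetical placeholders rather than the actual data behind the records above:

```python
import pandas as pd

# Hypothetical playoff-series rows: each team's regular-season 5v5close
# Fenwick% and Goal% plus the series winner. Column names and values are
# placeholders, not the real data set.
series = pd.DataFrame({
    "team_a": ["A", "C"], "team_b": ["B", "D"],
    "ff_a": [56.1, 49.8], "ff_b": [47.9, 49.8],
    "gf_a": [55.0, 52.4], "gf_b": [48.8, 50.1],
    "winner": ["A", "D"],
})

def record(metric_a, metric_b):
    """Win-loss-tie record of the team with the higher regular-season metric."""
    tied = series[metric_a] == series[metric_b]
    higher = series["team_a"].where(series[metric_a] > series[metric_b],
                                    series["team_b"])
    wins = ((higher == series["winner"]) & ~tied).sum()
    losses = ((higher != series["winner"]) & ~tied).sum()
    return int(wins), int(losses), int(tied.sum())

print("Better FF% team:", record("ff_a", "ff_b"))
print("Better GF% team:", record("gf_a", "gf_b"))
```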

A similar statistic was promoted by Michael Parkatti in a recent talk on hockey analytics at the University of Alberta. In that talk Parkatti stated that of the last 15 Stanley Cup winners all but 3 had a “ShotShare” (all situations) of at least 53%. The exceptions were Pittsburgh in 2009, Boston in 2011 and Carolina in 2006. I will note that it appears all three of these teams were below 51%, and the 2009 Penguins were below 50%. That seems sort of impressive, but I did some digging myself and found that every Stanley Cup winner since 1980 had a “GoalShare” (all situations) greater than 52%. Every single one. No exceptions. I didn’t look at any cup winners pre-1980 but the trend may very well go back a lot further. As impressive as 12 of 15 is, 34 of 34 is far more impressive.

Here is the thing. We know that goal percentage correlates with winning far better than corsi percentage does. This is an indisputable fact. It is actually quite a bit better. The sole reason we use corsi is that goals are infrequent events and thus not necessarily indicative of true talent due to small sample size issues. This is a fair argument and one that I accept. In situations where you have small sample sizes, definitely use corsi as your predictive metric (but understand its limitations). The question that needs to be answered is what constitutes a small sample size and, more importantly, what sample size we need before goals become as good or better a predictor of future events than corsi. I have pegged this crossing point at about one season’s worth of data, maybe a bit more if looking at individual players who may not be getting 20 minutes of ice time a game (my guess is that somewhere above 750 minutes of ice time is where I’d start to get more comfortable using goal data over corsi data). I am certain not everyone agrees, but I haven’t seen a lot of analyses attempting to find this “crossing point”.
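One way to go looking for that crossing point is a split-half test at increasing ice-time cutoffs. Here is a minimal sketch with synthetic data and made-up column names, just to show the structure of the test:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 400
# Synthetic player-level data: minutes played and per-20-minute rates from two
# halves of a sample (e.g. even vs odd games). Purely illustrative values.
df = pd.DataFrame({
    "minutes": rng.uniform(200, 1600, n),
    "gf20_half1": rng.normal(0.85, 0.25, n),
    "cf20_half1": rng.normal(18.0, 2.0, n),
    "gf20_half2": rng.normal(0.85, 0.25, n),
})

def crossing_table(df, cutoffs=(250, 500, 750, 1000, 1250)):
    """For each minimum-minutes cutoff, compare how well past GF20 and past
    CF20 correlate with 'other half' GF20; the crossing point is roughly
    where the goal-based correlation catches up to the corsi-based one."""
    rows = []
    for cutoff in cutoffs:
        s = df[df["minutes"] >= cutoff]
        rows.append({"min_minutes": cutoff,
                     "n_players": len(s),
                     "r_gf20": s["gf20_half1"].corr(s["gf20_half2"]),
                     "r_cf20": s["cf20_half1"].corr(s["gf20_half2"])})
    return pd.DataFrame(rows)

print(crossing_table(df))
```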

Let’s take another look at how well 5v5close Fenwick% and Goal% predict playoff outcomes, but this time by season rather than overall.

Season FF% GF%
2008 7-7-1 6-9
2009 9-6 11-4
2010 9-6 11-4
2011 10-5 11-4
2012 7-7-1 7-7-1
2013 11-4 8-7
Total 53-35-2 54-35-1

In full seasons not affected by lockouts we find that GF% was generally the better predictor (only in 2008 did GF% underperform FF%), but in last year’s lockout-shortened season FF% significantly outperformed GF%. Was this a coincidence, or is it evidence that 48 games is not a large enough sample size to rely on GF% more than CF%, but 82 games probably is?

I have seen numerous other examples in recent weeks where “analytics” supporters have used what amounts to not much more than anecdotal evidence to support their claims. This is not analytics. Analytics is a fair, unbiased and complete fact-based assessment of reality. Showing that a technique is a good predictor some of the time is not enough; you need to show that it is a better predictor overall, or at least define when it is and when it isn’t.

I recently wrote an article on whether last year’s statistics predicted this year’s playoff teams and found that GF% seemed to do at least as well as CF%, despite last season being a lockout-shortened year.

With all that said, you will frequently find me using “possession” statistics, so I certainly don’t think they are useless. It is just my opinion that puck possession is one aspect of the game and that puck possession analytics has largely been oversold as a predictor. Conversely, goal based analytics has largely been given a bad rap, which I find a little unfortunate.

(Another article worth reading is Matt Rudnitsky’s MONEYPUCK: Why Most People Need To Shut Up About ‘Advanced Stats’ In The NHL.)

 

Aug 02, 2013
 

In Rob Vollman’s Hockey Abstract book he talks about persistence and its importance when it comes to a particular statistic having value in hockey analytics.

For something to qualify as the key to winning, two things are required: (1) a close statistical correlation with winning percentage and (2) statistical persistence from one season to another.

More generally, persistence is a prerequisite for being able to call something a talent or a skill, and how closely it correlates with winning or some other positive outcome (such as scoring goals) tells us how much value that skill has.

Let’s look at persistence first. The easiest way to measure persistence is to look at the correlation of a statistic over some chunk of time vs. some future chunk of time. For example, how well does a stat from last season correlate with the same stat this season (i.e. year over year correlation)? For some statistics such as shooting percentages it may even be necessary to go with larger sample sizes, such as 3 year shooting percentage vs. future 3 year shooting percentage.

One mistake that many people make when doing this is to conclude that a lack of correlation, and thus a lack of persistence, means that the statistic is not a repeatable skill and is thus, essentially, random. The thing is, the method we use to measure persistence can be a major factor in how much persistence, and how much true randomness, we are able to detect. Let’s take two methods for measuring persistence:

  1.  Three year vs three year correlation, or more precisely the correlation between 2007-10 and 2010-13.
  2.  Even vs odd seconds over the course of 6 seasons, or the statistic during every even second vs the statistic during every odd second.

Both methods split the data roughly in half, so we are doing a half-versus-half comparison, and I am going to do this for offensive statistics for forwards with at least 1000 minutes of 5v5 ice time in each half. I am using 6 years of data so we get large sample sizes for the shooting percentage calculations. Here are the correlations we get.

Comparison 0710 vs 1013 Even vs Odd Difference
GF20 vs GF20 0.61 0.89 0.28
FF20 vs FF20 0.62 0.97 0.35
FSh% vs FSh% 0.51 0.73 0.22

GF20 is Goals for per 20 minutes of ice time. FF20 is fenwick for (shots + missed shots) per 20 minutes of ice time. FSh% is Fenwick Shooting Percentage or goals/fenwick.

We can see that the level of persistence we identify is much greater when looking at the even vs odd second correlation than when looking at the 3 year vs 3 year correlation. A different test of persistence gives us significantly different results. The reason for this is that there are a lot of other factors that come into play in 3 year vs 3 year correlations that do not in even vs odd correlations. In the even vs odd correlations factors such as quality of teammates, quality of competition, zone starts, coaching tactics, etc. are non-factors because they should be almost exactly the same in the even seconds as in the odd seconds. This is not true for the 3 year vs 3 year correlation. The difference between the two methods is roughly the amount of the correlation that can be attributed to those other factors. True randomness, and thus true lack of persistence, is essentially the difference between 1.00 and the even vs odd correlation. This equates to 0.11 for GF20, 0.03 for FF20 and 0.27 for FSh%.
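For those who want to try this themselves, here is a rough sketch of the two persistence measures and the derived quantities; the per-player values below are synthetic, and in practice they would come from splitting six seasons of data into the two halves described above:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 300
# Synthetic per-forward GF20 values: one pair computed from the 2007-10 and
# 2010-13 halves and one pair from the even- and odd-second halves.
df = pd.DataFrame({
    "gf20_0710": rng.normal(0.85, 0.20, n), "gf20_1013": rng.normal(0.85, 0.20, n),
    "gf20_even": rng.normal(0.85, 0.20, n), "gf20_odd":  rng.normal(0.85, 0.20, n),
})

def persistence(df, stat):
    """The two split-half persistence measures plus the derived quantities."""
    r_years = df[f"{stat}_0710"].corr(df[f"{stat}_1013"])
    r_split = df[f"{stat}_even"].corr(df[f"{stat}_odd"])
    return {"3yr vs 3yr": r_years,
            "even vs odd": r_split,
            "difference (other factors)": r_split - r_years,
            "true randomness": 1.0 - r_split}

print(persistence(df, "gf20"))
```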

Now, let’s look at how well they correlate with a positive outcome, scoring goals. But instead of just looking at that, let’s combine it with persistence by looking at how well they predict ‘other half’ goal scoring.

Comparison 0710 vs 1013 Even vs Odd Difference
FF20 vs GF20 0.54 0.86 0.33
GF20 vs FF20 0.44 0.86 0.42
FSh% vs GF20 0.48 0.76 0.28
GF20 vs FSh% 0.57 0.77 0.20

As you can see, both FF20 and FSh% are highly correlated with GF20, and this is far more evident when looking at even vs odd than when looking at 3 year vs 3 year correlations. FF20 is more predictive of ‘other half’ GF20, but not significantly so, and this is likely solely due to the greater randomness of FSh% (due to sample size constraints), since within the same half FSh% is more correlated with GF20 than FF20 is: the correlation between even FF20 and even GF20 is 0.75 while the correlation between even FSh% and even GF20 is 0.90.

What is also interesting to note is that even vs odd provides greater benefit for identifying FF20 value and persistence than for FSh%. What this tells us is that the skills related to FF20 are not as persistent over time as the skills related to FSh%. I have seen this before. I think what this means is that GMs are valuing shooting percentage players more than fenwick players and thus are more likely to maintain a core of shooting percentage players on their team while letting fenwick players walk. Eric T. found that teams reward players for high shooting percentage more than high corsi so this is likely the reason we are seeing this.

Now, let’s take a look at how well FF20 correlates with FSh%.

Comparison 0710 vs 1013 Even vs Odd Difference
FF20 vs FSh% 0.38 0.66 0.28
FSh% vs FF20 0.22 0.63 0.42

It is interesting to note that fenwick rates are highly correlated with shooting percentages, especially when looking at the even vs odd data. What this tells us is that the skills a player needs to generate a lot of scoring chances are similar to the skills required to generate high-quality scoring chances. Skills like good passing, puck control and quickness can lead to better puck possession and thus more shots, but those same skills can also result in scoring at a higher rate on those chances. We know that this isn’t true for all players (see Scott Gomez) but generally speaking players that are good at controlling the puck are good at putting the puck in the net too.

Finally, let’s look at one more set of correlations. When looking at the above correlations for players with >1000 minutes in each ‘half’ of the data there are a lot of players that have significantly more than 1000 minutes and thus their ‘stats’ are more reliable. In any given year a top line forward will get 1000+ minutes of 5v5 ice time (there were 125 such players in 2011-12) but generally fewer than 1300 minutes (only 5 players had more than 1300 minutes in 2010-11). So, I took all the players that had more than 1000 even and odd minutes over the course of the past 6 seasons, but only those that had fewer than 2600 minutes in total. In essence, I took all the players that have between 1000 and 1300 even and odd minutes over the past 6 seasons. From this group of forwards I calculated the same correlations as above, and the results should tell us approximately how reliable (predictive) one season’s worth of data is for a front-line forward, assuming he played in exactly the same situation the following season.

Comparison Even vs odd
GF20 vs GF20 0.82
FF20 vs FF20 0.93
FSh% vs FSh% 0.63
FF20 vs GF20 0.74
GF20 vs FF20 0.77
FSh% vs GF20 0.65
GF20 vs FSh% 0.66
FF20 vs FSh% 0.45
FSh% vs FF20 0.40

It should be noted that because of the way I selected the players (limited ice time over the past 6 seasons) to be included in this calculation there is an abundance of 3rd liners, with a few players that reached retirement (e.g. Sundin) and young players (e.g. Henrique, Landeskog) mixed in. It would have been better to take the first 2600 minutes of each player and do even/odd on that, but I am too lazy to try and calculate that data, so the above is the best we have. There is far less diversity in the list of players used than in the NHL in general, so it is likely that for any particular player with between 1000 and 1300 minutes of ice time the correlations are stronger.

So, what does the above tell us? Once you factor out year over year changes in QoT, QoC, zone starts, coaching tactics, etc., GF20, FF20 and FSh% are all pretty highly persistent with just one year’s worth of data for a top line player. I think this is far more persistent, especially for FSh%, than most assume. The challenge is being able to isolate and properly account for changes in QoT, QoC, zone starts, coaching tactics, etc. This, in my opinion, is where the greatest challenge in hockey analytics lies. We need better methods for isolating individual contribution and adjusting for QoT, QoC, usage, etc. Whether that comes from better statistics or better analytical techniques or some combination of the two only time will tell, but in theory at least there should be a lot more reliable information within a single year’s worth of data than we are currently able to make use of.

 

Apr 17, 2013
 

Even though I am a proponent of shot quality and the idea that the percentages matter (shooting and save percentage), puck control and possession are still an important part of the game, and the Maple Leafs are dreadful at it. One of the better easily available metrics for measuring possession is fenwick percentage (FF%), which is the percentage of shot attempts (shots + shots that missed the net) that your team took. So a FF% of 52% would mean your team took 52% of the shot attempts while the opposing team took 48%. During 5v5 situations this season the Maple Leafs have a FF% of 44.4%, which is dead last in the NHL. So, who are the biggest culprits in dragging down the Maple Leafs’ possession game? Let’s take a look.

Forwards

Player Name FF% TMFF% OppFF% FF% – TMFF% FF%-TMFF%+OppFF%-0.5
MACARTHUR, CLARKE 0.485 0.44 0.507 0.045 0.052
KESSEL, PHIL 0.448 0.404 0.507 0.044 0.051
KOMAROV, LEO 0.475 0.439 0.508 0.036 0.044
KADRI, NAZEM 0.478 0.444 0.507 0.034 0.041
GRABOVSKI, MIKHAIL 0.45 0.424 0.508 0.026 0.034
VAN_RIEMSDYK, JAMES 0.456 0.433 0.508 0.023 0.031
FRATTIN, MATT 0.475 0.448 0.504 0.027 0.031
LUPUL, JOFFREY 0.465 0.445 0.502 0.02 0.022
BOZAK, TYLER 0.437 0.453 0.508 -0.016 -0.008
KULEMIN, NIKOLAI 0.421 0.454 0.51 -0.033 -0.023
ORR, COLTON 0.401 0.454 0.5 -0.053 -0.053
MCLAREN, FRAZER 0.388 0.443 0.501 -0.055 -0.054
MCCLEMENT, JAY 0.368 0.459 0.506 -0.091 -0.085

FF% is the player’s FF% when he is on the ice, expressed in decimal form. TMFF% is an average of the player’s teammates’ FF% when they are not playing with the player in question (i.e. what his teammates do when they are separated from him, a quality of teammate metric). OppFF% is an average of the player’s opponents’ FF% (i.e. a quality of competition metric). From those base stats I took FF% – TMFF%, which tells us which players perform better than their teammates do when they aren’t playing with him (the higher the better). Finally, I factored in OppFF% by adding in how much above 50% their opposition is on average. This gets us an all-encompassing stat to indicate who the drags on the Leafs’ possession game are.
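To make the calculation concrete, here is a small sketch using the McClement and Kessel rows from the table above (any spreadsheet would do; pandas is just convenient):

```python
import pandas as pd

# Values taken from the forwards table above (FF%, TMFF%, OppFF% as decimals).
leafs = pd.DataFrame({
    "player": ["MCCLEMENT, JAY", "KESSEL, PHIL"],
    "ff_pct": [0.368, 0.448],
    "tmff_pct": [0.459, 0.404],
    "oppff_pct": [0.506, 0.507],
})

# Relative-to-teammates measure, then the quality-of-competition adjustment:
# a positive value means the player lifts his teammates' possession numbers.
leafs["rel_tm"] = leafs["ff_pct"] - leafs["tmff_pct"]
leafs["rel_tm_qoc"] = leafs["rel_tm"] + (leafs["oppff_pct"] - 0.5)

print(leafs.sort_values("rel_tm_qoc", ascending=False))
```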

Jay McClement is the Leafs’ greatest drag on possession. A few weeks ago I posted an article visually showing how much of a drag on possession McClement has been this year and in previous years. McClement’s 5v5 FF% over the past 6 seasons is 46.2%, 46.8%, 45.3%, 47.5%, 46.2% and, this season, 36.8%.

Next up are the goons, Orr and McLaren, which is probably no surprise. They are more interested in looking for the next hit/fight than they are in the puck. In general they are low minute players so their negative impact is somewhat mitigated, but they are definite drags on possession.

Kulemin is the next biggest drag on possession, which might come as a bit of a surprise considering that he has generally been fairly decent in the past. Looking at the second WOWY chart here you can see that nearly every player has a worse CF% (the same as FF% but including shots that have been blocked) with Kulemin than without, except for McClement and, to a much smaller extent, Liles. This is dramatically different from previous seasons (see the second chart again) when the majority of players did equally well or better with Kulemin, save for Grabovski. Is Kulemin having an off year? It may seem so.

Next up is my favourite whipping boy, Tyler Bozak. Bozak is and has always been a drag on possession. Bozak ranks 293rd of 312 forwards in FF% this season (McClement is dead last!) and in the previous 2 seasons he ranked 296th of 323 players.

Among forwards, McClement, McLaren, Orr, Kulemin and Bozak appear to be the biggest drags on the Maple Leafs possession game this season.

Defense

Player Name FF% TMFF% OppFF% FF% – TMFF% FF%-TMFF%+OppFF%-0.5
FRANSON, CODY 0.469 0.437 0.506 0.032 0.038
GARDINER, JAKE 0.463 0.44 0.506 0.023 0.029
KOSTKA, MICHAEL 0.459 0.435 0.504 0.024 0.028
GUNNARSSON, CARL 0.455 0.437 0.506 0.018 0.024
FRASER, MARK 0.461 0.445 0.506 0.016 0.022
LILES, JOHN-MICHAEL 0.445 0.443 0.503 0.002 0.005
PHANEUF, DION 0.422 0.455 0.509 -0.033 -0.024
HOLZER, KORBINIAN 0.399 0.452 0.504 -0.053 -0.049
O_BYRNE, RYAN 0.432 0.505 0.499 -0.073 -0.074

O’Byrne is a recent addition to the Leafs’ defense so you can’t blame their possession woes on him, but in Colorado he was a dreadful possession player so he won’t be the answer to those woes either.

Korbinian Holzer was dreadful in a Leaf uniform this year, and we all know that, so no surprise there. Next up is Dion Phaneuf, the Leafs’ top-paid and presumably best defenseman. In FF%-TMFF%+OppFF%-0.5 Phaneuf ranked a little better in the previous 2 seasons (0.023 and 0.003), so it is possible that he is having an off year or had his stats dragged down a bit by Holzer, but regardless, he isn’t having a great season possession-wise.

 

 

Apr 11, 2013
 

Every now and again someone asks me how I calculate the HARO, HARD and HART ratings that you can find on stats.hockeyanalysis.com, and it is at that point I realize that I don’t have an up-to-date description of how they are calculated, so today I endeavor to write one.

First, let me define HARO, HARD and HART.

HARO – Hockey Analysis Rating Offense
HARD – Hockey Analysis Rating Defense
HART – Hockey Analysis Rating Total

So my goal when creating them was to create an offensive, a defensive and an overall rating for each and every player. Now, here is a step by step guide as to how they are calculated.

Calculate WOWY’s and AYNAY’s

The first step is to calculate WOWY’s (With Or Without You) and AYNAY’s (Against You or Not Against You). You can find goal and corsi WOWY’s and AYNAY’s on stats.hockeyanalysis.com for every player for 5v5, 5v5 ZS adjusted and 5v5 close zone start adjusted situations, but I calculate them for every situation you see on stats.hockeyanalysis.com, and for shots and fenwick as well; they just don’t get posted because it amounts to a massive amount of data.

(Distraction: 800 players playing against 800 other players means 640,000 data points for each of TOI, GF20, GA20, SF20, SA20, FF20, FA20, CF20 and CA20 when players are playing against each other and separate from each other, per season and situation, or about 17.28 million data points for AYNAY’s for a single season per situation. Now consider that when I do my 5 year ratings there are more like 1600 players, generating more than 60 million data points.)

Calculate TMGF20, TMGA20, OppGF20, OppGA20

What we need the WOWY’s for is to calculate TMGF20 (a TOI-weighted average GF20 of the player’s teammates when his teammates are not playing with him), TMGA20 (a TOI-weighted average GA20 of the player’s teammates when his teammates are not playing with him), OppGF20 (a TOI-against-weighted average GF20 of the player’s opponents when his opponents are not playing against him) and OppGA20 (a TOI-against-weighted average GA20 of the player’s opponents when his opponents are not playing against him).

So, let’s take a look at Alexander Steen’s 5v5 WOWY’s for 2011-12 to see how TMGF20 is calculated. The columns we are interested in are the ‘teammate when apart’ TOI and GF20 columns, which I will call TWA_TOI and TWA_GF20. TMGF20 is simply a TWA_TOI (teammate while apart time on ice) weighted average of TWA_GF20. This gives us a good indication of how Steen’s teammates perform offensively when they are not playing with Steen.

TMGA20 is calculated the same way but using TWA_GA20 instead of TWA_GF20. OppGF20 is calculated in a similar manner except using OWA_GF20 (Opponent while apart GF20) and OWA_TOI while OppGA20 uses OWA_GA20.

The reason I use ‘while not playing with/against’ data is that I don’t want the talent level of the player we are evaluating to influence his own QoT and QoC metrics (which is essentially what TMGF20, TMGA20, OppGF20 and OppGA20 are).
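A minimal sketch of the TMGF20 calculation, with made-up teammate rows standing in for Steen’s actual WOWY table:

```python
import pandas as pd

# Hypothetical 'teammate when apart' rows for one player: each teammate's TOI
# and GF20 while playing apart from him (the TWA_TOI and TWA_GF20 columns).
wowy = pd.DataFrame({
    "teammate": ["TEAMMATE A", "TEAMMATE B", "TEAMMATE C"],
    "TWA_TOI": [650.0, 540.0, 720.0],
    "TWA_GF20": [0.83, 0.91, 0.78],
})

# TMGF20: TOI-weighted average of the teammates' GF20 while apart.
tmgf20 = (wowy["TWA_GF20"] * wowy["TWA_TOI"]).sum() / wowy["TWA_TOI"].sum()
print(round(tmgf20, 3))
```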

Calculate first iteration of HARO and HARD

The first iteration of HARO and HARD is simple. I first calculate an expected GF20 and an expected GA20 based on the player’s teammates and opposition.

ExpGF20 = (TMGF20 + OppGA20)/2
ExpGA20 = (TMGA20 + OppGF20)/2

Then I calculate HARO and HARD as a percentage improvement:

HARO(1st iteration) = 100*(GF20-ExpGF20) / ExpGF20
HARD(1st iteration) = 100*(ExpGA20 – GA20) / ExpGA20

So, a HARO of 20 would mean that when the player is on the ice the goal rate of his team is 20% higher than one would expect based on how his teammates and opponents performed during the time when the player is not on the ice with/against them. Similarly, a HARD of 20 would mean the goals against rate of his team is 20% better (lower) than expected.

(Note: The OppGA20 that gets used is from the complementary situation. For 5v5 this means the opposition situation is also 5v5, but when calculating a rating for 5v5 leading the opposition situation is 5v5 trailing, so OppGF20 would be the OppGF20 calculated from 5v5 trailing data.)

Now for a second iteration

The first iteration used GF20 and GA20 stats, which is a good start, but after the first iteration we have teammate- and opponent-corrected evaluations of every player, which means we have better data about the quality of teammates and opponents the player has. This is where things get a little more complicated, because I need to calculate QoT and QoC metrics based on the first-iteration HARO and HARD values and then convert those into GF20 and GA20 equivalent numbers that I can compare the player’s GF20 and GA20 against.

To do this I calculate a TMHARO rating, which is a TWA_TOI weighted average of first-iteration HARO; TMHARD, OppHARO and OppHARD are calculated in a similar manner. Now I need to convert these to GF20 and GA20 based stats, which I do by multiplying by league average GF20 (LAGF20) and league average GA20 (LAGA20), and from there I can calculate expected GF20 and expected GA20.

ExpGF20(2nd iteration) = (TMHARO*LAGF20 + OppHARD*LAGA20)/2
ExpGA20(2nd iteration) = (TMHARD*LAGA20 + OppHARO*LAGF20)/2

From there we can get a second iteration of HARO and HARD.

HARO(2nd iteration) = 100*(GF20-ExpGF20) / ExpGF20
HARD(2nd iteration) = 100*(ExpGA20 – GA20) / ExpGA20

Now we iterate again and again…

Now we repeat the above step over and over again using the previous iterations HARO and HARD values at every step.

Now calculate HART

Once we have done enough iterations we can calculate HART from the final iteration’s HARO and HARD values.

HART = (HARO + HARD) /2
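Putting the steps together, here is a rough sketch of the whole procedure in code. It assumes the WOWY-derived inputs and the apart-TOI weight matrices have already been built, it ignores the complementary-situation detail noted above, and it treats a rating as a multiplier of league average when converting back to GF20/GA20 equivalents (a rating of +20 becomes 1.20x league average), which is one reasonable reading of “multiplying by league average”:

```python
import pandas as pd

def hart_ratings(stats, tm_w, opp_w, la_gf20, la_ga20, n_iter=25):
    """Sketch of the iterative HARO/HARD/HART calculation.

    stats:  DataFrame indexed by player with columns GF20, GA20 and the
            WOWY-derived TMGF20, TMGA20, OppGF20, OppGA20.
    tm_w:   player x teammate DataFrame of 'while apart' TOI weights
            (a player's own column weight is zero).
    opp_w:  player x opponent DataFrame of 'while apart' TOI-against weights.
    """
    # First iteration: expectations straight from teammate/opponent rates.
    exp_gf = (stats["TMGF20"] + stats["OppGA20"]) / 2
    exp_ga = (stats["TMGA20"] + stats["OppGF20"]) / 2
    haro = 100 * (stats["GF20"] - exp_gf) / exp_gf
    hard = 100 * (exp_ga - stats["GA20"]) / exp_ga

    def weighted(weights, ratings):
        # TOI-weighted average of the other players' ratings, converted to a
        # multiplier of league average (+20 -> 1.20).
        avg = weights.mul(ratings, axis=1).sum(axis=1) / weights.sum(axis=1)
        return 1 + avg / 100

    # Subsequent iterations: rebuild expectations from the current ratings.
    for _ in range(n_iter):
        exp_gf = (weighted(tm_w, haro) * la_gf20 + weighted(opp_w, hard) * la_ga20) / 2
        exp_ga = (weighted(tm_w, hard) * la_ga20 + weighted(opp_w, haro) * la_gf20) / 2
        haro = 100 * (stats["GF20"] - exp_gf) / exp_gf
        hard = 100 * (exp_ga - stats["GA20"]) / exp_ga

    return pd.DataFrame({"HARO": haro, "HARD": hard, "HART": (haro + hard) / 2})
```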

Now do the same for Shot, Fenwick and Corsi data

The above is for goal ratings but I have Shot, Fenwick and Corsi ratings as well and these can be calculated in the exact same way except using SF20, SA20, FF20, FA20, CF20 and CA20.

What about goalies?

Goalies are a little unique in that they only really play the defensive side of the game. For this reason I do not include goalies in calculating TMGF20 and OppGF20. For shot, fenwick and corsi ratings I do not include the goalies on the defensive side of things either, as I assume a goalie will not influence shots against (though this may not be entirely true, as some goalies may be better at controlling rebounds and thus secondary shots, but I’ll assume this is a minimal effect if it does exist). The result is that goalies have a goal-based HARD rating but no HARO rating, and no shot/fenwick/corsi based HARD or HARO ratings.

I hope this helps explain how my hockey analysis ratings are calculated but if you have any followup questions feel free to ask them in the comments.

 

Apr 05, 2013
 

I often get asked questions about hockey analytics and fancy stats, how to use them, what they mean, etc., and there are plenty of good places to find definitions of the various hockey stats, but sometimes what is more important than a definition is some guidance on how to use them. So, with that said, here are several tips I have for people using advanced hockey stats.

Don’t over value Quality of Competition

I can’t count how often I’ll point out one player’s poor stats or another player’s good stats and immediately get the response “Yeah, but he always plays against the opponents’ best players” or “Yeah, but he doesn’t play against the opposition’s best players”, but most people who say that kind of thing have no real idea how much quality of competition actually affects a player’s statistics. The truth is it is not nearly as much as you might think. Despite some coaches desperately trying to employ line-matching techniques, the variation in quality of competition is dwarfed by the variation in quality of teammates, individual talent, and on-ice results. An analysis of Pavel Datsyuk and Valtteri Filppula showed that if Filppula had Datsyuk’s quality of competition his CorsiFor% would drop from 51.05% to 50.90% and his GoalsFor% would drop from 55.65% to 55.02%. In the grand scheme of things, these are relatively minor factors.

Don’t over value Zone Stats either

Like quality of competition, many people will use zone starts to justify a player’s good/poor statistics. The truth is zone starts are not a significant factor either. I have found that the effect of zone starts is largely eliminated about 10 seconds after a face off, and this has been found to be true by others as well. I account for zone starts by eliminating the 10 seconds after an offensive or defensive zone face off, and I have found doing this has relatively little effect on a player’s stats. Henrik Sedin is maybe the most extreme case of a player getting primarily offensive zone starts, and factoring out those zone starts takes him from a 55.2% fenwick% player to a 53.8% fenwick% player. Even in the most extreme case there is only about a 1.5% impact on a player’s fenwick%, and the majority of players are nowhere close to the zone start bias of Henrik Sedin. For the majority of players you are probably talking something under a 0.5% impact on their fenwick%. As for individual stats, over the last 3 seasons H. Sedin had 34 goals and 172 points in 5v5 situations, and just 2 goals and 14 points came within 10 seconds of a zone face off, or about 5 points a year. If instead of 70% offensive zone face off deployment he had 50% offensive zone face off deployment, instead of having 14 points during that 10-second post-faceoff time he may have had 10. That’s a 4 point differential over 3 years for a guy who scored 172 points. In simple terms, about 2.3% of H. Sedin’s 5v5 points can be attributed to his offensive zone start bias.

A derivative of this is that if zone starts don’t matter much, a player’s face off winning percentage probably doesn’t matter much either, which is consistent with other studies. It’s a nice skill to have, but not one worth a lot.
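For reference, the 10-second zone start adjustment described above amounts to a simple filter on event-level data. A minimal sketch, with invented column names and events:

```python
import pandas as pd

# Hypothetical 5v5 event log: one row per shot attempt/goal with the game
# second it occurred at and the second of the most recent offensive- or
# defensive-zone faceoff for the players on the ice.
events = pd.DataFrame({
    "game_second": [312, 318, 345, 400],
    "last_oz_dz_faceoff_second": [310, 310, 290, 330],
    "is_goal": [False, True, False, True],
})

# Zone-start adjusted view: drop anything within 10 seconds of an
# offensive/defensive zone faceoff, keep the rest of the play.
adjusted = events[events["game_second"] - events["last_oz_dz_faceoff_second"] > 10]
print(adjusted)
```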

Do not ignore Quality of Teammates

I have just told you to pretty much ignore quality of competition and zone starts, so what about quality of teammates? Well, to put it simply, do not ignore them. Quality of teammates matters, and matters a lot. Sticking with the Vancouver Canucks, let’s use Alex Burrows as an example. Burrows mostly plays with the Sedin twins but has played on Kesler’s line a bit too. Over the past 3 seasons he has played about 77.9% of his ice time with H. Sedin, about 12.3% of his ice time with Ryan Kesler, and the remainder with Malhotra and others. Burrows’ offensive production is significantly better when playing with H. Sedin, as 88.7% of his goals and 87.2% of his points came during the 77.9% of his ice time he played with H. Sedin. If Burrows played 100% of his ice time with H. Sedin and produced at the same rate he would have scored 6 (9.7%) more goals and 13 (11%) more 5v5 points over the past 3 seasons. This is far more significant than the 2.3% boost H. Sedin saw from all his offensive zone starts, and I am not certain my Burrows example is the most extreme example in the NHL. How many more points would an average 3rd liner get if he played mostly with H. Sedin instead of other average 3rd liners? Who you play with matters a lot. You can’t look at Tyler Bozak’s decent point totals and conclude he is a decent player without considering that he plays a lot with Kessel and Lupul, two very good offensive players.

Opportunity is not talent

Kind of along the same lines as the quality of teammates discussion, we must be careful not to confuse opportunity-driven results with talent. Over the past 2 seasons Corey Perry has the second most goals of any forward in the NHL, trailing only Steven Stamkos. That might seem impressive, but it is a little less so when you consider Perry also had the 4th most 5v5 minutes during that time and the 11th most 5v4 minutes. Perry is a good goal scorer, but a lot of his goal total comes from opportunity (ice time) as much as from individual talent. Among forwards with at least 1500 minutes of 5v5 ice time over the past 2 seasons, Perry ranks just 30th in goals per 60 minutes of ice time. That’s still good, but far less impressive than being second only to Steven Stamkos, and he is actually well behind teammate Bobby Ryan (6th) in this metric. Perry is a very good player but he benefits more than most from getting a lot of ice time, including PP ice time. Perry’s goal production is in large part talent, but it is also somewhat opportunity driven, and we need to keep this in perspective.

Don’t ignore the percentages (shooting and save)

The percentages matter, particularly shooting percentages. I have shown that players can sustain elevated on-ice shooting percentages, I have shown that players can have an impact on their linemates’ shooting percentages, and Tom Awad has shown that a significant portion of the difference between good players and bad players is finishing ability (shooting percentage). There is even evidence that goal based metrics (which incorporate the percentages) are a better predictor of post season success than fenwick based metrics. What corsi/fenwick metrics have going for them is more reliability over small sample sizes, but once you approach a full season’s worth of data that benefit is largely gone and you get more benefit from having the percentages factored into the equation. If you want to get a better understanding of what considering the percentages can do for you, try doing a Malkin vs Gomez comparison or a Crosby vs Tyler Kennedy comparison over the past several years. Gomez and Kennedy actually look like relatively decent comparisons if you just consider shot based metrics, but both are terrible percentage players while Malkin and Crosby are excellent percentage players, and it is the percentages that make Malkin and Crosby so special. This is an extreme example but the percentages should not be ignored if you want a true representation of a player’s abilities.

More is definitely better

One of the reasons many people have jumped on the shot attempt/corsi/fenwick bandwagon is that shot attempts are more frequent events than goals and thus give you more reliable metrics. This is true over small sample sizes but, as explained above, the percentages matter too and should not be ignored. Luckily, for most players we have ample data to get past the sample size issues. There is no reason to evaluate a player based on half a season’s data if that player has been in the league for several years. Look at 2, 3, 4 years of data. Look for trends. Is the player consistently a high corsi player? Is the player consistently a high shooting percentage player? Is the player improving? Declining? I have shown on numerous occasions that goals are a better predictor of future goal rates than corsi/fenwick starting at about one year of data, but multiple years are definitely better. Any conclusion about a player’s talent level using a single season of data or less (regardless of whether it is corsi or goal based) is subject to a significant level of uncertainty. We have multiple years of data for the majority of players, so use it. I even aggregate multiple years into one data set on stats.hockeyanalysis.com so it isn’t even time consuming. The data is there, use it. More is definitely better.

WOWY’s are where it is at

In my mind WOWY’s are the best tool for advanced player evaluation. WOWY stands for “with or without you” and looks at how a player performs while on the ice with a teammate and while on the ice without that teammate. What WOWY’s can tell you is whether a particular player is a core player driving team success or a player along for the ride. Players that consistently make their teammates’ statistics better when they are on the ice with them are the players you want on your team. Anze Kopitar is an example of a player who consistently makes his teammates better. Jack Johnson is an example of a player that does not, particularly when looking at goal based metrics. Then there are a large number of good players that neither drive your team’s success nor hold it back, or as I like to say, complementary players. Ideally you build your team around a core of players like Kopitar that will drive success, fill it out with a group of complementary players, and quickly rid yourself of players like Jack Johnson that act as drags on the team.
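At its core a WOWY is just a split of on-ice results into ‘with’ and ‘without’ buckets. A minimal sketch with invented segment data, for one player-teammate pair:

```python
import pandas as pd

# Hypothetical 5v5 on-ice segments: for each stretch of play we know which of
# the two players were on the ice, the minutes played, and goals for.
segments = pd.DataFrame({
    "player_on":   [True, True, False, False, True],
    "teammate_on": [True, False, True, False, True],
    "toi_min": [400.0, 250.0, 300.0, 500.0, 150.0],
    "gf": [22, 9, 11, 18, 8],
})

def gf20(df):
    """Goals for per 20 minutes over a set of segments."""
    return 20 * df["gf"].sum() / df["toi_min"].sum()

together = segments[segments["player_on"] & segments["teammate_on"]]
teammate_apart = segments[~segments["player_on"] & segments["teammate_on"]]

print("GF20 together:", round(gf20(together), 2))
print("GF20 teammate without player:", round(gf20(teammate_apart), 2))
```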

 

Apr 05, 2013
 

Yesterday HabsEyesOnThePrize.com had a post on the importance of fenwick come playoff time over the past 5 seasons. It is definitely worth a look so go check it out. In the post they look at FF% in 5v5close situations and see how well it translates into post season success. I wanted to take this a step further and take a look at PDO and GF% in 5v5close situations to see if they translate into post season success as well. Here is what I found:

Group N Avg Playoff Avg Cup Winners Lost Cup Finals Lost Third Round Lost Second Round Lost First Round Missed Playoffs
GF% > 55 19 2.68 2.83 5 1 2 6 4 1
GF% 50-55 59 1.22 1.64 0 2 6 10 26 15
GF% 45-50 52 0.62 1.78 0 2 2 4 10 34
GF% <45 20 0.00 - 0 0 0 0 0 20
FF% > 53 23 2.35 2.35 3 2 4 5 9 0
FF% 50-53 55 1.15 1.70 2 2 1 10 22 18
FF% 47-50 46 0.52 1.85 0 0 4 3 6 33
FF% <47 26 0.54 2.00 0 1 1 2 3 19
PDO >1010 27 1.63 2.20 2 2 2 6 8 7
PDO 1000-1010 42 1.17 1.75 1 0 5 7 15 14
PDO 990-1000 47 0.91 1.95 2 1 3 4 12 25
PDO <990 34 0.56 1.90 0 2 0 3 5 24

I have grouped GF%, FF% and PDO into four categories each (the very good, the good, the mediocre and the bad) and I have looked at how many teams made it to each round of the playoffs from each group. If we say that winning the cup is worth 5 points, getting to the finals is worth 4, getting to the 3rd round is worth 3, getting to the second round is worth 2, and making the playoffs is worth 1, then the Avg column is the average point total for the teams in that grouping. The Playoff Avg column is the average point total for teams that made the playoffs.
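To be explicit about how the Avg and Playoff Avg columns are derived, here is the arithmetic for the GF% > 55 row from the table above, as a small sketch:

```python
# Round counts for the GF% > 55 group, taken from the table above.
counts = {"won_cup": 5, "lost_final": 1, "lost_third": 2,
          "lost_second": 6, "lost_first": 4, "missed_playoffs": 1}
# Point values: cup = 5, reached final = 4, reached third round = 3,
# reached second round = 2, made playoffs = 1, missed = 0.
points = {"won_cup": 5, "lost_final": 4, "lost_third": 3,
          "lost_second": 2, "lost_first": 1, "missed_playoffs": 0}

total_points = sum(counts[k] * points[k] for k in counts)
n_teams = sum(counts.values())
n_playoff_teams = n_teams - counts["missed_playoffs"]

print("Avg:", round(total_points / n_teams, 2))                  # 2.68
print("Playoff Avg:", round(total_points / n_playoff_teams, 2))  # 2.83
```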

As HabsEyesOnThePrize.com found, 5v5close FF% is definitely an important factor in making the playoffs and enjoying success in the playoffs. That said, GF% seems to be slightly more significant. All 5 Stanley Cup winners came from the GF% > 55 group while only 3 cup winners came from the FF% > 53 group, and both Avg and Playoff Avg are higher in the GF% > 55 group than in the FF% > 53 group. PDO only seems marginally important, though teams with a very good PDO do have a slightly better chance of going deeper into the playoffs. Generally speaking, if you are trying to predict a Stanley Cup winner, 5v5close GF% is probably a better metric to look at than 5v5close FF% and certainly better than PDO. Now, considering this is a significantly shorter season than usual, that may not hold this year, as luck may be a bit more of a factor in GF% than usual, but historically it has been the case.

So, who should we look at for playoff success this season? Well, there are currently 9 teams with a 5v5close GF% > 55. Those are Anaheim, Boston, Pittsburgh, Los Angeles, Montreal, Chicago, San Jose, Toronto and Vancouver. No other teams are above 52.3%, so that list is unlikely to get any new additions before season’s end, though some teams could certainly fall out of the above-55% group. Now, if we also only consider teams that have a 5v5close FF% > 50%, then Toronto and Anaheim drop off the list, leaving you with Boston, Pittsburgh, Los Angeles, Montreal, Chicago, San Jose and Vancouver as your Stanley Cup favourites, but we all pretty much knew that already, didn’t we?

 

Feb 27, 2013
 

Over the last several days I have been playing around a fair bit with team data, analyzing various metrics for their usefulness in predicting future outcomes, and I have come across some interesting observations. Specifically, with more years of data, fenwick becomes significantly less important/valuable while goals and the percentages become more important/valuable. Let me explain.

Let’s first look at the year over year correlations in the various stats themselves.

Stat Y1 vs Y2 Y12 vs Y34 Y123 vs Y45
FF% 0.3334 0.2447 0.1937
FF60 0.2414 0.1635 0.0976
FA60 0.3714 0.2743 0.3224
GF% 0.1891 0.2494 0.3514
GF60 0.0409 0.1468 0.1854
GA60 0.1953 0.3669 0.4476
Sh% 0.0002 0.0117 0.0047
Sv% 0.1278 0.2954 0.3350
PDO 0.0551 0.0564 0.1127
RegPts 0.2664 0.3890 0.3744

The above table shows the r^2 between past events and future events. The Y1 vs Y2 column is the r^2 between subsequent years (i.e. 0708 vs 0809, 0809 vs 0910, 0910 vs 1011, 1011 vs 1112). The Y12 vs Y34 column is a 2 year vs 2 year r^2 (i.e. 07-09 vs 09-11 and 08-10 vs 10-12) and the Y123 vs Y45 column is the 3 year vs 2 year comparison (i.e. 07-10 vs 10-12). RegPts is points earned during regulation play (using a win-loss-tie point system).
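For anyone wanting to reproduce this kind of number, the Y1 vs Y2 column is just the squared correlation of all consecutive-season pairs pooled together. A sketch with synthetic team values (the real calculation would use actual team-season data):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
teams = [f"T{i}" for i in range(30)]
seasons = ["0708", "0809", "0910", "1011", "1112"]
# Synthetic team-season FF% values; purely illustrative.
ff = pd.DataFrame(rng.normal(50, 2, (30, 5)), index=teams, columns=seasons)

def year_over_year_r2(df):
    """Pool all consecutive season pairs (Y1 vs Y2) and return r^2."""
    pairs = [df[[a, b]].rename(columns={a: "y1", b: "y2"})
             for a, b in zip(df.columns[:-1], df.columns[1:])]
    pooled = pd.concat(pairs, ignore_index=True)
    return pooled["y1"].corr(pooled["y2"]) ** 2

print(round(year_over_year_r2(ff), 4))
```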

As you can see, with increased sample size, the fenwick stats’ ability to predict future fenwick stats diminishes, particularly for fenwick for and fenwick %. All the other stats generally get better with increased sample size, except for shooting percentage which has no predictive power of future shooting percentage.

The increased predictive nature of the goal and percentage stats with increased sample size makes perfect sense as the increased sample size will decrease the random variability of these stats but I have no definitive explanation as to why the fenwick stats can’t maintain their predictive ability with increased sample sizes.

Let’s take a look at how well each statistic correlates with regulation points using various sample sizes.

Stat 1 year 2 year 3 year 4 year 5 year
FF% 0.3030 0.4360 0.5383 0.5541 0.5461
GF% 0.7022 0.7919 0.8354 0.8525 0.8685
Sh% 0.0672 0.0662 0.0477 0.0435 0.0529
Sv% 0.2179 0.2482 0.2515 0.2958 0.3221
PDO 0.2956 0.2913 0.2948 0.3393 0.3937
GF60 0.2505 0.3411 0.3404 0.3302 0.3226
GA60 0.4575 0.5831 0.6418 0.6721 0.6794
FF60 0.1954 0.3058 0.3655 0.4026 0.3951
FA60 0.1788 0.2638 0.3531 0.3480 0.3357

Again, the values are r^2 with regulation points. Nothing too surprising there, except maybe that team shooting percentage is so poorly correlated with winning, given that at the individual level shooting percentages are clearly highly correlated with goal scoring. It seems apparent from the table above that team save percentage is a significant factor in winning (or, as my fellow Leaf fans can attest, lack of save percentage is a significant factor in losing).

The final table I want to look at is how good a few of the stats are at predicting future regulation-time point totals.

Stat Y1 vs Y2 Y12 vs Y34 Y123 vs Y45
FF% 0.2500 0.2257 0.1622
GF% 0.2214 0.3187 0.3429
PDO 0.0256 0.0534 0.1212
RegPts 0.2664 0.3890 0.3744

The values are r^2 with future regulation point totals. Regardless of the time frame used, past regulation point totals are the best predictor of future regulation point totals. Single-season FF% is slightly better than single-season GF% at predicting the following season’s regulation point totals, but with 2 or more years of data GF% becomes a significantly better predictor, as the predictive ability of GF% improves while that of FF% declines. This makes sense, as we observed earlier that increasing the sample size improves GF%’s ability to predict future GF% while FF%’s gets worse, and that GF% is more highly correlated with regulation point totals than FF%.

One thing that is clear from the above tables is that defense has been far more important to winning than offense. Regardless of whether we look at GF60, FF60, or Sh%, their level of importance trails that of their defensive counterparts (GA60, FA60 and Sv%), usually significantly. The defensive stats correlate more highly with winning and are more consistent from year to year. Defense and goaltending win in the NHL.

What is interesting though is that this largely differs from what we see at the individual level. At the individual level there is much more variation in the offensive stats, indicating individual players have more control over the offensive side of the game. This might suggest that team philosophies drive the defensive side of the game (i.e. how defensive minded the team is, the playing style, etc.) while the offensive side of the game is dominated more by the offensive skill level of the individual players. At the very least it is something worthy of further investigation.

The last takeaway from this analysis is the declining predictive value of fenwick/corsi with increased sample size. I am not quite sure what to make of this. If anyone has any theories I’d be interested in hearing them. One theory I have is that fenwick rates are not a part of the average GM’s player personnel decisions, and thus over time, as players come and go, a team’s fenwick rates will begin to vary. If this is the case, then this may represent an area of value that a GM could exploit.

 

Feb 21, 2013
 

Over the past few years I have had a few discussions with other Leaf fans about the relative merits of Francois Beauchemin. Many Leaf fans argue that he is a good 2-way defenseman who can play tough minutes and is the kind of defenseman the Leafs are still in need of. I, on the other hand, have never had quite as optimistic a view of Beauchemin and I don’t think he would make this team any better.

On some level I think part of the difference in opinion is that many look at his corsi numbers, which aren’t too bad, while I prefer to look at his goal numbers, which have generally not been so good. So, let’s take a look at Beauchemin’s WOWY numbers and see if there is in fact a divergence between his corsi WOWY numbers and his goal WOWY numbers, starting with his 2009-11 5v5 CF% WOWY.

[Chart: Beauchemin 2009-11 CF% WOWY]

I have included a diagonal line which is kind of a ‘neutral’ line where players perform equally well with and without Beauchemin. Anything to the right/below the line indicates the player played better with Beauchemin than without, and anything to the left/above means they played worse with Beauchemin. As you can see, the majority of players had a better CF% with Beauchemin than without. Now, let’s take a look at GF% WOWY.

[Chart: Beauchemin 2009-11 GF% WOWY]

While a handful of players had a better GF% with Beauchemin, the majority were a little worse off. There is a clear difference between Beauchemin’s CF% WOWY and his GF% WOWY. What is interesting is that this difference can be observed in 2007-08, 2009-10, 2010-11, and 2011-12 (he was injured for much of 2008-09 so his WOWY data is not reliable due to the smaller sample size). Looking at his 5-year WOWY charts you get a clear picture that Beauchemin seemingly has a skill for ‘driving play’ but not ‘driving goals’. Let’s dig a little further to see if we can determine what his ‘problem’ is by looking at his 2009-11 two-year CF20, GF20, CA20 and GA20 WOWY’s.

CF20:

[Chart: Beauchemin 2009-11 CF20 WOWY]

GF20:

[Chart: Beauchemin 2009-11 GF20 WOWY]

As you can clearly see, Beauchemin appears to be much better at generating shots and shot attempts than he is at generating goals. The majority of players have a higher corsi for rate when with Beauchemin than when not with Beauchemin but the majority also have a lower goals for rate. What about ‘against’ rates?

CA20:

[Chart: Beauchemin 2009-11 CA20 WOWY]

GA20:

[Chart: Beauchemin 2009-11 GA20 WOWY]

For CA20 and GA20 it is better to be above/left of the diagonal line because, unlike GF%/CF%/GF20/CF20, it is better to have a smaller number than a larger one. There doesn’t seem to be quite as much of a difference between CA20 and GA20 as with CF20 and GF20, so the difference between CF% and GF% is driven by the inability to convert shots and shot attempts into goals as opposed to the defensive side of the game. That said, there is no clear evidence that Beauchemin makes his teammates any better defensively.

There are two points I wanted to make with this post.

  1. Leaf fans probably shouldn’t be missing Beauchemin.
  2. For a lot of players a corsi evaluation will give you a reasonable picture, but there are also many players for whom a corsi evaluation will not tell the complete story. Some players consistently see a divergence between their goal stats and their corsi stats, and it is important to take that into consideration.

 

Jun 26, 2012
 

I have had a lot of battles with the pro-corsi crowd with regard to the merits of using Corsi as a player evaluation tool.  I still get people dismissing my goal based analysis (which seems really strange since goals are what matters in hockey), so I figured I should summarize my position in one easy-to-understand post.  So, with that, here are the significant reasons why I don’t like to use a corsi based player analysis.

1.  Look at the list of players with the top on-ice shooting percentage over the past 5 seasons and compare it to the list of players with the top corsi for per 20 minutes of ice time and you’ll find that the shooting percentage list is far more representative of top offensive players than the top corsi for list.

2.  Shooting percentage is a talent and is sustainable; three year shooting percentage is as good a predictor of the following 2 seasons’ goal scoring rates as 3 year fenwick rates are, and 3 year goal rates are a far better predictor.

r^2
2007-10 FF20 vs 2010-12 GF20 0.253
2007-10 SH% vs 2010-12 GF20 0.244
2007-10 GF20 vs 2010-12 GF20 0.363

3.  I have even shown that one year GF20 is on average as good a predictor of the following season’s GF20 as FF20 is of the following season’s FF20, so with even just one full season of data goal rates are as good a metric of offensive talent as fenwick rates are.  Only when the sample size is less than one season (and for almost all NHL regulars we have at least a season’s worth of data) is fenwick rate a better metric for evaluating offensive talent.

4.  Although difficult to identify, I believe I have shown players can suppress opposition shooting percentage.

5.  Zone starts affect shots/corsi/fenwick stats significantly more than they affect goal stats thus the non-adjusted shot/corsi/fenwick data are less useful than the non-adjusted goal data.

6.  Although not specifically a beef with Corsi, much of the corsi analysis currently being done does not split out offensive corsi and defensive corsi but rather looks at them as a percentage or as a +/- differential.  I believe this is a poor way of doing analysis because it really is useful to know whether a player is good because he produces a lot of offense or whether the player is good because he is great defensively.  Plus, when evaluating a player offensively we need to consider the offensive capability of his team mates and the defensive capability of his opposition, not the overall ability of those players.

7.  I have a really hard time believing that 8 of the top 9 corsi % players over the past 5 seasons are Red Wings because they are all supremely talented, and that it has nothing to do with the system they play or some other non-individual talent factor.

8.  Try doing a Malkin vs Gomez fenwick/corsi comparison and now do the same with goals.  Gomez actually has a very good and very comparable fenwick rating to Malkin, but Malkin is a far better player at producing goals thanks to his far superior on-ice shooting percentage (FSh% = fenwick shooting percentage = goals / fenwick for).  Gomez every single season has a much poorer on-ice shooting percentage than Malkin and this is why Malkin is the far better player.  Fenwick/Corsi doesn’t account for this.

Season(s) Malkin FF20 Gomez FF20 Malkin GF20 Gomez GF20 Malkin FSh% Gomez FSh%
2011-12 16.5 14.0 1.301 0.660 7.9% 4.7%
2010-11 16.1 16.4 0.949 0.534 5.9% 3.3%
2009-10 15.3 14.2 1.112 0.837 7.3% 5.9%
2008-09 12.4 16.8 1.163 0.757 9.4% 4.5%
2007-08 14.1 15.9 1.206 0.792 8.5% 5.0%
2007-11 14.7 14.7 1.171 0.745 8.0% 5.1%

 

So there you have it.  Those are some of the main reasons why I don’t use corsi in player analysis.  This isn’t to say Corsi isn’t a useful metric.  It is a useful metric in identifying which players are better at controlling play. Unfortunately, controlling play is only part of the game so if you want to conduct a complete thorough evaluation of a player, goal based stats are required.

 

Apr 19, 2012
 

Prior to the season Gabe Desjardins and I had a conversation over at MC79hockey.com where I predicted several players would combine for a 5v5 on-ice shooting percentage above 10.0% while league average is just shy of 8.0%.  I documented this in a post prior to the season.  In short, I predicted the following:

  • Crosby, Gaborik, Ryan, St. Louis, H. Sedin, Toews, Heatley, Tanguay, Datsyuk, and Nathan Horton will have a combined on-ice shooting percentage above 10.0%
  • Only two of those 10 players will have an on-ice shooting percentage below 9.5%

So, how did my prediction fare?  The following table tells all.

Player GF SF SH%
SIDNEY CROSBY 31 198 15.66%
MARTIN ST._LOUIS 74 601 12.31%
ALEX TANGUAY 43 371 11.59%
MARIAN GABORIK 57 582 9.79%
JONATHAN TOEWS 51 525 9.71%
NATHAN HORTON 34 359 9.47%
HENRIK SEDIN 62 655 9.47%
BOBBY RYAN 52 552 9.42%
PAVEL DATSYUK 50 573 8.73%
DANY HEATLEY 42 611 6.87%
Totals 496 5027 9.87%

Well, technically neither of my predictions came true. Only 5 players had on-ice shooting percentages above 9.5% and as a group they did not maintain a shooting percentage above 10.0%. That said, my prediction wasn’t all that far off. 8 of the 10 players had an on-ice shooting percentage at or above 9.42% and as a group they had an on-ice shooting percentage of 9.87%. If Crosby had been healthy for most of the season or the Minnesota Wild hadn’t sucked so badly, the group would have reached the 10.0% mark. So, when all is said and done, while technically my predictions didn’t come perfectly true, the intent of the prediction did. Shooting percentage is a talent, is maintainable, and can be used as a predictor of future performance.

I now have 5 years of on-ice data on stats.hockeyanalysis.com so I thought I would take a look at how sustainable shooting percentage is using that data. To do this I took all forwards with at least 350 minutes of 5v5 zone start adjusted ice time in each of the past 5 years and used the first 3 years of the data (2007-08 through 2009-10) to predict the final 2 years of data (2010-11 and 2011-12). This means we used at least 1050 minutes of data over 3 seasons to predict at least 700 minutes of data over 2 seasons. The following chart shows the results for on-ice shooting percentage.

Clearly there is some persistence in on-ice shooting percentage. How does this compare to something like fenwick for rates (using FF20 – fenwick for per 20 minutes)?

Ok, so FF20 seems to be more persistent, but that doesn’t take away from the fact that shooting percentage is persistent and a reasonable predictor of future shooting percentage.  (FYI, the guy out on his own in the upper left is Kyle Wellwood)

The real question is, is either of them any good at predicting future goal scoring rates (GF20 – goals for per 20 minutes)? Because really, goals are ultimately what matters in hockey.

Ok, so both on-ice shooting percentage and on-ice fenwick for rates are somewhat reasonable predictors of future on-ice goal for rates with a slight advantage to on-ice shooting percentage (sorry, just had to point that out).  This is not inconsistent with what I  found a year ago when I used 4 years of data to calculate 2 year vs 2 year correlations.

Of course, I would never suggest we use shooting percentage as a player evaluation tool, just as I don’t suggest we use fenwick as a player evaluation tool.  Both are sustainable, both can be used as predictors of future success, and both are true player skills, but the best predictor of future goal scoring is past goal scoring, as evidenced by the following chart.

That is pretty clear evidence that goal rates are the best predictor of future goal rates and thus, in my opinion anyway, the best player evaluation tool. Yes, there are still sample size issues with using goal rates for less than a full season’s worth of data, but for all those players for whom we have multiple seasons’ worth of data (or at least one full season with >~750 minutes of ice time), using anything other than goals as your player evaluation tool will potentially lead to less reliable and less accurate player evaluations.

As for the defensive side of the game, I have not found a single reasonably good predictor of future goals against rates, regardless of whether I look at corsi, fenwick, goals, shooting percentage or anything else.  This isn’t to suggest that players can’t influence defense, because I believe they can, but rather that there are too many other factors that I haven’t figured out how to isolate and remove from the equation.  Most important is the goalie and I feel the most difficult question to answer in hockey statistics is how to separate the goalie from the defenders. Plus, I believe there are far fewer players that truly focus on defense and thus goals against is largely driven by the opposition.

Note:  I won’t make any promises but my intention is to make this my last post on the subject of sustainability of on-ice shooting percentage and the benefit of using a goal based player analysis over a corsi/fenwick based analysis.  For all those who still fail to realize goals matter more than shots or shot attempts there is nothing more I can say.  All the evidence is above or in numerous other posts here at hockeyanalysis.com.  On-ice shooting percentage is a true player talent that is both sustainable and a viable predictor of future performance at least on par with fenwick rates.  If you choose to ignore reality from this point forward, it is at your own peril.