Apr 05, 2013
 

I often get asked questions about hockey analytics, hockey fancy stats, how to use them, what they mean, etc. There are plenty of good places to find definitions of the various hockey stats, but sometimes guidelines on how to use them are more important than a definition. With that said, here are several tips for people using advanced hockey stats.

Don’t overvalue Quality of Competition

I don’t know how often I’ll point out one player’s poor stats or another player’s good stats and immediately get the response “Yeah, but he always plays against the opponent’s best players” or “Yeah, but he doesn’t play against the opposition’s best players.” Most people who say that kind of thing have no real idea how much quality of opponent will affect the player’s statistics. The truth is it is not nearly as much as you might think. Despite some coaches desperately trying to employ line matching techniques, the variation in quality of competition metrics is dwarfed by the variation in quality of teammates, individual talent, and on-ice results. An analysis of Pavel Datsyuk and Valtteri Filppula showed that if Filppula had Datsyuk’s quality of competition his CorsiFor% would drop from 51.05% to 50.90% and his GoalsFor% would drop from 55.65% to 55.02%. In the grand scheme of things, these are relatively minor effects.

Don’t overvalue Zone Starts either

Like quality of competition, many people will use zone starts to justify a player’s good or poor statistics. The truth is zone starts are not a significant factor either. I have found that the effect of a zone start is largely gone about 10 seconds after the face off, and others have found the same. I account for zone starts by excluding the 10 seconds after an offensive or defensive zone face off, and I have found that doing this has relatively little effect on a player’s stats. Henrik Sedin is maybe the most extreme case of a player getting primarily offensive zone starts, and all those zone starts took him from a 55.2% fenwick% player to a 53.8% fenwick% player when zone starts are factored out. So even in the most extreme case there is only about a 1.5 percentage point impact on a player’s fenwick%, and the majority of players are nowhere close to the zone start bias of Henrik Sedin. For the majority of players you are probably talking about something under 0.5 percentage points. As for individual stats, over the last 3 seasons H. Sedin had 34 goals and 172 points in 5v5 situations and just 2 goals and 14 points came within 10 seconds of a zone face off, or about 5 points a year. If he had 50% offensive zone face off deployment instead of 70%, he may have had 10 points during the 10 second zone face off time instead of 14. That’s a 4 point differential over 3 years for a guy who scored 172 points. In simple terms, about 2.3% of H. Sedin’s 5v5 points can be attributed to his offensive zone start bias.
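The Sedin arithmetic above can be laid out as a short sketch. The point totals and deployment shares are the ones quoted in this post; the assumption that points in the 10 second window scale linearly with offensive zone face off share is a simplification made for illustration, and the variable names are made up.

```python
# Figures quoted in the post for H. Sedin, 5v5, over 3 seasons.
total_points = 172
points_within_10s = 14    # points scored within 10 seconds of a zone face off
actual_oz_share = 0.70    # offensive zone face off deployment
neutral_oz_share = 0.50   # what an even deployment would look like

# Assumption: points in the 10 second window scale linearly with offensive zone share.
expected_points_neutral = points_within_10s * neutral_oz_share / actual_oz_share
points_from_bias = points_within_10s - expected_points_neutral
share_of_total = points_from_bias / total_points

print(f"Expected window points at 50% deployment: {expected_points_neutral:.0f}")
print(f"Points attributable to zone start bias: {points_from_bias:.0f}")
print(f"Share of all 5v5 points: {share_of_total:.1%}")
```

Swapping in another player’s totals and deployment share gives a quick upper bound on how much zone starts could be inflating his point totals.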

A corollary of this is that if zone starts don’t matter much, a player’s face off winning percentage probably doesn’t matter much either, which is consistent with other studies. It’s a nice skill to have, but not worth a lot.

Do not ignore Quality of Teammates

I have just told you to pretty much ignore quality of competition and zone starts, so what about quality of teammates? Well, to put it simply, do not ignore them. Quality of teammates matters and matters a lot. Sticking with the Vancouver Canucks, let’s use Alex Burrows as an example. Burrows mostly plays with the Sedin twins but has played on Kesler’s line a bit too. Over the past 3 seasons he has played about 77.9% of his ice time with H. Sedin, about 12.3% with Ryan Kesler, and the remainder with Malhotra and others. Burrows’ offensive production is significantly better when playing with H. Sedin, as 88.7% of his goals and 87.2% of his points came during the 77.9% of his ice time he played with H. Sedin. If Burrows played 100% of his ice time with H. Sedin and produced at the same rate he would have scored 6 (9.7%) more goals and 13 (11%) more 5v5 points over the past 3 seasons. This is far more significant than the 2.3% boost H. Sedin saw from all his offensive zone starts, and I am not certain my Burrows example is the most extreme example in the NHL. How many more points would an average 3rd liner get if he played mostly with H. Sedin instead of with other average 3rd liners? Who you play with matters a lot. You can’t look at Tyler Bozak’s decent point totals and conclude he is a decent player without considering that he plays a lot with Kessel and Lupul, two very good offensive players.

Opportunity is not talent

Kind of along the same lines as the Quality of Teammates discussion, we must be careful not to confuse opportunity-driven results with talent. Over the past 2 seasons Corey Perry has the second most goals of any forward in the NHL, trailing only Steven Stamkos. That might seem impressive but it is a little less so when you consider Perry also had the 4th most 5v5 minutes during that time and the 11th most 5v4 minutes. Perry is a good goal scorer but a lot of his goals come from opportunity (ice time) as much as individual talent. Among forwards with at least 1500 minutes of 5v5 ice time the past 2 seasons, Perry ranks just 30th in goals per 60 minutes of ice time. That’s still good, but far less impressive than second only to Steven Stamkos, and he is actually well behind teammate Bobby Ryan (6th) in this metric. Perry is a very good player but he benefits more than others from getting a lot of ice time and PP ice time. Perry’s goal production is in large part talent, but also somewhat opportunity driven, and we need to keep this in perspective.
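To make the opportunity adjustment concrete, here is a minimal sketch of a goals per 60 minutes calculation. The two players and their totals are hypothetical, chosen only to show how a ranking by raw goals can flip once ice time is accounted for; they are not Perry’s or anyone else’s actual numbers.

```python
def per_60(events: float, toi_minutes: float) -> float:
    """Convert a raw event count into a rate per 60 minutes of ice time."""
    if toi_minutes <= 0:
        raise ValueError("toi_minutes must be positive")
    return events / toi_minutes * 60.0

# Hypothetical players: name -> (5v5 goals, 5v5 minutes)
players = {
    "Player A": (40, 2400.0),  # more goals, but far more ice time
    "Player B": (30, 1500.0),  # fewer goals in much less ice time
}

by_raw_goals = sorted(players, key=lambda name: players[name][0], reverse=True)
by_goals_per_60 = sorted(players, key=lambda name: per_60(*players[name]), reverse=True)

print("Ranked by goals:", by_raw_goals)
print("Ranked by goals per 60:", by_goals_per_60)
```

Player A leads in raw goals, but Player B scores at the higher rate, which is the same kind of gap that separates Perry’s goal total from his goals per 60 ranking.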

Don’t ignore the percentages (shooting and save)

The percentages matter, particularly shooting percentages. I have shown that players can sustain elevated on-ice shooting percentages, I have shown that players can have an impact on their linemates’ shooting percentages, and Tom Awad has shown that a significant portion of the difference between good players and bad players is finishing ability (shooting percentage). There is even evidence that goal based metrics (which incorporate the percentages) are a better predictor of post season success than fenwick based metrics. What corsi/fenwick metrics have going for them is more reliability over small sample sizes, but once you approach a full season’s worth of data that benefit is largely gone and you get more benefit from having the percentages factored into the equation. If you want a better understanding of what considering the percentages can do for you, try a Malkin vs Gomez comparison or a Crosby vs Tyler Kennedy comparison over the past several years. Gomez and Kennedy actually look like relatively decent comparables if you just consider shot based metrics, but both are terrible percentage players while Malkin and Crosby are excellent percentage players, and it is the percentages that make Malkin and Crosby so special. These are extreme examples, but the percentages should not be ignored if you want a true representation of a player’s abilities.

More is definitely better

One of the reasons many people have jumped on the shot attempt/corsi/fenwick bandwagon is that shot attempts are more frequent events than goals and thus give you more reliable metrics. This is true over small sample sizes but, as explained above, the percentages matter too and should not be ignored. Luckily, for most players we have ample data to get past the sample size issues. There is no reason to evaluate a player based on half a season’s data if that player has been in the league for several years. Look at 2, 3, 4 years of data. Look for trends. Is the player consistently a high corsi player? Is the player consistently a high shooting percentage player? Is the player improving? Declining? I have shown on numerous occasions that goals are a better predictor of future goal rates than corsi/fenwick starting at about one year of data, but multiple years are definitely better. Any conclusion about a player’s talent level using a single season of data or less (regardless of whether it is corsi or goal based) is subject to a significant level of uncertainty. We have multiple years of data for the majority of players, so use it. I even aggregate multiple years into one data set for you on stats.hockeyanalysis.com, so it isn’t even time consuming. The data is there, use it. More is definitely better.

WOWY’s are where it is at

In my mind WOWY’s are the best tool for advanced player evaluation. WOWY stands for “with or without you” and looks at how a player’s teammates perform while on the ice with him and while on the ice without him. What WOWY’s can tell you is whether a particular player is a core player driving team success or a player along for the ride. Players that consistently make their teammates’ statistics better when they are on the ice with them are the players you want on your team. Anze Kopitar is an example of a player who consistently makes his teammates better. Jack Johnson is an example of a player that does not, particularly when looking at goal based metrics. Then there are a large number of players that are good players but neither drive your team’s success nor hold it back, or as I like to say, complementary players. Ideally you build your team around a core of players like Kopitar that will drive success, fill it in with a group of complementary players, and quickly rid yourself of players like Jack Johnson that act as drags on the team.
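As a rough illustration of the idea, the sketch below computes one WOWY comparison: a teammate’s goals for percentage while on the ice with a given player versus without him. The `Split` class, the `wowy_delta` function, and all of the goal counts are hypothetical and used only for illustration; a real analysis would repeat this across many teammates and several seasons.

```python
from dataclasses import dataclass


@dataclass
class Split:
    """On-ice goals for and against over some slice of ice time."""
    goals_for: int
    goals_against: int

    @property
    def gf_pct(self) -> float:
        total = self.goals_for + self.goals_against
        if total == 0:
            raise ValueError("no goals recorded in this split")
        return 100.0 * self.goals_for / total


def wowy_delta(teammate_with: Split, teammate_without: Split) -> float:
    """Change in a teammate's GF% when he plays with the player versus without him."""
    return teammate_with.gf_pct - teammate_without.gf_pct


# Hypothetical teammate: 30 GF / 20 GA with the player, 20 GF / 20 GA without him.
with_player = Split(goals_for=30, goals_against=20)
without_player = Split(goals_for=20, goals_against=20)

delta = wowy_delta(with_player, without_player)
print(f"Teammate GF% with: {with_player.gf_pct:.1f}, without: {without_player.gf_pct:.1f}")
print(f"WOWY delta: {delta:+.1f} percentage points")
```

A player whose teammates show a consistently positive delta would be a candidate core player in the sense used above; a consistently negative delta would point to a drag on the team.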

 

Feb 22, 2012
 

Looking at this chart, I think only Lightning fans can sympathize with the torture that Leaf fans have suffered through with regard to their goaltending, but at least the Lightning have made the playoffs a few times and even had some success.

Update: For interest’s sake, here are the post lockout shooting percentages and PDO (shooting percentage + save percentage).


 

 

Feb 05, 2012
 

One of my beefs in the analysis and evaluation of hockey players is the notion that PDO (on-ice shooting percentage plus on-ice save percentage) can be used as a proxy for luck.  A perfect example of how PDO is used as a proxy for luck is this article by Neil Greenberg about the Washington Capitals.

For example, when Alex Ovechkin has been on the ice during even strength this season, the team has a shooting percentage of 8.2 percent and has saved shots at a rate of .917. So that makes his PDO value 999 (.082+.917=.999), which is almost exactly the league average. In other words, Ovechkin has seen neither very good nor very bad “puck luck” this season.

What’s useful about this metric is that it’s “unstable,” and over a large-enough sample will regress to 1000. Why 1000? Because every shot that is a goal is a shot not saved, and vice versa.

My beef with such an analysis is the notion that PDO regresses to 1000 for all players, so that any player with a PDO above 1000 is lucky and any player with a PDO below 1000 is unlucky. While I do believe luck can influence PDO over small sample sizes, not all players have a natural PDO level of 1000 and there are two reasons why.

1.  Not all players play in front of perfectly average goalies which will have a major impact on the save percentage portion of PDO.

2. Players can drive shooting percentages.

To show you what I mean on point 2, I took 4 years (2007-08 to 2010-11) of 5v5 zone start adjusted data and grouped forwards based on their ice time over those 4 years and then calculated the on-ice shooting and save percentages and PDO for each group.  Here is what I found.

TOI (minutes)   SH%      SV%      PDO
<500            7.5%     90.9%    983.5
500-999         7.9%     91.2%    991.2
1000-1499       8.0%     91.2%    992.2
1500-1999       8.2%     91.2%    993.4
2000-2499       8.6%     91.1%    997.0
2500-2999       9.0%     91.2%    1001.9
3000-3499       9.3%     91.2%    1004.4
3500-3999       9.8%     90.8%    1006.1
4000+           10.4%    90.8%    1012.4

PDO varies from 983.5 up to 1012.4 depending on the group’s ice time. This is largely driven by shooting percentage, which varies from 7.5% to 10.4%, with the players with the least ice time having the lowest on-ice shooting percentage and the players with the most ice time having the highest. Order is the enemy of luck, so seeing shooting percentages ordered this nicely tells me something other than luck is happening. Driving on-ice shooting percentage is a skill. This means more talented players can have a natural PDO (the PDO that they should regress to) above 1000 and less talented players can have a natural PDO below 1000. Factor in the goaltending and a player could have a natural PDO well above or well below 1000.
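For readers who want to check the table, here is a small sketch that recomputes PDO from the rounded shooting and save percentages and tests whether shooting percentage rises with every ice time group. The numbers are copied from the table; each group is keyed by the lower bound of its ice time range, which is just a convenience chosen here. The recomputed values differ slightly from the reported ones because the percentages in the table are rounded to one decimal place.

```python
# (lower bound of TOI group in minutes, on-ice SH%, on-ice SV%, reported PDO)
rows = [
    (0,    7.5, 90.9,  983.5),
    (500,  7.9, 91.2,  991.2),
    (1000, 8.0, 91.2,  992.2),
    (1500, 8.2, 91.2,  993.4),
    (2000, 8.6, 91.1,  997.0),
    (2500, 9.0, 91.2, 1001.9),
    (3000, 9.3, 91.2, 1004.4),
    (3500, 9.8, 90.8, 1006.1),
    (4000, 10.4, 90.8, 1012.4),
]


def pdo(sh_pct: float, sv_pct: float) -> float:
    """PDO on the usual 1000 scale: shooting % plus save %, times 10."""
    return (sh_pct + sv_pct) * 10.0


recomputed = [pdo(sh, sv) for _, sh, sv, _ in rows]
max_gap = max(abs(calc - row[3]) for calc, row in zip(recomputed, rows))

sh_values = [sh for _, sh, _, _ in rows]
sh_rises_with_toi = all(a < b for a, b in zip(sh_values, sh_values[1:]))

print(f"Largest gap between recomputed and reported PDO: {max_gap:.1f}")
print(f"Shooting % strictly increases with ice time: {sh_rises_with_toi}")
```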

Now, this is not to say that luck isn’t a factor in a player’s PDO, especially over small sample sizes; it’s just that we can’t estimate that luck by assuming every player’s natural “regress to” PDO is 1000. Daniel Sedin has a PDO of 1043 this season (through Thursday February 2nd). Is it fair to suggest he has been lucky and should see his PDO regress to 1000? When you consider his 4-year PDO is 1035 (and his 3-year PDO is 1054), probably not. His natural “regress to” PDO is probably not that far off his current 1043 PDO. Now if you are talking about Todd Bertuzzi this season it’s a different story. Through Thursday he had a PDO of 1056 while his 4-year PDO is 994 and he hasn’t had a PDO above 1000 in any of the previous 3 seasons. It is probably fair to presume that Bertuzzi’s natural “regress to” PDO is much closer to 1000, maybe even below 1000, in which case it is fair to conclude that Bertuzzi has probably been quite lucky so far this season and is unlikely to continue at this pace for the remainder of the season.
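The Sedin and Bertuzzi comparison can be sketched in a few lines. The PDO values are the ones quoted above; treating the 4-year PDO as a stand-in for a player’s natural “regress to” level is a simplifying assumption made for this illustration, not a formal estimate.

```python
# name -> (PDO this season, 4-year PDO), as quoted in the post
players = {
    "Daniel Sedin": (1043, 1035),
    "Todd Bertuzzi": (1056, 994),
}

LEAGUE_AVERAGE_PDO = 1000

for name, (current, baseline) in players.items():
    # Naive view: everything above or below 1000 is luck.
    gap_vs_league = current - LEAGUE_AVERAGE_PDO
    # Contextual view (assumption): the multi-year PDO is the player's natural level.
    gap_vs_own_baseline = current - baseline
    print(f"{name}: {gap_vs_league:+d} vs 1000, {gap_vs_own_baseline:+d} vs own 4-year PDO")

sedin_gap = players["Daniel Sedin"][0] - players["Daniel Sedin"][1]
bertuzzi_gap = players["Todd Bertuzzi"][0] - players["Todd Bertuzzi"][1]
```

Measured against 1000 the two players look similarly “lucky” (+43 and +56), but measured against their own history Sedin is only 8 points above his norm while Bertuzzi is 62 points above his.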

When used properly PDO can be an indication of luck, but to do so we need to consider the context of a player’s PDO, not just assume every player’s PDO will necessarily regress to 1000.