Sep 21 2015

This came up in a Twitter conversation today, and since I will be referencing both of these as part of my RIT Hockey Analytics Conference talk it might be a good idea to re-introduce them to anyone who is not familiar with them.

At their core, Rel and RelTM stats both attempt to account for the quality of teammates a player plays with. The problem with Corsi%, as Paul Bissonnette points out, is that it is heavily driven by the players one plays with.

Ultimately, teams don’t draft based on Corsi or possession numbers. You draft a player because you’ve watched how they perform on the ice and then consider their potential to improve.

But somehow they’re starting to use this bullshit in contract negotiations. You have teams saying, “Oh, wow, look at this player in Chicago who had a 60 percent possession number.” Well, yeah, because he’s an average player playing with Toews and Kane. So all of the sudden a team signs him for $3 million a year even though he’s a $1.5 million a year player, and they’re shocked when his possession isn’t as good. Are you kidding me?

This is a flaw I think everyone agrees with, which is why Rel and RelTM were developed. A player playing with all-star players will naturally put up (or at least have an opportunity to put up) far better numbers than players playing with mediocre 3rd and 4th line players. A player on the Blackhawks will have an opportunity to put up far better numbers than a player on the Sabres. The Blackhawks had a team Corsi% of 53.6% last year compared to the Sabres’ 37.5%. Rel and RelTM stats attempt to factor out the team aspect and isolate the individual contribution.

What is Rel?

Rel (or Relative) stats are calculated by taking the player’s “on-ice” stats (what the team does when the player is on the ice) and comparing them to the player’s “off-ice” stats (what the team does when he is not on the ice). So, CF% Rel is nothing more than on-ice CF% minus off-ice CF%. The resulting metric is an attempt to look at how the player is performing relative to the rest of the team.
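As a minimal sketch of that calculation, assuming we have on-ice and team totals of shot attempts for and against (the function names and the numbers are made up for illustration, not taken from any stats site):

```python
def corsi_pct(cf, ca):
    """Corsi-for percentage: shot attempts for as a share of all attempts."""
    return 100.0 * cf / (cf + ca)


def cf_rel(on_cf, on_ca, team_cf, team_ca):
    """On-ice CF% minus off-ice CF%.

    Off-ice totals are assumed to be team totals minus the player's
    on-ice totals (games the player missed are ignored in this sketch).
    """
    off_cf = team_cf - on_cf
    off_ca = team_ca - on_ca
    return corsi_pct(on_cf, on_ca) - corsi_pct(off_cf, off_ca)


# Hypothetical player: 55% on-ice CF% on a team that is 50% overall
print(round(cf_rel(on_cf=550, on_ca=450, team_cf=2000, team_ca=2000), 2))
```

A positive value means the team does better in shot attempts with the player on the ice than without him.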

The flaw with Rel stats is that they still don’t account for the player who plays with Jonathan Toews and Marian Hossa. The third guy on that line might not be a very good player but will look good because Toews and Hossa drive the play and boost his stats. Rel stats also fail when it comes to teams that are well balanced, with no superstars but 3 very good lines, whereas a player on the first line of a very top-heavy team will probably look better than he really should.

I’d say that Rel stats do a decent job of factoring out team-level, system-driven results but do not do a great job of isolating the player’s performance from that of his linemates.

What is RelTM?

RelTM (Relative to Team Mates) attempts to account for the 3rd guy on the Toews/Hossa line getting unfairly credited for playing with a pair of superstars. It does this by looking at how each of the player’s teammates performs with him and apart from him, and to what extent. So while Rel is a team-level on-ice minus off-ice stat, RelTM is a player-level with minus apart stat. Some people have referred to this as a “Combined WOWY” (combined with or without you) stat. It can be a little difficult to wrap your head around, but it looks at Toews’ (and Hossa’s and every other teammate’s) performance apart from the 3rd guy on the Toews/Hossa line and Toews’ performance with him, and attempts to determine whether Toews performed better or worse with him than apart from him. This gets done for every player, and in essence RelTM tries to figure out how much, on average, he boosts his teammates’ statistics. The more teammates that have better statistics with him than apart from him, the better.
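To make the “with minus apart” idea concrete, here is a rough sketch. The weighting is an assumption on my part for illustration: each teammate’s difference is weighted by the ice time he shared with the player. The actual weighting used on any given stats site may differ, and the data layout below is hypothetical.

```python
def cf_pct(cf, ca):
    """Corsi-for percentage from shot attempts for and against."""
    return 100.0 * cf / (cf + ca)


def cf_rel_tm(teammates):
    """Weighted average of (teammate CF% with player) - (teammate CF% apart).

    teammates: list of dicts with keys
      'with'     -> (cf, ca) while the teammate is on the ice with the player
      'apart'    -> (cf, ca) while the teammate is on the ice without him
      'toi_with' -> shared ice time, used here as the weight (an assumption)
    """
    total_weight = sum(t["toi_with"] for t in teammates)
    weighted_sum = sum(
        t["toi_with"] * (cf_pct(*t["with"]) - cf_pct(*t["apart"]))
        for t in teammates
    )
    return weighted_sum / total_weight


# Made-up example: one teammate is 5 points better with the player,
# another is 3 points worse but shared far less ice time with him.
example = [
    {"with": (60, 40), "apart": (110, 90), "toi_with": 300},
    {"with": (45, 55), "apart": (96, 104), "toi_with": 100},
]
print(cf_rel_tm(example))
```

Weighting by shared ice time keeps a teammate who was on the ice with the player for only a few shifts from swinging the result.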

While on some level this sounds like a better statistic than Rel, since it looks directly at how much better the guys he actually plays with are with him than apart from him, it isn’t perfect either. The most significant reason is probably that if you are the 3rd guy on a Toews/Hossa line it will be very difficult to boost the stats of Toews and Hossa because they are so good. Another flaw is that often you end up just being compared to the next most important player at your position. When the regular first line RW isn’t playing with the regular first line C and LW, it will most likely be the second line RW that is playing with them. So the RelTM stats to some extent tell you how much better or worse you are than the next best or worst player at your position on your team.


Which one of Rel and RelTM is better? I am a bit biased, but I personally like how RelTM stats are calculated, as I think looking at the direct impact on the player’s teammates is a better route to go. With that said, there is some evidence that Rel is a little more persistent from year to year, which may make it more useful. I’ll have more to say about this at the RIT Hockey Analytics Conference, but mostly I’ll show that both Rel and RelTM stats are almost certainly vastly superior to their raw stat counterpart (i.e. CF% Rel and CF% RelTM are vastly superior to CF%). In different ways both attempt to factor out quality of team and quality of teammates, and both are relatively successful at it. Rel and RelTM stats are generally fairly highly correlated as well, so there likely isn’t a huge difference between them. Furthermore, the more a team juggles its lineup, the better both of these stats, particularly the RelTM stat, will isolate individual contribution and talent.

(if there is any confusion on these stats, feel free to ask questions in the comments. I’ll be happy to clarify and/or provide more details.)

You can find RelTM stats on my stats sites, and you can find Rel stats on many other stats sites (I plan on adding them to mine at some point as well).

Finally, if you have a chance to make it to Rochester, NY on October 10th consider signing up for and attending the RIT Hockey Analytics Conference. It looks like one of the best line ups of speakers these conferences have ever had with a wide variety of topics.


The Value of Outliers

Jan 25 2015

Ryan Stimson has been doing some valuable work tracking passes, and this morning he posted an analysis of the data he (and others) have collected thus far. It is a very interesting article and definitely worth a read. It is a valuable contribution to shot quality research, but the article created some Twitter discussion regarding one of the techniques that Ryan used. In particular, when Stimson was looking at the correlation between two variables (e.g. passing ability vs shooting percentage) he noticed that there was often an outlier team, and he would subsequently look at the correlation between the two variables with the outlier team removed. This technique of removing outliers generated a bit of a backlash on Twitter from @garik16, as it did when I used this technique not long ago.

While I think that removing outliers has to be done with great caution and consideration, it is also important to acknowledge that outlier analysis can be an incredibly valuable tool in understanding what is going on. Teams aren’t built randomly and talent isn’t evenly distributed across the league. Talent differences across teams may result in different statistical patterns across teams. Different organizations have different philosophies on players and playing styles, and this too may impact statistical patterns. As I have said before, we know that teams can manipulate statistical patterns by changing their playing style based on the score of the game (score effects are a well researched and fully accepted concept in hockey analytics), so it isn’t difficult to envision that various other statistical patterns could be altered by organizational or coaching philosophies. As statistical analysts we have to be open to this and not just apply a statistical model, crank out the results, and settle on hard and fast conclusions. We need to spend the time to understand the underlying data too.

I have spent a significant portion of my career working on air pollution research with some world-renowned scientists. Many years ago, not long after I finished university and was just embarking on my career, I was conducting some research on the relationship between weather patterns and air pollution. While I was doing this research, a research scientist that I highly respect told me that often the most interesting things can be learned when we study outliers. In this area of research, typical weather patterns resulted in typical pollution levels, but the study of outliers (atypical weather patterns) can really highlight the intricate relationship between weather patterns and air pollution.

Hockey isn’t baseball where there are a series of one-on-one battles that can be relatively easily incorporated into a statistical model because the only real factors involved are the talent levels of each player in the one-on-one battle. Unfortunately this isn’t how hockey works. Hockey is more like weather patterns where everything is interdependent on everything else and thus is very difficult to model. Sure, there are prevailing weather norms but occasionally outlier events happen like hurricanes or blizzards. It is these outliers that are the most interesting and most researched weather phenomena. Compared to a hurricane or a blizzard nobody really cares much about another 80F sunny day in Miami or a -5C January day in Ottawa. It’s just another day.

So, when I see someone suggest that you shouldn’t investigate how outliers affect underlying trends I get a bit defensive. If all you care about is what normally happens you’ll never truly understand the most interesting stuff. No NHL team strives to be ordinary, they strive to be elite and being elite, by definition, means being an outlier. If you want to be an outlier, you ought to do everything you can to understand what makes an outlier an outlier.

In one of Stimson’s charts he identified Chicago as the outlier team. Interestingly, I identified Chicago as an outlier team in my study on the relationship between Corsi and shooting percentage because they are one of the few teams that can post a good Corsi and an elevated shooting percentage. Furthermore, when it comes to elite NHL teams, Chicago would be front and center in the discussion. Is this a coincidence? Maybe. Or maybe it isn’t. It could be luck, it could be skill, or it could be organizational philosophy and/or coaching tactics but understanding why outliers exist is of critical importance. (Note: This is where I see the convergence of hockey analytics with traditional ‘hockey people’ like coaches and scouts. Analytics can identify trends and outliers to those trends and coaches and scouts can help assess the reason why those trends and outliers occur.)

Ultimately, any NHL franchise that strives to be an elite team (which they all should) is striving to be an outlier. Without understanding what makes an outlier, how can you expect to be one? And you’ll only understand what makes an outlier by studying outliers independently from the underlying typical trend. This needs to be done with caution and care so as not to just reinforce preconceived beliefs, but by not doing outlier analysis you are not fully understanding what is happening.


Dec 08 2014

I have tackled the subject of on-ice shooting percentage a number of times here, but I think it is a subject that has been under-researched in hockey analytics. Historically people have done some split-half comparisons, found weak correlations, and concluded that it is not a significant or useful factor in hockey analytics. While some of the research has merit, a lot of it deals with too small a sample size to get any really useful correlations. Split-half season correlations across the majority of players include players who might have 3 goals in the first half and 7 in the second half, and that is just not enough to draw any conclusions from. Even year-over-year correlations have their issues: in addition to smallish sample sizes, they suffer from problems related to roster changes and how roster changes impact on-ice shooting percentages. Ideally we’d want to eliminate all these factors and get down to actual on-ice shooting percentage talent, factoring out both luck/randomness and roster changes.

Today @MimicoHero posted an article discussing shooting percentage (and save percentage) by looking at multi-year vs multi-year comparisons. It’s a good article, so have a read; I have written many articles like this in the past. This is important research but, as I alluded to above, year-over-year comparisons suffer from issues related to roster change which potentially limit what we can actually learn from the data. People often look at even/odd games to eliminate these roster issues, and that is a pretty good methodology. Once in the past I took this idea to the extreme and used even/odd seconds in order to attempt to isolate true talent from other factors (note that subsequent to that article I found a bug in my code that may have impacted the results, so I don’t have 100% confidence in them). This pretty much assures that the teammates a player plays with, the opponents they play against and the situations they play in will be almost identical in both halves of the data. I hope to revisit the even/odd second work in a future post to confirm and extend that research, but for this post I am going to take another approach. I am going to focus solely on shooting percentage and use an even/odd shot methodology, which should do a pretty good job of removing roster change effects as well.

I took all 5v5 shot data from 2007-08 through 2013-14, and for each forward I took the first 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800 and 2000 shots for that they were on the ice for. This allowed me to do 100 vs 100 shot, 200 vs 200 shot, … 1000 vs 1000 shot comparisons. For comparison’s sake, in addition to even/odd shots I am also going to look at first half vs second half comparisons to get an idea of how different the correlations are (i.e. what the impact of roster changes is on a player’s on-ice shooting percentage). Here are the resulting correlation coefficients.

Scenario     Split-Half   Even vs Odd   N Players
100v100      0.186        0.159         723
200v200      0.229        0.268         590
300v300      0.296        0.330         502
400v400      0.368        0.375         443
500v500      0.379        0.440         399
600v600      0.431        0.481         350
700v700      0.421        0.463         319
800v800      0.451        0.486         285
900v900      0.440        0.454         261
1000v1000    0.415        0.498         222

And here is the table in graphical form.


Let’s start with the good news. As expected even vs odd correlations are better than first half vs second half correlations though it really isn’t as significant of a difference as I might have expected. This is especially true with the larger sample sizes where the spread should theoretically get larger.
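For anyone wanting to reproduce this kind of comparison, the two splits can be sketched roughly as below. This is not my actual code. It assumes each player’s on-ice shots are available as a chronological list of 1s (goal) and 0s (no goal), and the function names are made up for illustration.

```python
from math import sqrt


def sh_pct(shots):
    """Shooting percentage (as a fraction) for a list of 1 = goal, 0 = no goal."""
    return sum(shots) / len(shots)


def split_pairs(shots):
    """Return ((even Sh%, odd Sh%), (first-half Sh%, second-half Sh%)).

    shots must be in chronological order for the first/second half split
    to mean anything; the even/odd split interleaves the two samples so
    teammates and opponents are nearly identical in both.
    """
    half = len(shots) // 2
    even_odd = (sh_pct(shots[0::2]), sh_pct(shots[1::2]))
    halves = (sh_pct(shots[:half]), sh_pct(shots[half:]))
    return even_odd, halves


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

To get one cell of the table, one would call `split_pairs` on each qualifying player’s first 2N shots, collect the even and odd (or first and second half) values across players, and pass the two lists to `pearson`.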

What I did find a bit troubling is that correlations seem to max out at 600 shots vs 600 shots, and even those correlations aren’t all that great (0.45-0.50). In theory, as sample size increases one should get better and better correlations, approaching 1.00 as the sample size approaches infinity. Instead, they seem to approach 0.5, which had me questioning my data.

After some thought though I realized the problem was likely due to the decreasing number of players within the larger shot total groups. What this does is it restricts the spread in talent as only the top level players remain in those larger groups. As you increase the shot requirements you start weeding out the lesser players that are on the ice for less ice time and fewer shots. So, while randomness decreases with increased number of shots so does the spread in talent. My theory is the signal (talent) to noise (randomness) ratio is not actually improving enough to see improving results.

To test this theory I looked at the standard deviations within each even/odd group. Since we also have a definitive N value for each group (100, 200, 300, etc.) and I can calculate the average shooting percentage it is possible to estimate the standard deviation due to randomness. With the overall standard deviation and an estimated standard deviation of randomness it is possible to calculate the standard deviation in on-ice shooting percentage talent. Here are the results of that math.

Scenario     SD(EvenSh%)   SD(OddSh%)   SD(Randomness)   SD(Talent)
100v100      2.98%         2.84%        2.67%            1.15%
200v200      2.22%         2.08%        1.91%            1.00%
300v300      1.99%         1.87%        1.56%            1.14%
400v400      1.71%         1.70%        1.35%            1.04%
500v500      1.56%         1.57%        1.21%            1.00%
600v600      1.50%         1.50%        1.11%            1.01%
700v700      1.35%         1.39%        1.03%            0.90%
800v800      1.35%         1.33%        0.96%            0.93%
900v900      1.24%         1.26%        0.91%            0.86%
1000v1000    1.14%         1.23%        0.86%            0.81%
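The math behind a table like the one above can be sketched as follows. I am assuming here that randomness is modelled as binomial noise around the average shooting percentage and that talent and randomness are independent, so their variances add. The function names are just for illustration.

```python
from math import sqrt


def sd_randomness(mean_sh_pct, n_shots):
    """Binomial standard deviation of a shooting percentage over n_shots.

    mean_sh_pct is a fraction (0.08 for 8%).
    """
    return sqrt(mean_sh_pct * (1.0 - mean_sh_pct) / n_shots)


def sd_talent(sd_observed, sd_random):
    """Talent spread implied by observed and random spread.

    Assumes observed variance = talent variance + random variance.
    Clamped at zero in case sampling error makes the difference negative.
    """
    return sqrt(max(sd_observed ** 2 - sd_random ** 2, 0.0))


# 600v600 row: observed SD of 1.50% and random SD of 1.11%
print(round(100 * sd_talent(0.0150, 0.0111), 2))
```

With an average on-ice shooting percentage of roughly 8% (an assumed round number), `sd_randomness(0.08, 600)` comes out close to the 1.11% shown in the 600v600 row.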

And again, the chart in graphical format.


The grey line is the randomness standard deviation and it flows as expected, decreasing in a nice manner. This is a significant driver of the even and odd standard deviations, but the talent standard deviation slowly falls off as well. If we call SD(Talent) the signal and SD(Randomness) the noise, then we can plot a signal to noise ratio calculated as SD(Talent) / SD(Randomness).


What is interesting is that the signal to noise ratio improves significantly up to 600v600, then it sort of levels off. This is pretty much in line with what we saw earlier in the first table and chart. After 600v600 we start dropping out the majority of the fourth liners who don’t get enough ice time to be on the ice for 1400+ shots at 5v5. Later we start dropping out the 3rd liners too. The result is that the signal to noise ratio flattens out.
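The ratio can be recomputed directly from the SD(Talent) and SD(Randomness) columns of the standard deviation table above. A small sketch (the dictionary layout is just for illustration):

```python
# shots per half -> (SD(Talent), SD(Randomness)), in percentage points,
# copied from the standard deviation table
sd_table = {
    100: (1.15, 2.67),
    200: (1.00, 1.91),
    300: (1.14, 1.56),
    400: (1.04, 1.35),
    500: (1.00, 1.21),
    600: (1.01, 1.11),
    700: (0.90, 1.03),
    800: (0.93, 0.96),
    900: (0.86, 0.91),
    1000: (0.81, 0.86),
}


def signal_to_noise(table):
    """SD(Talent) / SD(Randomness) for each sample size."""
    return {n: talent / noise for n, (talent, noise) in table.items()}


ratios = signal_to_noise(sd_table)
for n, ratio in ratios.items():
    print(f"{n}v{n}: {ratio:.2f}")
```

The ratio climbs from about 0.43 at 100v100 to about 0.91 at 600v600 and then stays between roughly 0.87 and 0.97 for the larger groups.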

With that said, there is probably enough information in the above charts to determine what a reasonable spread in on-ice shooting percentage talent actually is. Specifically, the yellow SD(Talent) line does give us a pretty good indication of what the spread in on-ice shooting percentage talent really is. Based on this analysis a reasonable estimate for one standard deviation in shooting percentage talent in a typical NHL season is probably around 1.0% or maybe slightly above.

What does that mean in real terms (i.e. goal production)? Well, the average NHL forward is on the ice for ~400 5v5 shots per season. Thus, a player with an average amount of ice time that shoots one standard deviation (I’ll use 1.0% as standard deviation to be conservative) above average would be on the ice for 4 extra goals due solely to their on-ice shooting percentage. Conversely an average ice time player with an on-ice shooting percentage one standard deviation below average would be on the ice for about 4 fewer goals.

Now of course if you are an elite player getting big minutes the benefit is far greater. Let’s take Sidney Crosby for example. Over the past 7 seasons his on-ice shooting percentage is about 3.33 standard deviations above average and last year he was on the ice for just over 700 shots. That equates to an extra 23 goals due to his extremely good on-ice shooting percentage. That’s pretty impressive if you think about it.

Now compare that to Scott Gomez, whose 7-year shooting percentage is about 1.6 standard deviations below average. In 2010-11 he was on the ice for 667 shots for. That year his lagging shooting percentage talent cost his team an estimated 10.6 goals. Imagine, Crosby vs Gomez is a 33+ goal swing in just 5v5 offensive output.
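The arithmetic in the last few paragraphs is simply standard deviations above average, times the talent standard deviation, times shots. As a sketch (the function name is made up, and 1.0% is the conservative talent standard deviation used above):

```python
def extra_goals(sd_above_average, shots_on_ice, talent_sd=0.01):
    """Goals gained (or lost, if negative) from on-ice shooting percentage.

    sd_above_average: how many standard deviations above average the
                      player's on-ice shooting percentage is
    shots_on_ice:     5v5 shots for while the player is on the ice
    talent_sd:        one standard deviation of talent, as a fraction
    """
    return sd_above_average * talent_sd * shots_on_ice


average_forward = extra_goals(1.0, 400)   # one SD above average, typical ice time
crosby = extra_goals(3.33, 700)
gomez = extra_goals(-1.6, 667)

print(round(average_forward, 1), round(crosby, 1), round(gomez, 1))
print(round(crosby - gomez, 1))           # size of the Crosby vs Gomez swing
```

Small differences from the figures quoted above would just reflect rounding in the standard deviation estimates.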

(Yes, I am taking some liberties in those last few paragraphs with assumptions relating to luck/randomness, quality of teammates and whatnot, so not all of the good or bad can necessarily be attributed to a single player, or to the extent described, but I think it drives home the point: a single player can have a significant impact through on-ice shooting percentage talent alone.)

In conclusion, even after you factor out luck and randomness, on-ice shooting percentage can play a significant role in goal production at the player level and, as I have been saying for years, must be taken into consideration in player evaluation. If you aren’t considering that a particular player might be particularly good or particularly bad at driving on-ice shooting percentage you may not be getting the full story.

(In a related post, there was an interesting article on Hockey Prospectus yesterday looking at how passing affects shooting percentage which supports some earlier findings that showed that good passers are often good at boosting teammates on-ice shooting percentage. Of course I have also shown that shots on the rush also result in higher shooting percentage so to the extent that players are good at generating rush shots they should be good at boosting their on-ice shooting percentages).


Sep 24 2014

Today apparently there was some discussion about the Avalanche and their non-interest in hockey analytics. In that discussion Corey Pronman wrote the following tweet:


I have seen the above logic from time to time. I think it dates back to something Gabe Desjardins wrote many years ago. I find the logic very odd though. Let me explain.

Let’s assume that the numbers are true. According to my math, that leaves 25% unaccounted for. I don’t really consider 25% insignificant but it is actually more significant than that.

Luck, or as I prefer to call it, randomness, is a component that is outside the control of a general manager, a coach, a player or anyone else that could potentially influence the outcome of the game. Thus it is pointless to bring luck into the equation. All that the management and players of an NHL team really need to worry about is what they can control. That is the non-luck fraction of winning, or the other 60%.

Now, if Corsi is 35% of winning overall then it accounts for 58% of the controllable aspect of winning. That leaves 42% of what is controllable unaccounted for. If I were an owner of an NHL team, or an owner of a business of any kind, and my general manager told me that we are going to largely ignore 42% of the controllable factors that lead to positive outcomes I’d be firing that general manager on the spot. It simply isn’t acceptable business practice to ignore 42% of what is within your control that produces good outcomes.
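The 58% and 42% figures come from rescaling by the non-luck fraction. A quick sketch, taking the 35% Corsi share from above and a 40% luck share (implied by “the other 60%”):

```python
def share_of_controllable(factor_share, luck_share):
    """Rescale a factor's share of winning to the part that is not luck."""
    return factor_share / (1.0 - luck_share)


corsi = share_of_controllable(0.35, 0.40)
print(f"Corsi: {corsi:.0%} of what is controllable")
print(f"Everything else: {1.0 - corsi:.0%}")
```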

Here is the real kicker though. The estimate that Corsi explains 35% of wins is based on historical data (and probably from several years ago). It does not necessarily mean it will be that way in the future. As teams become more aware of Corsi and possession it is certainly conceivable that the disparity across teams in Corsi shrinks, and thus the importance of Corsi as a differentiator among teams and as a predictor of winning shrinks. If teams switch focus to Corsi, those other factors might be the great differentiator of team talent and be the better predictor of success. It is easy to hop on the Corsi bandwagon now. The forward-thinking teams and forward-thinking hockey analytics researchers are those researching that other 42% to some significant degree.

Now, if you are a hockey analytics researcher raise your hand if you have spent ~60% of your research and analysis time on Corsi related issues and ~40% of your research time on non-Corsi related issues. If you are honest I suspect very few of you have raised your hand. The honest truth is those other factors have been unfairly downplayed and in my opinion that is very unfortunate.


Jul 23 2014

Tyler Dellow has an interesting post on differences between the Kings and Leafs offensive production. He comes at the problem from a slightly different angle than I have explored in my rush shot series so definitely go give it a read. These two paragraphs discuss a theory of Dellow’s that is interesting.

That’s the sort of thing that can affect a team’s shooting percentage. To take it to an extreme, teams shot 6.2% in the ten seconds after an OZ faceoff win this year; the league average shooting percentage at 5v5 is more like 8%. Of course, when you win an offensive zone draw, you start with the puck but the other team has five guys back and in front of you.

I wonder whether there isn’t something like that going on here that explains LA’s persistent struggles with shooting percentage (as well as those of New Jersey, another team that piles up Corsi but can’t score – solving this problem is one of the burning questions in hockey analytics at the moment). It’s a theory, but one that seems to fit with what Eric’s suggested about how LA generates the bulk of their extra shots. It’s hard for me to explain the Leafs scoring so many more goals in the first 11 seconds after a puck has been carried in, particularly given that I suspect that LA, by virtue of their possession edge, probably enjoyed many more carries into the offensive zone overall.

Earlier today I posted some team rush statistics for the past 7 and past 3 seasons. Let’s look in a little more detail how the Leafs, Kings and Devils performed over the past 3 seasons.

Team          RushGF   RushSF   OtherGF   OtherSF   RushSh%   OtherSh%   Rush%
New Jersey    45       540      103       1675      8.33%     6.15%      24.4%
Toronto       66       523      128       1675      12.62%    7.64%      23.8%
Los Angeles   53       609      112       1978      8.70%     5.66%      23.5%

The Leafs scored the most goals on the rush despite the fewest rush shots due to a vastly better shooting percentage (nearly 50% better than the Devils and Kings) on the rush. They do not generate more shots on the rush, but do seem to generate higher quality shots.

The Kings generate by far the most shots in non-rush situations but have the poorest shooting percentage and thus do not score a ton of goals. The Devils don’t generate many non-rush shots and don’t have a great non-rush shooting percentage either, and thus posted the fewest goals. The Leafs have had the same number of non-rush shots as the Devils but a significantly higher shooting percentage, and thus scored significantly more non-rush goals.

The Leafs scored 34% of their goals on the rush compared to 32% for the Kings and 30% for the Devils.
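All of the percentages above follow from the four raw counts in the table. A small sketch that recomputes them (the dictionary layout and function name are just for illustration):

```python
# Raw 5v5 counts from the table above (past 3 seasons)
teams = {
    "New Jersey":  {"rush_gf": 45, "rush_sf": 540, "other_gf": 103, "other_sf": 1675},
    "Toronto":     {"rush_gf": 66, "rush_sf": 523, "other_gf": 128, "other_sf": 1675},
    "Los Angeles": {"rush_gf": 53, "rush_sf": 609, "other_gf": 112, "other_sf": 1978},
}


def rush_summary(t):
    """Derived percentages for one team's rush and non-rush offense."""
    return {
        "rush_sh_pct": 100.0 * t["rush_gf"] / t["rush_sf"],
        "other_sh_pct": 100.0 * t["other_gf"] / t["other_sf"],
        # share of all shots that came off the rush
        "rush_shot_share": 100.0 * t["rush_sf"] / (t["rush_sf"] + t["other_sf"]),
        # share of all goals that came off the rush
        "rush_goal_share": 100.0 * t["rush_gf"] / (t["rush_gf"] + t["other_gf"]),
    }


for name, counts in teams.items():
    s = rush_summary(counts)
    print(name, {k: round(v, 2) for k, v in s.items()})
```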

Are the Leafs a good rush team? Well, only Boston has scored more 5v5 road rush goals than the Leafs so probably yes but it is mostly because of finishing talent, not shot generating talent. They are 4th last in 5v5 road rush shots.

The Ducks have a very similar offense to the Leafs. They don’t get many rush shots but post a really high rush shooting percentage. Anaheim generates a few more non-rush shots than the Leafs, but the two offenses are very similar.

The Kings are a slightly better rush team than the Devils, but neither is good and both are weak shooting percentage teams regardless of whether it is a rush or non-rush shot. The Kings make up for this, though, by generating a lot of shots from offensive zone play, whereas the Devils don’t.


Columbus Blue Jackets and Rush Shots

Jul 15 2014

Before I get into rush shots of individual players I am going to look at some teams. I am starting with the Columbus Blue Jackets, a team suggested to me by Jeff Townsend, who was interested to see what impact the decline of Steve Mason and then the transition to Bobrovsky had. Before we get to that though, let’s first look at the offensive side of things (and if you haven’t read my introductory pieces on rush shots read them here, here and here).


The League data is league average over the past 7 seasons.

There is a lot of variability here, particularly in the rush shot shooting percentages. This could be due to randomness, as the sample size for single season 5v5 road data is getting pretty small, particularly for rush shot data. Having looked at a number of these charts I think sample size is definitely going to be an issue. The key will be looking for trends above and beyond the variability.

Now for save percentages.


This chart is definitely a little more stable. Steve Mason’s excellent rookie season was 2008-09, when he actually had a below average non-rush 5v5 road save percentage but an above average rush save percentage. Columbus never again posted a rush save percentage anywhere close to league average until this past season. Interestingly, despite Bobrovsky’s good season in 2012-13 his 5v5 road save percentage that year was somewhat average (at home it was outstanding though, which just goes to show you how variable these things can be).

Let’s take a look at the percentage of shots that were rush shots for and against.


Not really sure what to read into that, but I thought I’d toss it out there for you.

Something that I haven’t looked at before is PDO which is the sum of shooting and save percentages. There is no reason we can’t do this for rush and non-rush shots so here is what it looks like for Columbus.


Again, I am not sure what we can read into this PDO table. PDO is kind of an odd stat in my opinion. PDO typically gets used as a “luck” metric which it can be if it deviates from 100.0% significantly which is certainly the case for a couple of seasons of Rush Shot PDO.
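For reference, PDO for any subset of shots (rush or non-rush) is just shooting percentage plus save percentage for that subset. A minimal sketch, with made-up counts for illustration:

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """Shooting percentage plus save percentage, scaled so 100.0 is neutral."""
    sh_pct = goals_for / shots_for
    sv_pct = 1.0 - goals_against / shots_against
    return 100.0 * (sh_pct + sv_pct)


# Hypothetical single-season 5v5 road rush numbers for one team
print(round(pdo(goals_for=12, shots_for=100, goals_against=9, shots_against=110), 1))
```

With only a hundred or so rush shots each way in a season of road games, a handful of goals moves this number by several points, which is why single-season rush PDO can stray so far from 100.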

I am still trying to figure out how useful any of this rush/non-rush information is. Certainly I think we hit some serious sample size issues when looking at a single season’s worth of road-only data, and I think that puts some of the usefulness in question. I have done some year-over-year correlations and truthfully they aren’t very good. I think that is largely sample size related, but I still think playing style and roster turnover can have significant impacts too. All that said, there is a clear difference between the difficulty of rush and non-rush shots, and teams that can maximize the number of rush shots they take and minimize the number of rush shots against will be better off.


Jul 10 2014

Yesterday I introduced the concept of rush shots which are basically any shot we can identify as being a shot taken subsequent to a rush up the ice which can be determined by the location of previous face off, shot, hit, giveaway or takeaway events. If you haven’t read the post from yesterday go give it a read for a more formal definition of what a rush shot is. Today I am going to take a look at how rush shots vary when teams are leading vs trailing as well as investigate home/road differences as arena biases in hits, giveaways and takeaways might have a significant impact on the results.

Leading vs Trailing

One hypothesis I had is that a team defending a lead tends to play more frequently in their own zone and thus have the potential to generate a higher percentage of shots from the rush. Here is a table of leading vs trailing rush shot statistics.

Game Situation     Rush Sh%   Other Sh%   Overall Sh%   % Shots on Rush
Leading            10.43%     8.03%       8.62%         24.3%
Trailing           9.36%      7.15%       7.63%         22.0%
Leading-Trailing   1.07%      0.89%       0.98%         2.28%

As expected, teams get a boost in the percentage of overall shots that are rush shots when leading (24.3%) compared to when trailing (22.0%), a difference of 2.28 percentage points. This higher percentage of shots being rush shots would factor into the higher shooting percentages, but it actually doesn’t seem to be all that significant. The more significant impact still seems to be that teams with the lead experience boosts in shooting percentage on both rush and non-rush shots. The hypothesis that teams have a higher shooting percentage when leading because they have more shots on the rush doesn’t seem to be true. It’s just that they shoot better. Note that empty net situations are not considered, and thus the shooting percentages when leading are not a result of empty net goals.
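One way to see how little the shot mix matters is to give trailing teams the leading teams’ rush share while keeping their own shooting percentages. This is a back-of-the-envelope sketch using only the rounded numbers in the table above:

```python
def overall_sh_pct(rush_share, rush_sh, other_sh):
    """Overall Sh% as a mix of rush and non-rush Sh% (all in percent)."""
    return rush_share * rush_sh + (1.0 - rush_share) * other_sh


leading = {"rush_share": 0.243, "rush_sh": 10.43, "other_sh": 8.03}
trailing = {"rush_share": 0.220, "rush_sh": 9.36, "other_sh": 7.15}

# Gap in overall shooting percentage implied by the table
actual_gap = overall_sh_pct(**leading) - overall_sh_pct(**trailing)

# Counterfactual: trailing shooting percentages with the leading shot mix
mix_only = (
    overall_sh_pct(leading["rush_share"], trailing["rush_sh"], trailing["other_sh"])
    - overall_sh_pct(**trailing)
)

print(f"actual gap: {actual_gap:.2f} points, from shot mix alone: {mix_only:.2f} points")
```

The extra rush shots account for only about 0.05 of the roughly 0.98 point gap. The rest comes from shooting better on both kinds of shot.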

Home vs Road

My concern with home stats is the various arena game recorders dole out hits, giveaways and takeaways at different rates. I determine what is a rush and what isn’t based in part on those events so there is the potential of significant arena biases in rush shot stats. To investigate I looked at the percentage of shots that were rush shots at home and on the road for each team. Here is what I found.


That is about as conclusive as you can get. The rush shot percentage at home is far more variable than on the road, with higher highs and lower lows. It is possible that the last-change line matching tactics that coaches can more easily employ at home could account for some of the added variability, but my guess is it is mostly due to arena scorer biases. From the chart above I suspect Buffalo, Minnesota, New Jersey and Pittsburgh don’t hand out hits, giveaways and takeaways as frequently as other arenas. This chart takes a look at last year’s real time stats (the above chart is for the last 7 seasons combined).


Most teams have more hits+giveaways+takeaways on home ice than on the road. The teams that have more on the road than at home are Buffalo, Minnesota, New Jersey, Pittsburgh and St. Louis. Despite comparing a 7-year chart with a 1-year chart, the two charts seem to line up fairly well. There do seem to be significant arena biases in rush shot statistics, so when looking at team and player stats it is certainly best to consider road stats only.


Jul 09 2014

I have been pondering doing this for a while and over the past few days I finally got around to it. I have had a theory for a while that the average shot resulting from a rush up the ice is more difficult to stop than the average shot generated by offensive zone play. It makes sense for numerous reasons:

  1. The rush may be an odd-man rush
  2. The rush comes with speed making it more difficult for defense/goalie to defend.
  3. Shots are probably taken from closer in (aside from when a team wants to make a line change, rarely do they shoot from the blue line on a rush).

To test this theory I defined a shot off the rush as the following:

  • A shot within 10 seconds of a shot attempt by the other team on the other net.
  • A shot within 10 seconds of a face off at the other end or in the neutral zone.
  • A shot within 10 seconds of a hit, giveaway or takeaway in the other end or the neutral zone.
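The three criteria above can be sketched as a small classifier. This is a minimal illustration, not the code behind the charts in this post: the `Event` record, its field names, and the convention of labelling a zone by the team defending that end are all assumptions made for the example. It also ignores stoppages and period boundaries, which a real implementation working from play-by-play data would need to handle.

```python
from dataclasses import dataclass

RUSH_WINDOW = 10  # seconds, per the definition above


@dataclass
class Event:
    time: float  # seconds elapsed in the game (hypothetical field)
    kind: str    # "shot", "faceoff", "hit", "giveaway" or "takeaway"
    team: str    # team credited with the event
    zone: str    # team defending that end of the ice, or "neutral"


def is_rush_shot(shot, events):
    """Return True if `shot` meets any of the three rush criteria."""
    own_end = shot.team  # "the other end" for the shooter is the end they defend
    for ev in events:
        if ev is shot:
            continue
        elapsed = shot.time - ev.time
        if not 0 <= elapsed <= RUSH_WINDOW:
            continue
        # 1. Shot attempt by the other team on the other net.
        if ev.kind == "shot" and ev.team != shot.team and ev.zone == own_end:
            return True
        # 2. Face off at the other end or in the neutral zone.
        if ev.kind == "faceoff" and ev.zone in (own_end, "neutral"):
            return True
        # 3. Hit, giveaway or takeaway in the other end or the neutral zone.
        if ev.kind in ("hit", "giveaway", "takeaway") and ev.zone in (own_end, "neutral"):
            return True
    return False


# Made-up sequence: TOR attacks the MTL net, MTL attacks the TOR net.
events = [
    Event(100, "shot", "TOR", "MTL"),          # nothing before it: not a rush
    Event(106, "shot", "MTL", "TOR"),          # 6s after a TOR shot at the other end: rush
    Event(200, "faceoff", "MTL", "TOR"),       # face off in the TOR end
    Event(230, "shot", "TOR", "MTL"),          # 30s after the face off: not a rush
    Event(300, "takeaway", "TOR", "neutral"),  # neutral zone takeaway
    Event(304, "shot", "TOR", "MTL"),          # 4s after the takeaway: rush
]

rush_flags = [is_rush_shot(e, events) for e in events if e.kind == "shot"]
print(rush_flags)
```

Any shot that does not trip one of the three checks is simply "not conclusively a rush shot", which matches how the non-rush bucket is treated in the rest of this post.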

I initially looked at just the first two, but the results were inconclusive because the number of rush events was simply too small, so I added giveaways, takeaways and hits to the equation and this dramatically increased the sample size of rush shots. This unfortunately introduces some arena bias into the equation, as it is well known that hits, giveaways and takeaways vary significantly from arena to arena. We will have to keep this in mind in future analysis of the data and possibly consider just road stats.

For now though I am going to look at all 5v5 data. Here is a chart of how each team looked in terms of rush and non-rush shooting percentages.


So, it is nice to see that the hypothesis holds true. Every team had a significantly higher shooting percentage on “rush” shots than on shots we couldn’t conclusively define as a rush shot (note that some of these could still be rush shots but we didn’t have an event occur at the other end or neutral zone to be able to identify it as such). As a whole, the league has a rush shot shooting percentage of 9.56% over the past 7 seasons while the shooting percentage is just 7.34% on shots we cannot conclusively define as a rush shot. Over the 7 years 23.5% of all shots were identified as rush shots while 28.6% of all goals scored were on the rush.
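As a quick sanity check, the league-wide numbers above hang together: if 23.5% of shots are rush shots scoring at 9.56% and the rest score at 7.34%, the share of goals coming off the rush should land near the 28.6% quoted. A short sketch of that arithmetic (the variable names are mine; the inputs are the figures from the paragraph above):

```python
# League-wide figures quoted above, as fractions.
rush_share = 0.235  # share of all shots identified as rush shots
rush_sh = 0.0956    # shooting percentage on rush shots
other_sh = 0.0734   # shooting percentage on all other shots

# Goals per shot taken, split by shot type.
rush_goals = rush_share * rush_sh
other_goals = (1 - rush_share) * other_sh

rush_goal_share = rush_goals / (rush_goals + other_goals)
overall_sh = rush_goals + other_goals

print(f"share of goals scored on the rush: {rush_goal_share:.1%}")
print(f"implied overall shooting percentage: {overall_sh:.2%}")
```

The implied share of goals matches the 28.6% figure, so the rush shot frequency and the two shooting percentages are mutually consistent.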

In future posts over the course of the summer I’ll investigate rush shots further including but not limited to the following:

  • How much does the frequency of rush shots drive a team’s/player’s overall shooting/save percentages?
  • Are score effects on shooting/save percentages largely due to an increase/decrease in rush shot frequency?
  • Are there teams/players that are better at reducing the number of rush shots?
  • Can rush shots be used to identify and quantify “shot quality” in any useful way?
  • How does this align with the zone entry research that is being done?




Jul 04 2014

The other day I put up a post on Mike Weaver’s and Bryce Salvador’s possible ability to boost their goalie’s save percentage, and I followed it up with a post on the Maple Leafs defensemen where we saw Phaneuf, Gunnarsson, Gleason and Gardiner all seemingly able to do so as well, while Robidas had the reverse effect (lowering goalie save percentage). This got some pushback from the analytics community suggesting this is not possible. My question to them is, why not?

Their answer is that if you do a year over year analysis of a player’s on-ice save percentage, or of a player’s on-ice save percentage relative to his team’s, you will find almost no correlation. While this is true, I claim that this is not sufficient to prove that such a talent does not exist. Here is why.

We Know Players Can and Do Impact Save %

The most compelling argument that players can and do impact save % is that we see it happening all the time and it is fully accepted among the hockey analytics community. It is known as score effects. Score effects are a well entrenched concept in hockey analytics. It is why we often look at 5v5 “close” or 5v5 tied statistics instead of just 5v5 statistics. Generally speaking, the impact score effects have is that the trailing team usually experiences an increase in shot rate along with a decrease in shooting percentage, while the team protecting the lead experiences a decrease in shot rate but an increase in shooting percentage. The following table shows the Boston Bruins’ shooting and save percentages when tied, leading and trailing over the past 7 seasons combined.

Statistic | Tied   | Leading | Trailing
Shooting% | 7.27%  | 9.14%   | 7.66%
Save%     | 93.36% | 93.86%  | 92.53%

The difference in the Bruins’ save percentage between leading and trailing is 1.33 percentage points. This is the difference between a .923 save percentage goalie and a .910 save percentage goalie, which is the difference between an elite goalie and a below average goalie. That is not insignificant. Is this the goalie’s fault or does it have something to do with the players in front of him? The latter seems most likely. It makes sense that when protecting the lead the players take fewer risks in an attempt to generate offense and in return give up fewer good scoring chances against, albeit maybe more chances in total. Conversely, the team playing catch up takes more offensive risks, so it ends up giving up more quality scoring chances against. This is reflected in the team’s save and shooting percentages when leading and trailing.

So, now if a team can play a style that boosts the team save percentage when they are protecting a lead, why is it so inconceivable that a player could see the same impact in his on-ice save percentage if that player plays that style of hockey all the time? If Mike Weaver and Bryce Salvador play the same style all the time that teams play when protecting a lead, why can they not boost on-ice save percentage? There is no reason they can’t.

It is Difficult to Detect because Individual Players Don’t Have a lot of Control of Outcomes

The average player’s individual ability to influence what happens on the ice is actually fairly small, as there are also 9 other skaters and 2 goalies on the ice with him. At best you can say the average player has a ~10% impact on outcomes while he is on the ice. That isn’t much. Last week James Mirtle tweeted a link to Connor Brown’s page as evidence of why +/- is a useless statistic. Over the course of three OHL seasons Brown’s +/- went from -72 to -11 to +44. I suggested to Mirtle that if this is the criterion for tossing out stats we can toss out a lot of stats, including corsi%, because most stats are highly team/linemate dependent. When challenged that a reversal this dramatic is not seen in corsi%, I cited David Clarkson as an example. In 2012-13 Clarkson was 4th in CF% but in 2013-14 he was 333rd (of 346) in CF%. From one year to the next he went from 4th best to 14th worst. Why is this? Well, Clarkson essentially moved from playing with good corsi players on a good corsi team to playing with bad corsi players on a bad corsi team. No matter how much puck possession talent Clarkson has (or hasn’t), his talent doesn’t dominate over the talent level of the 4 teammates he is on the ice with.

Now think about how many players change teams from one year to the next, and think about how many players get moved up and down a lineup and change linemates from one season to the next. It is not an insignificant number. TSN’s UFA tracker currently has 109 UFAs getting signed starting July 1st, the majority of them changing teams. There are only ~800 NHL players (regulars and depth players) in a season, so that is pretty significant turnover. Some teams turn over a quarter to half their lineup while others stay largely the same. With that much roster turnover, and with so little ability for a single player to drive outcomes, it should be expected that the majority of statistics see relatively high “regression”. Regression doesn’t mean lack of individual talent though.

Think of this scenario. We have a player with an average ability to boost on-ice save percentage, and he has been playing on a team with a number of players who are good at boosting on-ice save percentage, but generally speaking he doesn’t play with those players. Under this scenario it will appear that the player is poor at boosting on-ice save percentage because he is being compared to players who are good at it. Now that player moves to another team that isn’t very good at boosting on-ice save percentage. Now that same average player will look like he is a good player because he has a better on-ice save percentage than his teammates. The result is little year over year consistency, but that doesn’t mean there aren’t talent differences among players.
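To make the scenario concrete, here is a tiny sketch using the Rel definition (on-ice minus off-ice). All of the numbers are hypothetical and chosen only for illustration; the point is that the player’s own on-ice result is identical on both teams and only the comparison group changes.

```python
def sv_rel(on_ice_sv, off_ice_sv):
    """On-ice save % minus off-ice save %, in percentage points."""
    return round(on_ice_sv - off_ice_sv, 2)


# Hypothetical numbers, not taken from any real player or team.
PLAYER_ON_ICE_SV = 92.0     # the player's on-ice save %, the same on both teams
OLD_TEAM_OFF_ICE_SV = 92.6  # old team: teammates who are good at boosting save %
NEW_TEAM_OFF_ICE_SV = 91.4  # new team: teammates who are not

rel_old = sv_rel(PLAYER_ON_ICE_SV, OLD_TEAM_OFF_ICE_SV)
rel_new = sv_rel(PLAYER_ON_ICE_SV, NEW_TEAM_OFF_ICE_SV)

print(f"Sv% Rel on old team: {rel_old:+.1f}")
print(f"Sv% Rel on new team: {rel_new:+.1f}")
```

The relative number flips sign from one team to the next even though nothing about the player changed, which is exactly the kind of year over year inconsistency described above.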

Hockey is not like baseball, which is a series of one-on-one matchups between pitcher and batter or isolated attempts to make a fielding play on a hit ball. Outcomes in hockey are completely interdependent on up to 11 other players on the ice. QoT (quality of teammates) is the largest driver of a player’s statistics in hockey. Only when we factor out QoT completely can we truly identify every player’s talent level for any metric we measure. This is kind of a chicken-and-egg problem though, because to identify a player’s talent level we need to know the talent level of his teammates, which in turn requires knowledge of his own talent level. We can’t just look at year over year regression to isolate talent level.


The “team” aspect in hockey is more significant than in any other sport, and any particular player’s statistics are largely driven by the quality of his teammates. Even more than teammates, style of play can be a significant factor in a player’s statistics. The quality of the players that a particular player plays with is a function of both the team he plays on and the role (offensive first line vs defensive third line) he plays on that team, and this is maybe the greatest driver of a player’s statistics. This is why David Clarkson can be a Corsi king in New Jersey and a Corsi dud in Toronto. It also accounts for why James Neal can go from a 25 goal guy playing on the first line in Dallas to a 40 goal guy in Pittsburgh (and probably back to a 25 goal guy in Nashville next year). This also accounts for why year over year correlation in many stats is not very good despite there being measurable differences in the talent that the stat is measuring. Significant statistical regression is not sufficient, in my opinion, to conclude insignificant controllable talent if no significant attempt to completely isolate individual contribution to team results has been successfully made.

Just for fun, here is a chart of Lidstrom’s on-ice save percentage vs team save percentage. It is pretty outstanding that an offensive defenseman can do this too.



Apr 29 2014

It seems every time a new hockey person gets hired these days they get asked “do you believe in hockey analytics?” It started with Trevor Linden in Vancouver. Then Brendan Shanahan in Toronto. And today Brad Treliving in Calgary. Nichols on Hockey has a good rundown of both Treliving’s and Burke’s responses to the question today, so go give it a read.

As we all know, Brian Burke is an analytics skeptic to say the least. A popular Brian Burke quote is the following:

“Let’s get the record straight on that too. The first analytics systems I see that’ll help us win, I’ll buy it. I’ll pay cash so that no one else can use it. I’m not a dinosaur on that.”

What Burke gets wrong here is that analytics is not a “system” you can buy but rather a thought process and a way of doing business. Walmart is famous for using analytics to maximize the profits of their retail operation by knowing their customers’ buying habits: what their customers will buy, how much they will buy and when they will buy it, based on everything from the weather to the economy. Analytics is a huge part of their success. That said, there is no analytics “system” that another retailer can purchase off the shelf that will allow them to do the same. It isn’t a system that makes Walmart so successful. It is the way they use analytics to operate their business, a way that permeates the entire operation, that makes them successful. Every retail operation has a different customer base. Every retail operation has a different set of products it sells. Every retail operation has a different cost structure. There is no single “system” that can be applied that will guarantee retail success. That doesn’t mean that every retail operation can’t benefit from analytics, because analytics is a way of doing business. It is the mindset of wanting to know as much as you can and applying unbiased analytical techniques to that knowledge to drive decision making. It is the mindset of wanting to know as much about your customers’ buying habits as you possibly can. It is the mindset of wanting to know what your customers will want to buy, and when and why. It is the mindset of knowing how many employees you need on staff at a given time to maximize sales and profits. It is about wanting to know how long a line customers will tolerate before they leave and make a purchase elsewhere. Analytics is a way of thinking that permeates your organization; it is not a “system” that you can buy and apply.

I don’t know the extent that NHL teams are using hockey analytics but I get the feeling that there are very few that are doing so in a real serious way. Being a highly analytical person I may be biased but to me an NHL team that truly adopts hockey analytics would see the idea of analytics permeate throughout the organization. Analytics should be an important driver of coaching tactics and decisions. It should be an important driver of scouting and player evaluation. It should be an important driver of team building. It should be an important driver of maximizing salary cap commitments. It also should not be one-directional as I firmly believe hockey analytics can benefit significantly from the hockey knowledge of players, coaches, general managers and scouts to improve and test analytical techniques. I have my doubts that there are many NHL organizations that have truly adopted hockey analytics when defined in that way. Some may be dabbling, few are truly adopting.

Interestingly though, I suspect there isn’t one NHL organization that doesn’t use analytics in a significant way on the business side of the organization to do everything from setting ticket, beer and hot dog prices, to setting advertising rates to evaluating their sales staff effectiveness. I am certain analytics permeates through the business side of an NHL organization in a significant way so it is kind of surprising there is any resistance to it on the hockey side.