Persistence of Sv%RelTM and failure of statistical models in hockey analytics

Jun 03 2015

Over the last several days I have tweeted several times (here, here and here) about my Sv%RelTM statistic which can be found on which generated some interest from my followers as well as some skeptics.


The issue I have with that statement and others like it is that it uses a simple statistical model, applies it to all players, and then draws conclusions about all players based on the results without really understanding what the model is telling us or the inherent problems with measuring a player’s ability to impact shot quality against.


Is Hockey Analytics altering outcomes yet?

Apr 26 2015

Hockey analytics is well behind analytics in other sports, particularly baseball, but we are now several years into what I will call modern (or current) hockey analytics which has largely focused on possession statistics such as Corsi and Fenwick. Last summer we even saw a number of teams publicly adopt analytics by picking up some prominent people from the public domain. Toronto, Edmonton, Carolina, Florida, and New Jersey to name a few. Results for those teams have clearly been mixed thus far but the greater question is whether hockey analytics, and possession analytics in particular, has had a greater impact on the game than just those few teams. I hope to answer some of those questions today.

One of the reasons why possession statistics such as Corsi became so popular is that good possession teams have been shown to often do well, and possession has also been identified as an undervalued skill, as Eric Tulsky wrote about a couple of years ago. Contracts and salaries were generally given by teams to reward skills such as shooting percentage more than possession skills and thus possession skills were an undervalued talent. Teams could tap into this undervalued talent by getting good possession players at a fraction of the cost of good shooting percentage players. I warned that focusing too much on possession statistics is potentially harmful in the long run as it could result in players altering their playing style at the expense of what really matters, outscoring the opposition. I have shown that there is likely at least a loose inverse relationship between Corsi and shooting percentage, implying that boosting one’s Corsi often has the negative consequence of reducing one’s shooting percentage. I did this by looking at the impacts of coaching changes on Corsi and shooting percentage and by looking at the relationship between team CF% and Sh% when extreme outliers are removed.


Is 4v4 overtime hockey a crap shoot we can or should ignore?

Apr 13 2015

Since the Los Angeles Kings have been eliminated from the playoffs there has been a lot of discussion about why a team with such a good possession game failed to make the playoffs. This included my article from yesterday which generated a fair amount of discussion as well. A lot of the discussion can be summarized by the following tweet by Sunil Agnihotri referencing a comment by Walter Foddis.

The last paragraph is the one that interests me most.

“The substantive reason for LA not making the playoffs is the OT system, which does not reflect team strength. Statistically, OT outcomes have been shown to be a crap shoot. LA was unlucky in OT”

The fact that LA went 1-7 during overtime play does in fact mean that they were unlucky during OT play. They are a better team than that for sure (every team is expected to do better than that). OT results over the course of a single season are extremely random and thus one could consider them a crap shoot. The question I have is this: just because something is highly variable, does that mean it is meaningless in our evaluation? Being unlucky in overtime does not mean you are unlucky overall.

I’d hazard a guess that outcomes of the first 5 minutes of the second period for games that are played on a Thursday are highly random too. If a team missed the playoffs and had a terrible goal differential during the first 5 minutes of the second period in games that are played on a Thursday can we chalk up missing the playoffs to bad luck during the first 5 minutes of the second period in Thursday games? No, of course not. We don’t get to pick and choose what good luck or what bad luck we can blame results on. Just because we are more aware of bad luck that happens in overtime games doesn’t mean it is more important bad luck worthy of attributing blame to.

The reality of the situation is that unless you can be certain that the Kings’ OT bad luck is not offset by good luck during the remainder of the game you can’t blame the Kings missing the playoffs on their OT record. I haven’t seen a complete luck analysis of the Kings’ season showing that the Kings were unlucky during regulation and OT play as a whole, so I am pretty reluctant to blame the Kings’ playoff miss on their OT record just yet.

The interesting question for me is whether 4v4 play is indicative of overall talent, because if 4v4 hockey requires a completely different skill set then one could conclude that overtime play isn’t representative of true hockey talent. To answer this question I took each team’s 5v5close GF% over the past 8 seasons (to get large sample sizes, though it would reduce the spread in talent) and correlated it with the same team’s 4v4close GF% over the past 8 seasons (I used close since most 4v4 ice time is in OT and thus in close situations). Here are the results.


And the same for CF%.


Those correlations are good enough for me to consider that 5v5 skills are fairly transferable to 4v4 play and vice versa. Over small samples strange things happen, but to suggest that 4v4 play isn’t indicative of hockey skill and that is why one should ignore OT results is not valid either.

An interesting observation is that the slope on the CF% chart is almost exactly 1.0. The slope on the GF% chart is significantly higher than 1.0, which might indicate that 4v4 play is actually a better indicator of talent than 5v5 play (if you are good at 5v5 play you should be even better at 4v4 play). That said, if I force the intercept to zero the slope drops to 0.9958, or almost exactly even (and r^2 drops to 0.3123 with zero intercept), so maybe 5v5 and 4v4 are on par with each other. Regardless, this should at least alleviate Steve Burtch’s concern that poorer teams are more likely to score first during 4v4 play than during 5v5 play. I don’t believe that to be the case.
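To make the slope comparison concrete, here is a minimal sketch of the two fits being contrasted: an ordinary least-squares line with a free intercept, and a line forced through the origin. The team values below are made up for illustration and are not the data behind the charts; the function names are my own.

```python
def fit_with_intercept(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_through_origin(xs, ys):
    """Least squares for y = slope * x, with the intercept forced to zero."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical team GF% values, for illustration only (not the real data).
gf_5v5close = [45.0, 48.0, 50.0, 52.0, 55.0]
gf_4v4close = [42.0, 47.5, 50.0, 53.0, 57.5]

slope, intercept = fit_with_intercept(gf_5v5close, gf_4v4close)
slope_zero = fit_through_origin(gf_5v5close, gf_4v4close)

print(f"free intercept: slope={slope:.3f}, intercept={intercept:.2f}")
print(f"zero intercept: slope={slope_zero:.4f}")
```

One thing worth keeping in mind with the zero-intercept fit: when both variables cluster around 50%, the forced-through-origin slope is pulled toward the ratio of the two averages, so a slope near 1.0 there may say more about both averages sitting near 50% than about how the spread in one translates to the other.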

Now when we talk about shoot out record I think that it is safe to assume that the shoot out is a lot further from being representative of actual hockey talent than 4v4 play. There is probably not enough shoot out data to actually be able to do a similar analysis with any degree of confidence but I doubt there is much disagreement that the shoot out is a long way from being representative of real hockey.


Apr 12 2015

The other day I posted the following twitter comment after the Flames defeated the Kings to gain a playoff position while simultaneously eliminating the reigning Stanley Cup Champion Los Angeles Kings from the playoffs.

I posted this comment for two reasons. First because I think if you are being honest about evaluating possession analytics you have to consider the failures on an equal ground as the successes. I am certain that if the Kings defeated the Flames and ultimately made the playoffs over the Flames there would have been people that would use it as evidence that possession analytics is good at predicting future results. That would be a fair thing to do but you have to consider the failures too and possession analytics failed twice here, first with the Flames making the playoffs and second with the Kings missing. So, I made this comment because analytically it is the correct thing to do and I felt it needed to be said.

The other reason I made this comment was to see how people would react, and whether they would react with fairness as explained above or in a defensive manner, defending possession analytics and dismissing the Flames/Kings outcome as largely luck. For the most part the reaction was more subdued than I had thought but there were some jumping in defense of possession analytics, including the following tweet from @67sound.

If you are relying on the LOS ANGELES KINGS to minimize the importance of possession metrics I don’t even know where to begin.

This is an overreaction because I didn’t actually try to minimize the importance of possession, I was just pointing out where it failed. If you follow me you know I use possession metrics all the time, I just think that there is too much consideration for when possession metrics succeed in predicting outcomes and too little consideration of when they fail and when other metrics succeed. I have talked about this before on a few occasions, where people want to point out how good possession metrics are at predicting outcomes without actually comparing the success rates against other prediction methodologies. In many instances possession statistics do a great job at predicting outcomes, but often goal-based metrics actually do slightly better.

The follow up discussion to my tweet soon started to rationalize why the possession stats failed in predicting the Los Angeles Kings missing the playoffs.

Scott Cullen of wrote the following in his Statistically Speaking column about the Kings.

For starters, the Kings were 2-8 in shootouts and 1-7 in overtime games. Given the randomness involved in shootout results, that’s basically coming out on the wrong end of coin flips. 3-15 in overtime and shootout games, after going 12-8 the year before, is enough in tightly-contested standings, to come up short. Records in one-goal games tend to be unsustainable, but there’s enough of them in hockey that they make a huge difference in the standings.

Most of these are fair comments. The shootout record is almost completely random and not actually representative of how good they are at playing hockey (though I disagree with overtime records not being useful in evaluating how good the Kings are at playing hockey). With a bit better fortune the Kings likely would have made the playoffs and probably should have. The thing is though we all need to be careful not to use “luck” as a tool in confirmation bias as luck can be used to explain everything. Flames made the playoffs, write it off as good luck and move on without blinking an eye. They will regress next year, just watch. Kings missed the playoffs, write it off as bad luck and move on without blinking an eye. They will be better next year, just watch. A thorough review needs to be conducted, rather than quickly writing off anything that goes counter to our beliefs/predictions as luck.

The Kings missed the playoffs this year with 95 points. The previous four seasons they have had 100, 101 (prorated over 82 games), 95, and 98 points. So, on average the LA Kings have been a ~98 point team over the past 5 seasons. If they went 5-5 instead of 2-8 in shootouts that is exactly where they would have finished. For the most part this Kings team is what they have mostly been and what we probably should have expected: a good, but not elite, regular season team. Over these past 5 seasons they have finished 18th, 10th, 7th, 13th and 12th place overall. That actually compares somewhat poorly to the cross-town Anaheim Ducks who have finished 3rd, 2nd, 3rd, 25th, and 9th over the past 5 seasons. The Kings’ score-adjusted Fenwick % over that time is 55.3% compared to the Ducks’ 50.3% and yet in four of the five seasons the Ducks finished ahead of the Kings in the regular season. The reason for this is the Ducks have a 9.19% 5v5close shooting percentage over the past four seasons compared to the Kings’ 6.69%. That difference is not luck. It’s a persistent repeatable skill that possession analytics doesn’t capture. Barring major off-season roster moves no one should be predicting the Kings to end the regular season ahead of the Ducks next season. I suspect some will though, just as was done for this season when using possession analytics to predict regular season point totals (Kings were predicted to get 107 points, Ducks 91).
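The points arithmetic above can be checked in a few lines. This is just a sketch of the calculation, using the NHL standings rule that a shootout win is worth 2 points and a shootout loss 1 point; the variable and function names are my own.

```python
# Kings' regular-season points: this season first, then the previous four
# (the 101 is the lockout season prorated over 82 games, as noted above).
season_points = [95, 100, 101, 95, 98]
average_points = sum(season_points) / len(season_points)


def shootout_points(wins, losses):
    """Standings points from shootout games: 2 per win, 1 per loss."""
    return 2 * wins + 1 * losses


actual = shootout_points(2, 8)       # the 2-8 record
break_even = shootout_points(5, 5)   # a coin-flip 5-5 record
adjusted_total = season_points[0] + (break_even - actual)

print(f"five-season average: {average_points:.1f}")
print(f"points with a 5-5 shootout record: {adjusted_total}")
```

The three extra points from a 5-5 shootout record put this season right at the five-season average, which is the point being made.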

So the Kings have been a pretty good but not a dominant regular season team. They have won the Stanley Cup twice during this period and have been a dominant possession team, which has given us the perception that they are an elite team. Is it possible that we have generally overrated them because of their possession and post season success? Maybe. Are they really a great team or just a good one that got hot when it mattered a couple times? It’s a question worth asking I think, but if you just chalk up missing the playoffs this season to luck it is probably one you won’t be asking.

While we are on the subject of teams that were predicted to regress this season one such team is the Colorado Avalanche. A lot of people are tossing them out as an example of where possession statistics successfully predicted their failures this season. A major reason for predicting this regression was due to regression in their shooting and save percentages as Travis Yost of wrote prior to the season.

Using that regression for forecasting purposes, expect Colorado to shoot around 7.89 per cent for next year at evens and stop around 92.47 per cent of the shots.

Those are 5v5 shooting and save percentages Yost is talking about. In actual fact Colorado’s shooting hasn’t regressed this year as it is more or less identical to last season’s 5v5 shooting percentage (8.75% this season vs 8.80% last season). Save percentage has regressed to almost what Yost predicted (92.52%) so he was right there (the role luck played in this is unknown though), but a major (and maybe the primary) reason for the Avalanche’s failures this season is they are playing a substantially worse possession game than last season. Colorado’s 5v5close CF% dropped from 47.4% last season to 42.9% this season, which is a massive drop. That drop can largely be attributed to letting two of their best CF% players, Paul Stastny and PA Parenteau, leave in the off season and replacing them with poorer possession players in Iginla and especially Briere. Coaching may be a factor too. So some of the Avalanche’s failures this season can be attributed to a regression in save percentage but a significant part of it is due to poor off-season roster decisions.

Once again, we need to be careful about saying “I told you they would regress” and leaving it at that if the majority of their regression is due to factors you didn’t predict (to be fair Yost did mention that the Avalanche’s possession might drop a bit due to roster changes as well but it wasn’t the crux of his argument). It is quite possible, if not highly likely, that the Avalanche is in fact a well above average shooting percentage team and we shouldn’t expect it to regress next season, just as we shouldn’t expect the Ducks to either.

I need to reiterate here that it isn’t that I don’t believe that possession is an important aspect of the game. It is. It is why the Kings are good despite terrible shooting talent. It is why the Leafs are bad despite good shooting talent. What I really want, and why I always point out where possession failed, is to ensure that everyone evaluates possession fairly in the context of the complete game. I often hear things like “no one ever said possession was everything” and yet I frequently hear claims made without any mention of factors other than possession metrics. The Kings are a perfect example. Everyone assumed they were a great team that, barring massive bad luck, would make the playoffs and when they didn’t make the playoffs they started throwing out all the evidence of that bad luck. Truth is it was perfectly reasonable to predict that with even a little bit of bad luck the Kings could miss the playoffs, though I don’t recall anyone really suggesting that (correct me if I am wrong though). It is also fair to suggest that if Colorado had made smarter off-season roster moves they could have been a playoff team again and not regressed nearly to the extent they did, but the discussion about the Avalanche revolved around bad possession, high PDO, they were lucky and will regress a lot. I want to see a better balance in hockey analytics as I think too much of hockey analytics is dominated by possession analytics. That is why I write tweets like the one about the Kings and Flames. There needs to be more balance.

So, my final word of advice is that if you don’t believe that possession is everything (which apparently none of you do) you ought to be doing more than just conducting possession analytics. If you can honestly say you are doing that I congratulate you. If you can’t, well, what you do next is up to you.


Mar 21 2015

The other day on twitter I was called out by Sam Ventura who does some great work on Specifically he did not like my article on zone starts that I wrote the other day.

Let me step in here and say that I have never denied this. Offensive zone face offs are more likely to result in shots for the team on offense and less likely for the team on defense. Ok, that is settled, let’s move on.


This is the crux of the problem. At the micro level yes, the location of face offs impacts outcomes. On the macro or aggregate level the effects are minimal. I tried to explain that here in more detail but maybe it didn’t come across too well so I will try again, in another way, with the war-on-ice tools. Let’s look at the Shea Weber picture from Sam’s tweets above.


Ok, so there looks to be a relationship. The higher the offensive zone start percentage the higher the CF%. Now, let’s take a look at the same chart but with offensive zone start percentage relative and see how the chart changes.


Significantly less correlation. Why? Because when the team is playing well the team as a whole generates more offensive zone starts. Not the other way around. We can also flip it around and look at how ZSO% compares to CF%Rel.


And to finish the display we can look at ZSO%Rel vs CF%Rel.


The relationship that Sam has observed is largely team driven, not driven by Weber’s zone starts. There is a zone start impact on a player’s statistics but it is very minimal and for the majority of players can safely be ignored. The impact of the team is far more important. When the team does well it will result in a better CF% which in turn results in a higher ZSO% which is the reason for the high correlation. Zone starts don’t drive CF%, CF% drives zone starts. This makes total sense because the majority of zone starts will come after a shot on goal. The shot on goal produces the offensive zone face off, it isn’t the offensive zone face off that produces the shot on goal. We need to think of zone starts more as a result, not a cause.
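A toy example may help show how a team-driven correlation can vanish once you switch to relative numbers. The figures below are entirely made up, and I am treating "Rel" as simply the player's number minus the team's number, which is a simplification of how relative stats are actually computed. The player's deviations from the team are constructed to be unrelated to each other, so any raw correlation comes from the team.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical values for six stretches of games (illustration only).
team_zso = [44, 47, 50, 53, 56, 59]     # team offensive zone start %
team_cf = [45, 47, 50, 52, 55, 57]      # team CF%
player_zso = [46, 44, 51, 52, 59, 57]   # player's on-ice ZSO%
player_cf = [44, 46, 52, 54, 54, 56]    # player's on-ice CF%

# Simplified "relative" stats: player minus team.
zso_rel = [p - t for p, t in zip(player_zso, team_zso)]
cf_rel = [p - t for p, t in zip(player_cf, team_cf)]

raw_r = pearson(player_zso, player_cf)
rel_r = pearson(zso_rel, cf_rel)

print(f"ZSO% vs CF%:       r = {raw_r:.2f}")
print(f"ZSO%Rel vs CF%Rel: r = {rel_r:.2f}")
```

In this constructed case the raw numbers are strongly correlated while the relative numbers are not correlated at all, which is the pattern the charts above suggest, though real data will of course be noisier.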

On top of the team effect, I believe there is a style of play impact too which will take away even more correlation. When you play defensive hockey you often give up more shots. We see it in score effects all the time. Players who start more in the defensive zone are more likely to be the ones playing defensive hockey. This adds to the correlation as well and has nothing to do with zone starts.

Let me leave you with Phaneuf’s charts because his correlation in Sam’s charts was probably the greatest.





Again, a significant portion of the relationship disappears when you look at ZSO%Rel.

For me, the main evidence that zone starts don’t have a significant effect on a player’s overall statistics is that if I remove the 45 seconds after all offensive/defensive zone face offs (which basically ignores the entire shift) the majority of players have the same CF% +/- 1% and only a handful with heavy offensive or defensive zone starts see an effect in the +/- 1-2% range. If removing all shifts that start with an offensive or defensive zone start does not dramatically impact a player’s overall statistics you simply cannot conclude that zone start bias plays a prominent role in driving a player’s overall statistics. Yes, for a particular shift it will, but not overall. Furthermore, the majority of that impact occurs in the first 10 seconds after a face off, which is why my zone start adjusted data removes these 10 seconds, something I showed over 3 years ago.
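For anyone wanting to try this kind of adjustment themselves, here is a rough sketch of the idea: drop every shot attempt that occurs within some window after an offensive or defensive zone face off and recompute CF%. The event format (time in seconds, event kind, detail) and the sample events are invented for illustration; this is not my actual code or data format.

```python
def cf_percent(events, window=0.0):
    """CF% from on-ice events, ignoring shot attempts that occur less than
    `window` seconds after an offensive ("O") or defensive ("D") zone face off.

    events: iterable of (time_seconds, kind, detail) tuples, where kind is
    "faceoff" (detail "O", "D" or "N") or "attempt" (detail "for"/"against").
    """
    last_zone_faceoff = None
    attempts_for = attempts_against = 0
    for time, kind, detail in sorted(events):
        if kind == "faceoff":
            if detail in ("O", "D"):
                last_zone_faceoff = time
            continue
        if last_zone_faceoff is not None and time - last_zone_faceoff < window:
            continue  # inside the exclusion window, skip this attempt
        if detail == "for":
            attempts_for += 1
        else:
            attempts_against += 1
    total = attempts_for + attempts_against
    return 100.0 * attempts_for / total if total else None


# Hypothetical events for one player (illustration only).
events = [
    (0, "faceoff", "N"),
    (20, "attempt", "for"),
    (60, "faceoff", "O"),
    (65, "attempt", "for"),
    (75, "attempt", "for"),
    (130, "attempt", "against"),
    (200, "faceoff", "D"),
    (205, "attempt", "against"),
    (300, "attempt", "for"),
    (320, "attempt", "against"),
]

raw = cf_percent(events)                # no adjustment
adjusted_10 = cf_percent(events, 10)    # drop first 10 seconds
adjusted_45 = cf_percent(events, 45)    # drop first 45 seconds
```

Comparing `raw` against the adjusted values across all players is what tells you how much zone starts are moving the overall number. The sketch keys the window purely on elapsed time; it does not try to detect line changes or whistles inside the window.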

The critical point to remember in all of this is that shots drive where face offs occur; where face offs occur does not drive shots. Coaching and line changes for face offs can impact overall player statistics a little but really not all that much.


Zone Starts, Corsi, and the Percentages

Mar 16 2015

Matthew Coller has an interesting article on Puck Prospectus about Shea Weber and his poor Relative Corsi. His conclusion was that Weber’s poor Relative Corsi is largely due to his playing time with Paul Gaustad in which he posted a very poor CF% along with having a very heavy defensive zone start. His conclusion was that Weber’s poor Corsi with Gaustad is in a significant way caused by the heavy defensive zone start bias. This is a case of correlation not causation as I outlined in the comment section of that article. I recommend you take the time to read both the article and my comments because they are worthwhile reads.

My issue with the article is that I don’t believe that zone starts dramatically impact a player’s overall statistics as I explained here. I just haven’t seen any convincing evidence that zone starts would change a player’s CF% much more than 1-2% and for most players considering zone starts in player evaluation is not important. The relationship that Coller observed is important though because there is a clear relationship between zone starts and CF%. The relationship isn’t causal though. What the zone starts signify is a style of play. Players with a heavy defensive zone start bias are likely asked by the coach to play a defense first game and in many cases generating offense is not an important issue. The result is often a relatively minor deviation in a player’s CA/60 but a major deviation in a player’s CF/60 from the overall team stats. Let’s look at Paul Gaustad as an example. Gaustad has an OZone% of just 12.2% which means he has over seven times as many defensive zone starts as offensive zone starts. Here is how his Corsi stats compare to Nashville’s overall stats in 5v5close situations this season.

            CF60   CA60
Nashville   60.0   53.0
Gaustad     38.8   51.9

As you can see, despite a heavy defensive zone start bias, when Gaustad is on the ice the Predators actually give up slightly fewer shot attempts against than they do overall but it is pretty close. Offensively though, when Gaustad is on the ice there is significantly less offense generated. If zone starts are the explanation one would probably expect there to be more balance between more shot attempts against and fewer shot attempts for but this is not the case. The likely explanation is that when Gaustad is on the ice the team is largely focused on not giving up a goal rather than generating offense. I suspect they do this largely by not giving up the puck and maintaining possession once they have it. When you take a shot you are actually giving up control of the puck. You may regain control but so might the other team. If you are focused on preventing goals the best way to do that is to not give up the puck.

Let’s take a quick look at Filip Forsberg who has played with a heavy offensive zone start bias indicating he is probably used in more offensive situations.

            CF60   CA60
Nashville   60.0   53.0
Forsberg    69.4   53.2

Forsberg’s CA/60 is actually very similar to the team average and not all that different from Gaustad’s (higher actually) but his CF/60 is almost 80% higher. Again, this is unlikely to be zone start influenced but rather some combination of talent and playing style.
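The comparisons in the two tables above boil down to a bit of arithmetic, sketched here. I am assuming OZone% means offensive zone starts as a share of offensive plus defensive zone starts (neutral zone starts excluded), which is consistent with the "over seven times" figure; the helper name is my own.

```python
def cf_pct(cf60, ca60):
    """CF% implied by shot attempt rates for and against per 60 minutes."""
    return 100.0 * cf60 / (cf60 + ca60)


# 5v5close rates from the tables above: (CF60, CA60)
rates = {
    "Nashville": (60.0, 53.0),
    "Gaustad": (38.8, 51.9),
    "Forsberg": (69.4, 53.2),
}

for name, (cf60, ca60) in rates.items():
    print(f"{name:<10} CF% = {cf_pct(cf60, ca60):.1f}")

# Defensive zone starts per offensive zone start at an OZone% of 12.2%
# (assumes OZone% = OZ / (OZ + DZ)).
ozone_pct = 12.2
dz_per_oz = (100.0 - ozone_pct) / ozone_pct

# How much higher Forsberg's CF/60 is than Gaustad's, as a fraction.
cf60_gap = rates["Forsberg"][0] / rates["Gaustad"][0] - 1.0

print(f"DZ starts per OZ start: {dz_per_oz:.1f}")
print(f"Forsberg CF/60 vs Gaustad: {cf60_gap:+.0%}")
```

Almost all of the gap in implied CF% between the two players comes from the CF/60 column, not the CA/60 column, which is the asymmetry the argument rests on.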

So, it seems that Ozone% is likely an indication of style of play, or at least an indicator of the main objective of the players on the ice, and we have seen that this can have a major impact on shot attempt rates.  I want to take this discussion one step further by looking at whether players can influence shooting/save percentages based on their style of play. Since shooting/save percentages are highly variable over small sample sizes such as the number of shots for/against taken while a player is on the ice during a single season we need to find ways to work around the randomness associated with the percentages. One way to do this is to group players based on similar attributes and take a group average. One of my favourite hockey analytics articles was this one written by Tom Awad in which he grouped similar players based on ice time and in doing so he found that shooting better than your opponent is a major factor in what makes good players good. In this case I have grouped players based on their OZone% and then took a group average Sh%RelTM and Sv%RelTM during 5v5close situations.

OZone%    Sh%RelTM   Sv%RelTM
<30%       -0.92%      1.26%
30-35%     -0.43%      0.59%
35-40%     -0.38%      0.80%
40-45%     -0.18%     -0.03%
45-50%     -0.07%     -0.07%
50-55%      0.48%      0.10%
55-60%      0.50%     -0.16%
60-65%      0.52%      0.36%
65+%        0.24%     -1.07%

Graphically here is what we get.


As you can see, there is a fairly strong relationship between zone starts and Sh%RelTM and Sv%RelTM. Players with a heavy defensive zone start bias will generally have a positive impact on their team’s save percentage and a negative impact on their team’s shooting percentage. Conversely, players with a heavier offensive zone start bias will generally have a positive impact on their team’s shooting percentage and a negative impact on their team’s save percentage. Some of this is likely player talent but a significant portion of it is likely driven by style of play as we saw with Corsi. It is next to impossible to identify these relationships by looking at individual players’ statistics because of the small sample sizes but when we group similar players together the relationship becomes clear and is a relatively strong one.
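The grouping approach is simple enough to sketch. The bucket edges below match the table, but the player rows are invented, and I am using a plain unweighted average within each bucket, which is an assumption on my part rather than a description of exactly how the table was built.

```python
from collections import defaultdict


def ozone_bucket(ozone_pct):
    """Map an OZone% value to the 5-point-wide buckets used in the table."""
    if ozone_pct < 30:
        return "<30%"
    if ozone_pct >= 65:
        return "65+%"
    lower = int(ozone_pct // 5) * 5
    return f"{lower}-{lower + 5}%"


def group_averages(players):
    """Average Sh%RelTM and Sv%RelTM per OZone% bucket (unweighted).

    players: iterable of (ozone_pct, sh_rel_tm, sv_rel_tm) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for ozone_pct, sh_rel, sv_rel in players:
        bucket = totals[ozone_bucket(ozone_pct)]
        bucket[0] += sh_rel
        bucket[1] += sv_rel
        bucket[2] += 1
    return {name: (sh / n, sv / n) for name, (sh, sv, n) in totals.items()}


# Hypothetical players: (OZone%, Sh%RelTM, Sv%RelTM), illustration only.
players = [
    (22.0, -1.2, 1.6),
    (28.0, -0.6, 0.8),
    (47.0, 0.1, -0.2),
    (48.5, -0.3, 0.0),
    (66.0, 0.4, -1.0),
]

averages = group_averages(players)
for bucket, (sh_rel, sv_rel) in averages.items():
    print(f"{bucket:<7} Sh%RelTM {sh_rel:+.2f}  Sv%RelTM {sv_rel:+.2f}")
```

Averaging within a bucket is what washes out the single-season randomness in each player's percentages; weighting each player by ice time or shots would be a reasonable variation.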

For perspective, Paul Gaustad’s OZone% over the past three seasons with Nashville is 21.2% while his Sh%RelTM is -1.4 and his Sv%RelTM is +1.9.

The major takeaways I hope people get from this article are the following:

  1. Zone starts really do not have a significant impact on a player’s statistics.
  2. Zone starts can be an indicator of a player’s style of play and style of play can have a major influence on a player’s statistics (see my Coaching/Corsi dilemma article for more evidence of how style of play impacts Corsi).
  3. Players are able to, through talent and/or playing style, influence save and shooting percentages.
  4. Finding trends in shooting/save percentages can be difficult due to small sample size issues but that does not mean they do not exist. Hockey is a complex sport to analyze but being creative in grouping similar players can allow you to pull out valuable information that you otherwise could not.




The Coaching-Corsi dilemma

Feb 24 2015

The other day I wrote about the Bozak-Corsi dilemma which basically goes as follows:

  • The coaching change in Toronto from Carlyle to Horachek resulted in Tyler Bozak and the rest of the Leafs’ top line posting dramatically improved Corsi (5v5 tied CF%). Does this mean Bozak et al. suddenly got good or does it mean that Corsi is largely driven by playing style, which one can change, and thus the value of Corsi in player evaluation is greatly minimized?

Today I will look at the rest of the Leafs’ 5v5 tied CF% from Carlyle to Horachek as well as three other coaching changes that occurred during the 2011-12 season. Those are Bruce Boudreau to Dale Hunter in Washington, Randy Carlyle to Bruce Boudreau in Anaheim and Terry Murray to Darryl Sutter in Los Angeles. Expanding the analysis to more players/teams will determine whether the Bozak-Corsi dilemma can be expanded to a more general Corsi-Coaching dilemma. First, let’s summarize how the team 5v5 tied CF% changed due to these coaching changes.

Coaching Change                 CF% Pre   CF% Post   Difference
Kings – Murray to Sutter          50.7      58.5         7.8
Ducks – Carlyle to Boudreau       42.6      48.7         6.1
Leafs – Carlyle to Horachek       45.2      51.1         5.9
Capitals – Boudreau to Hunter     56.7      48.1        -8.6

The biggest positive impact was with the Kings while the Capitals’ change from Boudreau to Hunter had the biggest negative impact and the biggest change overall. All four coaching changes saw a significant impact on the team’s overall CF%. Let’s look at the teams in the order listed above, starting with the Kings.


Shown here is each player’s CF% under Murray (in blue) and under Sutter (in orange) with the difference shown in the grey bars. Shown are players with at least 50 5v5tied minutes under both coaches. The first four players saw their CF% jump by at least 10% and the next four by at least 7.5% and 11 of the 14 players saw their CF% jump at least 5%. Only Drew Doughty saw his drop but he already had a team best 59.3 CF% under Murray.

Here is the chart for the Ducks.


Every single player saw their CF% jump at least a little after the coaching change. Visnovsky’s and Getzlaf’s CF% jumped at least 10% while Perry, Sbisa and Lydman jumped at least 7.5% and Selanne and Cogliano at least 5%.

Now for the Leafs.


JVR, Bozak, Rielly and Kessel saw at least a 10% boost in their CF% while Polak and Gardiner were at least 8%. No other player saw a jump of more than 3% and Komarov actually had a significant (10.8%) drop off.

Now, for a reversal of fortunes here is the Capitals chart.


For the Capitals every single player saw at least a drop of 3.9% (in fact only Wideman saw a drop off of less than 5%) with the first 6 guys seeing a drop of at least 10% and three more at least 9%.

In total there are 53 players in the charts above; 17 of them saw an absolute change of at least 10% while another 13 saw an absolute change of at least 7.5%. The average change was just shy of 8%. By looking at these four coaching changes it is safe to say that it is not unusual for a coaching change, or a change in playing style, to impact a player’s 5v5 CF% by 10% or more (nearly one third of the players above saw that big of a change). If a normal range for 5v5tied CF% is between 40% and 60% I think it is safe to suggest that half or more of that spread might be due to playing style and not individual talent. Furthermore there are almost certainly different playing styles on a single team (some lines certainly play more defensive roles while others play more offensive roles) so even looking at CorsiRel stats might not factor out all coaching decisions. It certainly appears that Kessel-Bozak-JVR have seen a far more significant boost in their CF% relative to the rest of the team, indicating that they likely changed their playing style the most.
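Here is a minimal sketch of how the per-player comparison and the summary numbers can be put together. The 50-minute cutoff is the one used for the charts; the player names and values are placeholders rather than the real data, and "change" here means the difference in CF% in percentage points, which is how the 10% and 7.5% thresholds above are meant.

```python
MIN_MINUTES = 50  # at least 50 5v5tied minutes under both coaches


def cf_changes(players):
    """CF% change (after minus before) for players who qualify under both coaches.

    players: name -> ((cf_pct_before, minutes_before), (cf_pct_after, minutes_after))
    """
    changes = {}
    for name, (before, after) in players.items():
        cf_before, minutes_before = before
        cf_after, minutes_after = after
        if minutes_before >= MIN_MINUTES and minutes_after >= MIN_MINUTES:
            changes[name] = cf_after - cf_before
    return changes


def summarize(changes):
    """Average absolute change and share of players who moved 10+ points."""
    magnitudes = [abs(delta) for delta in changes.values()]
    return {
        "players": len(magnitudes),
        "mean_abs_change": sum(magnitudes) / len(magnitudes),
        "share_10_plus": sum(m >= 10 for m in magnitudes) / len(magnitudes),
    }


# Placeholder players (illustration only, not the real chart data).
players = {
    "Player A": ((45.0, 120), (56.0, 90)),
    "Player B": ((50.0, 80), (54.0, 75)),
    "Player C": ((48.0, 30), (60.0, 100)),  # dropped: under 50 minutes before
    "Player D": ((55.0, 200), (46.0, 150)),
}

summary = summarize(cf_changes(players))
print(summary)

# The shares quoted above, from the real counts in the text.
share_10_plus = 17 / 53            # players with a 10+ point change
share_7_5_plus = (17 + 13) / 53    # players with a 7.5+ point change
print(f"{share_10_plus:.0%} moved 10+ points, {share_7_5_plus:.0%} moved 7.5+")
```

Using the absolute value matters because the Capitals' changes are negative; averaging signed changes across all four teams would partly cancel out.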

Above I looked at four coaching changes which had an average absolute impact of an 8% change on 5v5tied CF%, with nearly one third of the players having an absolute change of greater than 10%. The majority of NHL players end a season with a 5v5tied CF% of between 40% and 60%. Based on the above analysis it is probably reasonable to believe that at least half of that spread can be attributed to variations in coaching/playing style which means the actual talent spread is probably no more than 45% to 55%, possibly even less.

Furthermore, I have previously shown that FF% (and thus likely CF%) loses predictive ability over longer periods of time at the team level. A significant reason for this is likely the higher number of coaching and roster changes that occur over a 4 or 5 year span. Every coaching change and every time a player changes teams (or even the line they play on) can potentially lead to a playing style change which could impact their CF% significantly. Of course, none of this should really come as much of a surprise as we already know playing style can have a major impact on CF% because we know all about score effects. On average a team’s 5v5 CF% differs by about 10% between when they are leading and when they are trailing. This 10% difference in CF% due solely to playing style dictated by the score lines up fairly well with what we have seen above where a 10% change in CF% due to a coaching change is not abnormal. The Corsi-Coaching Dilemma is real.

What all this means is that we need to consider playing style when we evaluate players because playing style can have a major impact on a player's statistics. In fact, it may be the most important factor in a player's Corsi statistics. This is something that we rarely do in analytics, but failure to do so could result in a very flawed player evaluation. This is something the hockey analytics community really needs to address in future research.


The Bozak-Corsi Dilemma

Feb 22, 2015

(Note: This is a cross post with You can find the original article here. I don’t normally cross post but this is relevant to Hockey Analytics as a whole, not mostly to Maple Leaf fans.)

A significant portion of modern hockey analytics revolves around Corsi (or SAT% as defined by the NHL), which is really nothing more than looking at which team takes more shot attempts. If you can out shoot your opponent, the theory is that it goes a long way to driving success in terms of out scoring your opponent and ultimately winning games. There is a lot of evidence to support the case that Corsi is a major component of on-ice success. While I believe many people put too much weight on Corsi statistics, I do accept that it is a major component of success.
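
For readers new to the stat, the percentage form is simply shot attempts for divided by all shot attempts taken while a player or team is on the ice. A minimal sketch; the function name and the sample counts are illustrative, not real data.

```python
def corsi_for_pct(attempts_for, attempts_against):
    """Return CF% (SAT%): the share of all shot attempts taken by your side."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * attempts_for / total


# Hypothetical on-ice counts: 55 attempts for, 45 against.
print(corsi_for_pct(55, 45))  # 55.0
```

Anything above 50.0 means the team out-attempted its opponent with that player on the ice.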

Over the past few weeks, I have looked at the Leafs’ performance this season under Randy Carlyle and under Peter Horachek. First I looked at how zone start usage has changed from Carlyle to Horachek and the impact of those changes on Corsi. Last week, I looked at a WOWY analysis of Tyler Bozak and David Booth to see if the change in linemates from Carlyle to Horachek accounted for the changes in results. The conclusion from these posts is that a significant portion of the Leafs’ improved Corsi statistics is driven by the Leafs’ top line, and that outside of the top line not a lot has changed with respect to their Corsi statistics. To highlight the improvement in the Leafs’ top line, here are their 5v5tied CF% statistics under Carlyle and under Horachek.

                  Bozak CF%   Kessel CF%   JVR CF%
under Carlyle       38.4        41.0        39.0
under Horachek      53.6        52.0        55.0
Difference          15.2        11.0        16.0

Under Carlyle, the trio of Bozak, Kessel and JVR were pretty close to a league-worst Corsi line, with Bozak being the worst of the three. Under Horachek, they are well above the break even 50.0% line and have put up pretty good Corsi percentages. As far as Corsi is concerned, this trio went from downright awful to well above average. All it took was, I presume, a playing style change demanded by a new coach.
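
The differences in the table can be reproduced with a few lines. This is only a sketch using the six CF% values listed above; the variable names are mine.

```python
# 5v5tied CF% from the table above: (under Carlyle, under Horachek).
cf_pct = {
    "Bozak": (38.4, 53.6),
    "Kessel": (41.0, 52.0),
    "JVR": (39.0, 55.0),
}

# Change for each player, rounded to one decimal like the table.
changes = {
    name: round(after - before, 1)
    for name, (before, after) in cf_pct.items()
}
average_change = round(sum(changes.values()) / len(changes), 1)

print(changes)         # {'Bozak': 15.2, 'Kessel': 11.0, 'JVR': 16.0}
print(average_change)  # 14.1
```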

For several years it has been believed that Corsi is an important tool in evaluating players. It was a major component of what has driven the analytics community to conclude that Bozak is a poor hockey player. The evidence above suggests that a simple playing style change can drive Corsi from downright terrible to pretty good. This leads to a bit of a dilemma within hockey analytics, which I will call the Bozak-Corsi dilemma, with two serious questions that need to be answered:

  1. Is Bozak now a pretty good player?
  2. More importantly, if a player (or a forward line) can dramatically alter their Corsi overnight, seemingly solely through changing playing style (driven by a coaching change), must it be concluded that Corsi is not primarily driven by individual player talent?

The first point will provide some angst within the Leafs fan base, but from my perspective the answer is no because his (goal) WOWYs, Points/60, IPP, etc. are also pretty weak, although maybe he isn’t as bad as previously thought if he plays an optimal playing style.

The second point is critically important, though, because it basically implies that Corsi has significantly less value (maybe little or no value) in individual player evaluation than previously thought, which should send ripples throughout the hockey analytics community. If Corsi is largely driven by playing style, must one conclude it isn’t an individual skill? It isn’t something I’d conclude based on three players, but it definitely makes you think about it more.


Feb 21, 2015

I wasn’t actually planning on writing anything formal about the new enhanced hockey stats on but this post over at Jewels From The Crown was kind of the last straw.

Before getting into that article let me say a few things. Despite the fact that I run a popular hockey stats site, I really wanted to see the NHL do a good job on their advanced hockey stats site. I honestly don’t see them as a competitor, nor do I really care if they are, because I make no money off the site and my interest lies as much in analysis and research as it does in producing and running a website. I also see the site being more geared to the average, more casual user while my site is geared more towards the hard core user and researcher. I love hockey, I love hockey statistics and hockey analytics, and I really would have loved to see the NHL do this right to bring this to a wider audience than I, or any of the other hockey stats sites, ever could. While I still have that hope, my view of this first attempt is that it is a very poor effort that could have gone much better.

So, now, what set me on this bit of a tirade? Well, the post at Jewels From the Crown featured an interview with Chris Foster, NHL Director of Digital Business Development, and Gary Bettman. In it Sheng Peng asked about what ‘exclusive’ stats that offers over other sites such as mine. This was a portion of the answer.

Foster: I’ve got to double check. I’m not sure. There’s zone starts, I think those are completely brand-new. And the level of depth that we’re doing with primary and secondary assists. I don’t think anybody’s going to have that much detail. That first batch—shot attempts and unblocked shot attempts–there’s a lot of that. It’s that second batch of stats—primary assists and penalties drawn over time—those are the ones that will be more unique to the site. They may be out there but not to the level of depth that will be on

Zone starts? Brand-new? Really? Zone starts have been around for years. Primary assists are exclusive to Really? I’ve had them on my site for years too. I even go a step further and look at primary points (goals + primary assists). Penalties drawn has been around elsewhere too. I’ll give the NHL the benefit of the doubt and believe that they are actually oblivious to what else is being done out there, because otherwise they are outright misleading and belittling the hard work that I and many others have done previously. Looking at their enhanced statistics site it is clear that they haven’t really put much thought into this whole project or reached out to the analytics community, because there are tons of things that I think many would suggest they do differently. Here are a few examples:

  1. SAT and USAT are short for Shot ATtempts and Unblocked Shot ATtempts, otherwise known as Corsi and Fenwick respectively. I am OK with the name change, but for the NHL’s target audience it is absolutely unnecessary to use both. Even I, as an analytics person, at times wonder why we have both Fenwick and Corsi. They are extremely highly correlated and the benefits of one over the other are generally very minimal. For the casual user it is completely unnecessary to burden them with these two separate stats. It would have been far better to simply use shot attempts (Corsi) and leave out the unblocked variety. Shot attempts are simple, straightforward, and easy to understand.
  2. The Skater Shooting/Time on Ice stats have both /20 and /60 statistics, which is redundant and pointless. One is just 3 times the other. Why one would see the need to present both side by side on the same page is beyond me. Let’s present a stat. Ok, now let’s multiply it by three and present that too. Who thinks like that? Really? Who? Furthermore, when I started my site I used /20 stats because I figured a good player plays about 20 minutes per game. Other sites used /60 because a game is 60 minutes long and it tells how that caliber of player would produce in a full game. Both have merit but for the purpose of consistency across I have converted all my stats to /60. Had they reached out to me I’d have told them this and they may very well have done the smart thing and just presented /60 stats.
  3. Having a stats site and not being able to filter based on games played or time on ice is practically useless. When I sort by SAT% I want to see who the best players are who play regular or semi-regular shifts. Instead the top of the list is dominated by AHL call-ups with one or two games that nobody has heard of and nobody cares about. Why do this?
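
To illustrate points 2 and 3, here is a minimal sketch of a rate-stat conversion and a minimum ice time filter. The function names, the dictionary keys and the sample players are all made up for illustration; this is not how any particular site implements it.

```python
def per_60(count, toi_minutes):
    """Scale a raw count to a rate per 60 minutes of ice time."""
    return 60.0 * count / toi_minutes


def per_20(count, toi_minutes):
    """The same rate per 20 minutes; always one third of the /60 value."""
    return 20.0 * count / toi_minutes


def leaders(players, min_toi):
    """Sort by SAT% (descending) after dropping low ice time players."""
    regulars = [p for p in players if p["toi"] >= min_toi]
    return sorted(regulars, key=lambda p: p["sat_pct"], reverse=True)


# Made-up players: one call-up with a tiny sample, two regulars.
players = [
    {"name": "Call-up", "toi": 12.0, "sat_pct": 71.0},
    {"name": "Regular A", "toi": 900.0, "sat_pct": 54.5},
    {"name": "Regular B", "toi": 850.0, "sat_pct": 56.2},
]

print([p["name"] for p in leaders(players, min_toi=200.0)])
print(per_60(30, 900.0))  # 2.0
```

Without the `min_toi` cut the call-up's 71.0% would sit at the top of the list, which is exactly the problem described in point 3.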

There are numerous other smaller mistakes as well (see Eric Tulsky’s twitter time line for a few of them). It’s a shame really, because I was hoping for, and expecting, a whole lot better. I applaud the NHL for hopping on the ‘enhanced statistics’ bandwagon, but what they released today screams of a poorly thought out beta release of a product developed by a group of amateurs, a long way from the major new product release from a multi-billion dollar organization (NHL) backed up by another multi-billion dollar organization (SAP) that they promoted it as being.

I really do hope that the NHL gets their act together and makes it work, as I think it will be good for everyone: the NHL, the casual fan, and those in the hockey analytics community. We all benefit when the NHL does things well. We are, at the core, all hockey fans. My hope with this post is that it inspires the NHL to spend more time reaching out to the people that have been doing this for years. We have years of experience, knowledge and expertise that would have helped avoid many of the basic and senseless missteps we see today. If you are with the NHL and are reading this, I want you to know that I am more than willing to share my experience and I am certain most everyone in the hockey analytics community would as well. You just have to ask. My e-mail is

Stat Site Upgrades

Feb 02, 2015

Some of these have been announced on twitter but I have recently made some upgrades to and Here is a list of the upgrades.

New Situations:

  • Home and Road for 5v5 Tied, 5v5 Close, 5v5 Leading and 5v5 Trailing
  • 4v4
  • All Situations (all play)
  • All Power play (includes 5v4, 5v3, 4v3)
  • All Short handed (includes 4v5, 3v5, 3v4)

Multi-year stats with current season

  • 2013-15 (2yr), 2012-15 (3yr) and 2011-15 (4yr) stats have been added
  • Multi-year stats up to 2007-15 (8yr) will be added in the off season – too much data to update nightly.

 WOWY Zone Starts ( only)

  • WOWYs now include OZFO%, DZFO%, NZFO% and OZone% (tweaked UI a bit from initial release yesterday)
  • WOWYs are also now /60 instead of /20 as they were previously (now consistent with rest of site and
  • “Against You” stats now available for current season (currently only opponents with >15min TOI against but this will drop to 5 min. after update tonight)

Percent of Team Zone Starts ( only)

  • Now available is the percent of a team's offensive/defensive/neutral zone starts (counting only games the player played in) that the player was on the ice for.
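
As a rough sketch of how that stat can be computed (the function name, zone labels and counts below are hypothetical, not the site's actual code or data):

```python
def pct_of_team_zone_starts(player_starts, team_starts):
    """Percent of the team's zone starts, by zone, with the player on the ice.

    team_starts should count only the games the player appeared in.
    Zones with no team starts are skipped to avoid dividing by zero.
    """
    return {
        zone: 100.0 * player_starts[zone] / team_starts[zone]
        for zone in team_starts
        if team_starts[zone] > 0
    }


# Hypothetical faceoff counts for one player and his team.
player = {"offensive": 120, "defensive": 90, "neutral": 150}
team = {"offensive": 400, "defensive": 450, "neutral": 500}

print(pct_of_team_zone_starts(player, team))
# {'offensive': 30.0, 'defensive': 20.0, 'neutral': 30.0}
```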

Various bug and data fixes

  • Fixed issue with Percent of Team stats for special team situations
  • Manually fixed a bunch of errors in shift tables over past 4 seasons (should improve reliability of data)
  • Probably some others I have forgotten about


That’s all for now. As usual, if you find any problems or have any more requests for enhancements let me know.