David Johnson

Jul 01 2014
 

The other day I commented on twitter that I would be happy if the Leafs signed defenseman Mike Weaver because I think he is the kind of defensive defenseman the Leafs could really use. I have thought of Mike Weaver as a premier defensive defenseman for quite some time now. I always seem to get a little flak over it but that’s fine, I can handle it. For example, in response to my Weaver comment on twitter, Eric Tulsky thought it would be prudent to point out a “flaw” in my thought process.

 

And of course Tyler Dellow, who never passes up an opportunity to take a jab at me (or anyone he disagrees with), took the opportunity to re-tweet it.

Now, of course I had thought of responding with a tweet to the effect of “Florida’s save percentage was probably a bit of a factor in that regression” but I didn’t want to get into a twitter debate at that moment and I was confident I could come up with more concrete evidence. So here is that evidence.

SavePercentageWeaverOnOffIce

The above chart shows the save percentage of Weaver’s team when Weaver is on the ice vs when Weaver is not on the ice, including only games in which Weaver played (i.e. it is better than just using team save percentage for that season and also allows us to combine his time in Florida and Montreal last season). As you can see, there has only been one season in the last 7 in which his team had a worse save percentage when he was on the ice than when he was not. That is reasonably compelling evidence. It’s difficult to say what happened that season but his main defense partners were a young Dmitry Kulikov and Keaton Ellerby so maybe that was a factor. An investigation of Kulikov’s and Ellerby’s impact on save percentage over the years may help us identify why Weaver slipped that year. It could have been a nagging injury as well. Or, it could just be randomness associated with save percentage.
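The on/off comparison in the chart boils down to two save percentages computed over the same set of games. Here is a minimal sketch of that calculation, assuming you already have shots against and goals against split by whether the player was on the ice. The function names and the sample totals are made up for illustration; they are not Weaver's actual numbers.

```python
def save_pct(shots_against: int, goals_against: int) -> float:
    """Save percentage as a fraction: saves divided by shots against."""
    if shots_against <= 0:
        raise ValueError("need at least one shot against")
    return (shots_against - goals_against) / shots_against


def on_off_save_pct(on_shots, on_goals, off_shots, off_goals):
    """Return (on-ice sv%, off-ice sv%, difference) for one player-season.

    The off-ice totals should cover only games the player dressed for.
    """
    on = save_pct(on_shots, on_goals)
    off = save_pct(off_shots, off_goals)
    return on, off, on - off


# Hypothetical totals, for illustration only.
on, off, diff = on_off_save_pct(on_shots=400, on_goals=28,
                                off_shots=1200, off_goals=108)
print(f"on-ice {on:.3f}, off-ice {off:.3f}, difference {diff:+.3f}")
```

A positive difference means the team's goalies stopped a larger share of shots with the player on the ice than without him.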

Regardless of the “reason” for the slide in 2011-12, it is pretty difficult to argue that there has been significant “regression” over the past 3 seasons, as Tulsky and Dellow so eagerly wanted to point out, because in the past 2 seasons Weaver has seemingly had a significant positive impact on his team’s save percentage. Since I made that statement there has been one season of “regression”, so to speak, and two seasons in support of my claim. I guess that means it is 2-1 in my favour. It continues to appear that Weaver is a good defenseman who can suppress shot quality against.

Another defenseman I have identified as possibly being able to suppress opposition shooting percentage is Bryce Salvador. Here is Salvador’s on/off save percentage chart similar to Weaver’s above (2010-11 is missing as Salvador missed the season due to injury).

SavePercentageSalvadorOnOffIce

Salvador’s on-ice save percentage has been better than the team’s save percentage every year since 2007-08. Regression? Doesn’t seem to be.

To summarize, there are a lot of instances where, if we simply correlate stats from one year to the next or compare future performance to past performance, we see the appearance of regression. The raw stats do in fact regress. That doesn’t necessarily mean the talent doesn’t exist, just that we haven’t been able to properly isolate it. The talent of the individual player is only a small factor in what outcomes occur when he is on the ice (a single player is just one of 12 players on the ice during typical even strength play) so it is difficult to identify without attempting to account for the other factors (quality of teammates in particular).

Possession and shot generation/suppression are important, but ignore the percentages at your peril. They can matter a lot in player evaluation.

 

Jun 23 2014
 

More often than not the first thing I look at when I want to evaluate a player is their WOWY stats to see if the player boosts the performance of his teammates or suppresses it when he is on the ice. Let’s take a look at a WOWY comparison of Umberger and Hartnell starting with some links to their WOWY pages.

When on any of those pages you can click “Visualize this table” to get some charts that I find are often a quick way of getting an overview of the player in question. For example, here is a CF% WOWY chart for Hartnell from last year.

Hartnell-CF-WOWY-2013-14

In these charts it is better to have bubbles below and to the right of the one-to-one diagonal line that runs from the lower left to the upper right. For Hartnell in 2013-14 every single teammate was to the lower right of this diagonal line which is really good. Not a lot of players have charts this nice. If you go back and look at previous years you will see that Hartnell has accomplished this relatively consistently. This is a good thing. Now let’s take a look at Umberger’s.

Umberger-CF-WOWY-2013-14

That is a much less impressive chart as the majority of Umberger’s teammates have performed better when not playing with him. This is not good and yet it is fairly typical for Umberger to have WOWY charts that look like this.

This is a table of how Umberger’s line mates performed with and without Umberger last season. Listed are all forwards who played at least 100 minutes of 5v5 ice time with Umberger.

Line mate | With Umberger | Without Umberger
Ryan Johansen | 50.2% | 50.8%
Nick Foligno | 50.4% | 52.0%
Artem Anisimov | 40.1% | 53.3%
Blake Comeau | 46.1% | 54.6%
Mark Letestu | 42.8% | 52.1%

And now for Hartnell’s line mates who played at least 100 minutes with Hartnell last year.

Line mate | With Hartnell | Without Hartnell
Claude Giroux | 55.7% | 49.5%
Jakub Voracek | 57.1% | 52.5%
Brayden Schenn | 51.9% | 46.3%
Wayne Simmonds | 53.9% | 46.3%

Again, you can go back to previous seasons and the general trend for the two players is pretty much the same. Players perform worse when playing with Umberger than when not and players perform better when playing with Hartnell than when not.
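To put a number on the two tables above, here is a small sketch that computes the with-minus-without difference for each line mate. The CF% values are copied from the tables; the function names are just for illustration, and the average is unweighted, so it ignores how much ice time each pairing actually had.

```python
# CF% (with, without) for each line mate, copied from the two tables above.
umberger = {
    "Ryan Johansen": (50.2, 50.8),
    "Nick Foligno": (50.4, 52.0),
    "Artem Anisimov": (40.1, 53.3),
    "Blake Comeau": (46.1, 54.6),
    "Mark Letestu": (42.8, 52.1),
}
hartnell = {
    "Claude Giroux": (55.7, 49.5),
    "Jakub Voracek": (57.1, 52.5),
    "Brayden Schenn": (51.9, 46.3),
    "Wayne Simmonds": (53.9, 46.3),
}


def wowy_deltas(table):
    """Map each line mate to (with - without), in CF% points."""
    return {name: round(w - wo, 1) for name, (w, wo) in table.items()}


def mean_delta(table):
    """Unweighted average of the with-minus-without differences."""
    deltas = wowy_deltas(table).values()
    return sum(deltas) / len(deltas)


for label, table in (("Umberger", umberger), ("Hartnell", hartnell)):
    print(label, wowy_deltas(table), f"mean {mean_delta(table):+.1f}")
```

Every Umberger difference comes out negative and every Hartnell difference comes out positive, which is the same story the bubble charts tell.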

From a WOWY perspective, Umberger is a below average player and Hartnell is an above average player. In fact there aren’t many players that have WOWY charts that look better than Hartnell’s except for the true star players (such as Toews, or Bergeron, or Kopitar, etc.).  Hartnell in my opinion is easily a top 6 player. Umberger I am not sure I’d really want on my team in any significant role. With this trade the Blue Jackets get better in two ways. First by adding a good player in Hartnell and second by subtracting a poor player in Umberger (classic case of addition by subtraction).

 

Jun 19 2014
 

My intention is to add primary point totals to stats.hockeyanalysis.com sometime this summer but I have calculated them over the past 4 seasons during 5v5close play and thought I’d present a more complete leader board here (I mentioned the top 5 on twitter already).

Rank Player Name PPts/60
1 CROSBY, SIDNEY 2.87
2 STAMKOS, STEVEN 2.17
3 MALKIN, EVGENI 2.11
4 TOEWS, JONATHAN 2.04
5 KADRI, NAZEM 1.98
6 SKINNER, JEFF 1.96
7 TAVARES, JOHN 1.95
8 VANEK, THOMAS 1.95
9 PERRY, COREY 1.93
10 GIROUX, CLAUDE 1.91
11 KUNITZ, CHRIS 1.89
12 SEDIN, DANIEL 1.86
13 KESSEL, PHIL 1.85
14 RYAN, BOBBY 1.84
15 KANE, PATRICK 1.83
16 SEMIN, ALEXANDER 1.81
17 EBERLE, JORDAN 1.81
18 PACIORETTY, MAX 1.81
19 WHEELER, BLAKE 1.80
20 HALL, TAYLOR 1.78
21 KOPITAR, ANZE 1.77
22 BENN, JAMIE 1.76
23 COUTURE, LOGAN 1.75
24 SELANNE, TEEMU 1.75
25 SEGUIN, TYLER 1.73
26 POMINVILLE, JASON 1.73
27 KANE, EVANDER 1.72
28 CARTER, JEFF 1.71
29 SHARP, PATRICK 1.71
30 PERREAULT, MATHIEU 1.69
31 PAVELSKI, JOE 1.69
32 DATSYUK, PAVEL 1.69
33 FRANZEN, JOHAN 1.68
34 LADD, ANDREW 1.68
35 NASH, RICK 1.68
36 LUPUL, JOFFREY 1.68
37 TANGUAY, ALEX 1.67
38 PERRON, DAVID 1.66
39 DUPUIS, PASCAL 1.66
40 DUCHENE, MATT 1.65
41 STEEN, ALEXANDER 1.65
42 ST._LOUIS, MARTIN 1.65
43 WILSON, COLIN 1.63
44 KREJCI, DAVID 1.63
45 IGINLA, JAROME 1.62
46 JAGR, JAROMIR 1.61
47 WHITNEY, RAY 1.61
48 GRABNER, MICHAEL 1.60
49 MACARTHUR, CLARKE 1.60
50 HORTON, NATHAN 1.59

It is amazing how far ahead of everyone Crosby is. He is in a league of his own offensively. Most of the names on here you’d expect but it is surprising to see Kadri that high, as well as Perreault at #30, who the Ducks picked up pretty cheaply from the Capitals last September.

Primary points are goals and first assists (secondary assists are not included).

5v5close play is 5v5 play when teams are within 1 goal of each other in first or second period or tied in the third period.
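The two definitions above translate directly into code. This is a minimal sketch, assuming events have already been filtered to 5v5 play; the function names are made up, and treating overtime like the third period is an assumption since the definition does not mention it.

```python
def is_5v5_close(period: int, score_diff: int) -> bool:
    """True if the game state counts as "close" under the definition above.

    period: 1, 2 or 3 (anything later is treated like the third period,
            which is an assumption, not part of the stated definition).
    score_diff: goals for minus goals against at that moment.
    """
    if period <= 2:
        return abs(score_diff) <= 1
    return score_diff == 0


def primary_points_per_60(goals: int, first_assists: int, minutes: float) -> float:
    """Goals plus first assists, scaled to 60 minutes of ice time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return (goals + first_assists) * 60.0 / minutes


# Hypothetical player: 20 goals and 15 first assists in 1000 minutes.
print(f"{primary_points_per_60(20, 15, 1000):.2f} PPts/60")
```

Secondary assists never enter the calculation, so a player who piles up second assists will rank lower here than in a regular points-per-60 list.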

 

Jun 12 2014
 

The rumour is out there that Sunny Mehta has been hired as Director of Hockey Analytics of the New Jersey Devils (if true, a big congrats to Sunny). This sparked some twitter discussion about the Devils and analytics and Devils defensemen including Bryce Salvador.

I have been a bit of a fan of Salvador, at least statistically, though clearly there are a lot of Devils fans that do not like him and I think it is because of a focus on corsi. One person tweeted me an image of Salvador’s corsi rel % suggesting it was “pretty ugly”. That may be true, but the game isn’t about corsi, it is about goals. Here is what I know about Salvador. In 5v5close situations he led the Devils defensemen in on-ice save percentage last season, the season before, and the season before that. He missed 2010-11 due to injury but in 2009-10 he was second best trailing only Andy Greene, his regular defense partner. Either he is extremely lucky (every year) or he is doing something right.

Let’s look at this a different way. Over the past 3 seasons Bryce Salvador has had the third best 5v5close save percentage in the league when he is on the ice despite the Devils ranking 23rd in team save percentage. The two players ahead of him play for Boston (Dougie Hamilton) and Los Angeles (Willie Mitchell), teams with significantly better goaltending (3rd and 8th best 5v5close save percentages over the past 3 seasons).

In February 2012 I wrote an article attempting to quantify a defender’s effect on save percentage and in it I identified Salvador as one of the best defensemen at boosting his team’s save percentage. In the 2 seasons since he has done nothing but support that claim.

So, what does this all mean? Well, it takes a player who had a team-worst 15.9 CA/20 in 5v5close situations this past season to a team-best 0.49 GA/20. Over the past 3 seasons only Dougie Hamilton (Boston), Willie Mitchell (Los Angeles) and Alec Martinez (Los Angeles) have seen goals scored against them at a lower rate than Bryce Salvador.

I know the majority of people are on the corsi bandwagon these days and some will dismiss any argument that runs counter to it but I think the evidence is clearly on Salvador’s side here. All the evidence suggests he is really good at suppressing opposition shot quality and in turn suppressing the number of goals scored against the Devils. If I were the new Director of Hockey Analytics for the Devils I wouldn’t be recommending getting rid of Salvador.

 

Jun 06 2014
 

I am in the process of planning off season upgrades to stats.hockeyanalysis.com and I am seeking your input as I know a number of you have made suggestions/requests in the past (some of which I haven’t kept track of unfortunately). Here are some of my planned upgrades.

  • Generally I’d like to add more charts and graphs to complement the stat tables that currently dominate the site, especially on the player and team summary pages. If you have any thoughts/examples on how best to visualize the data let me know.
  • I am likely to add new situations such as 4v4, 5v5close home/road splits, all situations, and maybe 5v5 by period. Any others you would like to see?
  • Clean up of the player pages adding charts and graphs and more summary statistics.
  • Addition of team pages.
  • I have some new usage statistics that I want to add such as ratio of ice time leading vs ice time trailing.
  • I may consider removing the HARO/HARD/HART ratings because they take up a lot of space, are time consuming to calculate, and are not well used. I may replace them with something similar if I have some time to do some research. Would anyone object if I removed them completely?
  • WOWY usage statistics would also be cool to do to see if player usage changes with and apart from other players.

My ultimate goal, which will require a fairly significant overhaul of my code and database structure, is to add the ability to calculate statistics, including WOWY’s, for any specified period of time. This is non-trivial and potentially very time consuming though so no guarantees here but this is one of the more common requests that I get.

I may also look into 3-player “with-you” stats as well because I know there is  interest in seeing how a complete forward line performs together, not just pairs of players.

I also have some ideas on some research projects I want to do this summer so everything is time permitting but I really hope to have some nice upgrades for next season. If you have any other suggestions or requests please add them in the comments.

 

 

May 11 2014
 

I often feel that I am the sole defender of goal based hockey analytics in a world dominated by shot attempt (corsi) based analytics. In recent weeks I have often heard the pro-corsi crowd cite example after example of where corsi-based analytics “got it right” or “predicted something fairly well”. While it is always good to be able to cite examples where you got things right, a fair and honest evaluation looks at the complete picture, not just the good outcomes. Otherwise it is analytics by anecdotes which is an oxymoron if there ever was one.

For example, Kent Wilson of FlamesNation.ca recently wrote about the “Dawning of the Age of Fancy Stats” in which he cited several instances of where hockey analytics got it right or did well in predicting outcomes.

The big test case which seems to have moved the needle in favour of the nerds is, of course, the Toronto Maple Leafs. Toronto came into the season with inflated expectations after an outburst of percentages during the lock-out shortened year saw them break into the post-season. Their awful underlying numbers caused the stats oriented amongst us to be far more circumspect about their chances, of course.

Toronto is the recent example that the hockey analytics crowd likes to bring up in support of their case but it is just one example. We don’t hear much about how many predicted the Ottawa Senators to be in the playoffs and some even had them challenging for the top spot in the eastern conference. We don’t hear much about how the New Jersey Devils missed the playoffs yet again despite having the 5th best 5v5close Fenwick% in the league, the year after missing the playoffs with the 3rd best 5v5close Fenwick% in the league. If we are truly interested in hockey analytics we need a complete and unbiased assessment of all outcomes, not just the ones that support our underlying belief.

In the same article Kent Wilson quoted a tweet from Dimitri Filipovic about the success of Corsi in predicting outcomes of playoff series.

Relevant #fact: since ’08 playoffs, teams that were 5+ % better than their opponent in 5v5 fenwick close during the regular season are 25-7.

While interesting, it really doesn’t tell us a whole lot more than “when one team is significantly better at outshooting their opponents they more often than not win”. Well, that really isn’t saying a whole lot. It is more or less saying that when a dominant team plays a mediocre team, the dominant team usually wins. Not really that interesting when you think of it that way.

Here is another fact that puts that into perspective. Since the 2008 playoffs, the team with the better 5v5close Fenwick% has a 53-35-2 record (there were 2 cases where teams had identical Fenwick% to 1 decimal place). That actually makes it sound like 5v5close Fenwick% is predictive overall, not just in cases where one team is significantly better than another. Of course, if we look at goals we find that the team with the better 5v5close goal% has a 54-35-1 record. In other words, 5v5close possession stats did no better at predicting playoff outcomes than 5v5close goal stats. It is easy to throw out stats that support a point of view, but it is far more important to look at the complete picture. That is what analytics is about.

A similar statistic was promoted by Michael Parkatti in a recent talk on hockey analytics at the University of Alberta. In that talk Parkatti stated that of the last 15 Stanley Cup winners all but 3 had a “ShotShare” (all situations) of at least 53%. The exceptions were Pittsburgh in 2009, Boston in 2011 and Carolina in 2006. I will note that it appears that all three of these teams are below 51% and 2009 Penguins were below 50%. That seems sort of impressive but I did some digging myself and found that every Stanley Cup winner since 1980 had a “GoalShare” (all situations) greater than 52%. Every single one. No exceptions. I didn’t look at any cup winners pre-1980 but the trend may very well go back a lot further. As impressive as 12 of 15 is, 34 of 34 is far more impressive.

Here is the thing. We know that goal percentage correlates with winning far better than corsi percentage. This is an indisputable fact. It is actually quite a bit better. The sole reason we use corsi is that goals are infrequent events and thus not necessarily indicative of true talent due to small sample size issues. This is a fair argument and one that I accept. In situations where you have small sample sizes definitely use corsi as your predictive metric (but understand its limitations). The question that needs to be answered is what constitutes a small sample size and, more importantly, what sample size we need such that goals become as good a predictor of future events as corsi, or a better one. I have pegged this crossing point at about 1 season’s worth of data, maybe a bit more if looking at individual players who may not be getting 20 minutes of ice time a game (my guess is that around 750 minutes of ice time is where I’d start to get more comfortable using goal data than corsi data). I am certain not everyone agrees but I haven’t seen a lot of analyses attempting to find this “crossing point”.

Let’s take another look at how well 5v5close Fenwick% and Goal% predict playoff outcomes, but this time let’s look by season rather than overall.

Season | FF% | GF%
2008 | 7-7-1 | 6-9
2009 | 9-6 | 11-4
2010 | 9-6 | 11-4
2011 | 10-5 | 11-4
2012 | 7-7-1 | 7-7-1
2013 | 11-4 | 8-7
Total | 53-35-2 | 54-35-1

In full seasons not affected by lockouts we find that GF% was generally the better predictor (only in 2008 did GF% underperform FF%) but in last year’s lockout shortened season FF% significantly outperformed GF%. Was this a coincidence or is it evidence that 48 games is not a large enough sample size to rely on GF% more than FF% but 82 games probably is?
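The totals in the table above are easy to verify, and the same few lines let you drop the lockout-shortened 2013 season to see how the comparison changes. This is a small sketch using the season records exactly as listed; the helper names are made up for illustration.

```python
# (wins, losses, ties) for the team with the better regular-season number,
# copied from the season-by-season table above.
ff = {2008: (7, 7, 1), 2009: (9, 6, 0), 2010: (9, 6, 0),
      2011: (10, 5, 0), 2012: (7, 7, 1), 2013: (11, 4, 0)}
gf = {2008: (6, 9, 0), 2009: (11, 4, 0), 2010: (11, 4, 0),
      2011: (11, 4, 0), 2012: (7, 7, 1), 2013: (8, 7, 0)}


def total(records):
    """Sum wins, losses and ties across seasons."""
    return tuple(sum(col) for col in zip(*records.values()))


def win_pct(record):
    """Share of decided series won (ties excluded)."""
    wins, losses, _ = record
    return wins / (wins + losses)


print("FF%", total(ff), f"{win_pct(total(ff)):.3f}")
print("GF%", total(gf), f"{win_pct(total(gf)):.3f}")

# Same comparison with the lockout-shortened 2013 season left out.
ff_full = {year: rec for year, rec in ff.items() if year != 2013}
gf_full = {year: rec for year, rec in gf.items() if year != 2013}
print("FF% excl. 2013", total(ff_full))
print("GF% excl. 2013", total(gf_full))
```

With 2013 excluded the records are 42-31-2 for FF% and 46-28-1 for GF%, which is where the "GF% was generally the better predictor in full seasons" reading comes from.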

I have seen numerous other examples in recent weeks where “analytics” supporters have used what amounts to not much more than anecdotal evidence to support their claims. This is not analytics. Analytics is a fair, unbiased and complete fact based assessment of reality. Showing why a technique is a good predictor some of the time is not enough. You need to show why it is overall a better predictor all of the time or at least define when it is and when it isn’t.

I recently wrote an article on whether last year’s statistics predicted this year’s playoff teams and found that GF% seemed to do at least as well as CF% despite last season being a lock-out shortened year.

With all that said, you will frequently find me using “possession” statistics so I certainly don’t think they are useless. It is just my opinion that puck possession is just one aspect of the game and puck possession analytics has largely been oversold when it comes to how useful it is as a predictor. Conversely goal based analytics has been largely given a bad rap which I find a little unfortunate.

(Another article worth reading is Matt Rudnitsky’s MONEYPUCK: Why Most People Need To Shut Up About ‘Advanced Stats’ In The NHL.)

 

Apr 29 2014
 

It seems every time a new hockey person gets hired these days they will get asked “do you believe in hockey analytics?” It started with Trevor Linden in Vancouver. Then Brendan Shanahan in Toronto. And today Brad Treliving in Calgary. Nichols on hockey has a good rundown on both Treliving’s and Burke’s response to the question today so go give it a read.

As we all know, Brian Burke is an analytics skeptic to say the least. A popular Brian Burke quote is the following:

“Let’s get the record straight on that too. The first analytics systems I see that’ll help us win, I’ll buy it. I’ll pay cash so that no one else can use it. I’m not a dinosaur on that.”

What Burke gets wrong here is that analytics is not a “system” you can buy but rather a thought process and a way of doing business. Walmart is famous for using analytics to maximize the profits of their retail operation by knowing their customers’ buying habits: what their customers will buy, how much they will buy and when they will buy it, based on everything from the weather to the economy. Analytics is a huge part of their success. That said, there is no analytics “system” that another retailer can purchase off the shelf that will allow them to do the same. It isn’t a system that makes Walmart so successful; it is the way analytics permeates their entire operation. Every retail operation has a different customer base. Every retail operation has a different set of products they sell. Every retail operation has a different cost structure. There is no single “system” that can be applied that will guarantee retail success. That doesn’t mean that every retail operation can’t benefit from analytics, because analytics is a way of doing business. It is the mindset of wanting to know as much as you can and applying unbiased analytical techniques to that knowledge to drive decision making. It is the mindset of wanting to know as much about your customers’ buying habits as you possibly can. It is the mindset of wanting to know what your customers will want to buy and when and why. It is a mindset of knowing how many employees you need on staff at a given time to maximize sales and profits. It is about wanting to know how long a line up customers will tolerate before they leave and make a purchase elsewhere. Analytics is a way of thinking that permeates your organization; it is not a “system” that you can buy and apply.

I don’t know the extent that NHL teams are using hockey analytics but I get the feeling that there are very few that are doing so in a real serious way. Being a highly analytical person I may be biased but to me an NHL team that truly adopts hockey analytics would see the idea of analytics permeate throughout the organization. Analytics should be an important driver of coaching tactics and decisions. It should be an important driver of scouting and player evaluation. It should be an important driver of team building. It should be an important driver of maximizing salary cap commitments. It also should not be one-directional as I firmly believe hockey analytics can benefit significantly from the hockey knowledge of players, coaches, general managers and scouts to improve and test analytical techniques. I have my doubts that there are many NHL organizations that have truly adopted hockey analytics when defined in that way. Some may be dabbling, few are truly adopting.

Interestingly though, I suspect there isn’t one NHL organization that doesn’t use analytics in a significant way on the business side of the organization to do everything from setting ticket, beer and hot dog prices, to setting advertising rates to evaluating their sales staff effectiveness. I am certain analytics permeates through the business side of an NHL organization in a significant way so it is kind of surprising there is any resistance to it on the hockey side.

 

Apr 12 2014
 

As of last night’s games all 16 playoff teams have been determined. Before I get into any playoff predictions, let’s take a look at how last season’s 5v5close statistics did at predicting who would make the playoffs this season.

CFPctGFPctPlayoffPredictor

The above table shows last year’s 5v5close GF% and CF% and the teams in red are this year’s playoff teams.

There were 15 teams with a CF% above 50% last year, 9 of them made the playoffs this year while 6 missed. Of the remaining 15 teams that had sub 50% CF% last year, 7 of them made the playoffs this year while 8 missed. Seven of the top 10 CF% teams last year made the playoffs while 5 of the bottom 10 teams made the playoffs and 5 missed.

There were 18 teams last year with at least 50% GF% and 11 made the playoffs this season while 7 missed. Of the 12 teams that failed to reach 50% GF% last season, 5 made the playoffs and 7 missed. Seven of last year’s top 10 GF% teams made the playoffs this season while 7 of the bottom 10 missed the playoffs.

Difficult to say one was significantly better than the other. Truth is, neither was particularly good but with 7 of the bottom 9 GF% teams last year missing the playoffs this year that might be enough to give GF% a slight edge. That said, the better predictor might have been last season’s point totals.
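One way to compare the two metrics is to turn the counts above into playoff rates for the teams above and below 50% and look at the gap between them. This is a rough sketch using only the made-the-playoffs counts and group sizes quoted above; the "spread" measure is just an illustrative choice, not a formal test.

```python
# (teams that made this season's playoffs, teams in the group),
# using the counts quoted above for last season's 5v5close numbers.
groups = {
    "CF% above 50%": (9, 15),
    "CF% below 50%": (7, 15),
    "GF% at least 50%": (11, 18),
    "GF% below 50%": (5, 12),
}

for label, (made, size) in groups.items():
    print(f"{label}: {made}/{size} = {made / size:.1%}")


def spread(above, below):
    """Gap in playoff rate between the better and the worse group."""
    return above[0] / above[1] - below[0] / below[1]


cf_spread = spread(groups["CF% above 50%"], groups["CF% below 50%"])
gf_spread = spread(groups["GF% at least 50%"], groups["GF% below 50%"])
print(f"CF% spread {cf_spread:.1%}, GF% spread {gf_spread:.1%}")
```

The GF% gap is somewhat larger than the CF% gap, but with only 30 teams the difference is small enough that "slight edge" is about as strong a statement as the data supports.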

PtTotalsPlayoffPredictor

 

Apr 08 2014
 

The past few weeks while I have been shifting my website from one web host to another in an attempt to fight off the DDoS attacks I started thinking about how big my stats.hockeyanalysis.com database actually is. I was thinking about it because of how long it takes to upload the data to a new web host and how long it takes to set up the database again.

So, how many data points do I have in my database? A lot. A data point is any single piece of data like the Leafs’ 2008-09 CF% or Jarome Iginla’s 2007-13 (6yr) individual Goals/60 or Jack Johnson’s CF% while playing with Drew Doughty during the 2008-09 season. Each of those is a single data point.

Here is a summary of all the data point totals by table type.

Database Table Type | Total Records | Data points/record | Total Data points
Individual+OnIce Stats | 595726 | 123 | 73274298
WOWY | 3983667 | 54 | 215118018
“Against You” | 10856454 | 38 | 412545252
Team Data | 660 | 28 | 18480
Total | | | 700956048

So yes, there are just over 700 million data points in my database, not including things like player names, player positions, player’s team, etc. Once I add in all the multi-year data that includes this current season I estimate there will be over 900 million data points.
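The totals in the summary table are just records multiplied by data points per record, which a few lines of code can reproduce. The record counts and per-record figures are copied from the table; the variable names are made up for illustration.

```python
# (records, data points per record) from the summary table above.
tables = {
    "Individual+OnIce Stats": (595_726, 123),
    "WOWY": (3_983_667, 54),
    "Against You": (10_856_454, 38),
    "Team Data": (660, 28),
}

# Data points per table type, then the grand total across all of them.
points = {name: rows * per_row for name, (rows, per_row) in tables.items()}
grand_total = sum(points.values())

for name, count in points.items():
    print(f"{name}: {count:,} ({count / grand_total:.1%})")
print(f"Total: {grand_total:,}")
```

The "Against You" and WOWY tables account for the vast majority of the total, which is why pair-level data dominates both the upload time and the database size.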

The majority, though not all (I’d estimate 70-80%), of these data points are accessible to you if you conduct the right searches. Which one of you is going to be the first to count them all?

Now, if I actually uploaded all the data I can generate (specifically WOWY and Against You data when players have played fewer than 5 minutes with/against each other) the number of data points would rise dramatically, probably to several billion. This is why I don’t upload that data.

 

Apr 01 2014
 

Last week Tyler Dellow had a post titled “Two Graphs and 480 Words That Will Convince You On Corsi%” in which, you can say, I was less than convinced (read the comments). This post is my rebuttal that will attempt to convince you on the importance of Sh% in player evaluation.

The problem with shooting percentage is that it suffers from small sample size issues. Over small sample sizes it often gets dominated by randomness (I prefer the term randomness to luck) but the question I have always had is, if we remove randomness from the equation, how important of a skill is shooting percentage? To attempt to answer this I will look at the variance in on-ice shooting percentages among forwards as we increase the sample size from a single season (minimum 500 minutes ice time) to 6 seasons (minimum 3000 minutes ice time). As the sample size increases we would expect the variance due to randomness to decrease. This means, when the observed variance stops decreasing (or significantly slows the rate of decrease) as sample size increases we know we are approaching the point where any variance is actually variance in true talent and not small sample size randomness. So, without going on any further I present you my first chart of on-ice shooting percentages for forwards in 5v5 situations.

 

ShPctVarianceBySampleSize

Variance decline pretty much stops by the time you reach 5 years/2500+ minutes worth of data but after 3 years (1500+ minutes) the drop off rate falls off significantly. It is also worth noting that some of the drop off over longer periods of time is due to age progression/regression and not due to reduction in randomness.
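The logic behind the chart can be written as a simple variance decomposition: the spread you observe across players is roughly the spread in true talent plus a noise term that shrinks as the number of shots grows. Here is a minimal sketch, assuming every shot is an independent trial. The 8% average, the 0.8 percentage point talent spread and the 350 on-ice shots per season are assumed values chosen for illustration, not numbers fitted to the chart.

```python
def observed_variance(talent_sd: float, mean_pct: float, shots: int) -> float:
    """Approximate variance of observed on-ice Sh% across players.

    Treats every shot as an independent trial, so the noise term is the
    binomial variance mean_pct * (1 - mean_pct) / shots, while the talent
    term does not depend on sample size.
    """
    noise = mean_pct * (1.0 - mean_pct) / shots
    return talent_sd ** 2 + noise


# Assumed values for illustration only: 8% average Sh%, a talent SD of
# 0.8 percentage points, and roughly 350 on-ice shots per season.
for seasons in range(1, 7):
    shots = 350 * seasons
    var = observed_variance(talent_sd=0.008, mean_pct=0.08, shots=shots)
    print(f"{seasons} season(s): observed SD = {var ** 0.5:.4f}")
```

Under these assumptions the observed spread falls quickly over the first few seasons and then flattens out near the talent spread, which is the pattern the chart is being read for.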

What is the significance of all of this? Well, at 5 years a 90th percentile player would have 45% more goals given an equal number of shots than a 10th percentile player. A player one standard deviation above average will have 33% more goals given an equal number of shots than a player one standard deviation below average.
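Comparisons like the two above come from taking a ratio of two points in the distribution of on-ice shooting percentages. Here is a sketch of how such ratios could be computed from a list of player values. The sample list is made up for illustration and is not the data behind the 45% and 33% figures.

```python
import statistics


def percentile_ratio(sh_pcts):
    """Ratio of the 90th to the 10th percentile on-ice Sh%.

    With equal shot totals, this is also the ratio of goals scored.
    """
    deciles = statistics.quantiles(sh_pcts, n=10)  # 9 cut points
    p10, p90 = deciles[0], deciles[-1]
    return p90 / p10


def one_sd_ratio(sh_pcts):
    """Ratio of (mean + 1 SD) to (mean - 1 SD)."""
    mean = statistics.mean(sh_pcts)
    sd = statistics.stdev(sh_pcts)
    return (mean + sd) / (mean - sd)


# Made-up sample of long-run on-ice shooting percentages.
sample = [6.4, 6.9, 7.2, 7.5, 7.8, 8.0, 8.1, 8.3, 8.6, 8.9, 9.3, 9.8]
print(f"90th vs 10th percentile: {percentile_ratio(sample) - 1:+.0%} goals")
print(f"+1 SD vs -1 SD: {one_sd_ratio(sample) - 1:+.0%} goals")
```

The same two functions work unchanged on CF/20 values, which makes it easy to put shot generation and shooting percentage on the same footing.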

Now, let’s compare this to the same chart for CF/20 to get an idea of how shot generation varies across players.

 

CF20VarianceBySampleSize

It’s a little interesting that the top players show no regression over time but the bottom line players do. This may be because terrible shot generating players don’t stick around long enough. More important, though, is the magnitude of the difference between the top players and the bottom players. A 90th percentile CF20 player produces about 25% more shot attempts than a 10th percentile player and a one standard deviation above average CF20 player produces about 18.5% more than a one standard deviation below average CF20 player (over 5 years). Both of these are well below (a little over half of) the 45% and 33% we saw for shooting percentage.

I hear a lot of ‘I told you so’ from the pro-corsi crowd in regards to the Leafs and their losing streak and yes, their percentages have regressed this season but I think it is worth noting that the Leafs are still an example of a team where CF% is not a good indicator of performance. The Leafs’ 5v5close CF% is 42.5% but their 5v5close GF% is 47.6%. The idea that CF% and GF% are “tightly intertwined” as Tyler Dellow wrote is not supported by the Maple Leafs this season despite the fact that the Maple Leafs are the “pro-Corsi” crowd’s latest favourite “I told you so” team.

There is also some evidence that the Leafs have been “unlucky” this year. Their 5v5close shooting percentages over the past 3 seasons have been 8.82 (2nd), 8.59 (4th) and 10.54 (1st) while this year it has dropped to 8.17 (8th). Now the question is how much of that is luck and how much is the loss of Grabovski and MacArthur and the addition of Clarkson (who is a generally poor on-ice Sh% player), but the Leafs’ Sh% is well below the past few seasons and some of that may be bad luck (and notably, not “regression” from years of “good luck”).

In summary, generating shots matters, but capitalizing on them matters as much or more.