Sep 18 2014
 

Earlier this week TSN announced the creation of an Analytics team consisting of long-time TSN contributor Scott Cullen along with new TSN additions James Mirtle of the Globe and Mail and hockey blogger Travis Yost. I am all for mainstream media jumping on board with hockey analytics, but once you go from independent hockey blogger to a significant contributor to TSN I think it opens the door to higher expectations and higher standards. Scott Cullen has a long track record with TSN and I am confident James Mirtle will bring some intelligent insight, as we are all familiar with and respect his work. While I am fully aware of Yost and his blogging history, I have to be honest in saying that I have not read a ton of his stuff, so I was interested to see what he would offer. After reading his first two articles, I have to say I definitely think there is room for improvement.

Yost’s first article was a look at some trends as to how teams use players during 5 on 5 play. The point I think Yost was trying to make most is that teams are phasing out goons and other “specialists” and replacing them with guys that can play bigger minutes and at both ends of the rink. While this may very well be true I am not sure Yost’s evidence to support this is really valid. He produced a chart that showed that more players are getting more 5v5 ice time per game in 2013-14 than in 2007-08 and his conclusion was that this was evidence of teams moving away from goons and small ice time players.

The rightward shift here should seem apparent – a higher concentration of guys playing larger minutes now as opposed to seven years ago and fewer guys picking up scrap minutes in smaller roles. The number of forwards playing ten or less minutes a night has dropped from 109 in 2007, to 65 in 2014. And the number of forwards playing between 13 and 16 minutes a night has moved from 153 in 2007 to 231 in 2014. As a group, teams may still be leaning on their star players, but there’s also been a more balanced spread of total ice time than there was seven years ago.

First off, the rightward shift that Yost talks about is likely almost exclusively due to the fact that there were far fewer penalties and power plays in 2013-14 than there were in 2007-08, as Yost pointed out earlier. This led to there being more even strength ice time to be doled out to the same number of players, which will almost certainly produce a right shift as observed. As for a more balanced spread in ice time, I don’t see that either, at least not to any significant extent. If one really wanted to look at this properly, instead of looking at the number of minutes of even strength ice time played one would want to look at the percentage of a team’s even strength minutes the player played. This would eliminate the difference in total even strength ice time and truly allow you to see whether teams are using a more balanced line up or not. At the very least one should adjust each player’s ES TOI by an appropriate amount for one of the seasons based on the ratio of league-wide ES TOI between the two seasons. I’d then be interested to see if a “right shift” still occurs or whether there is a meaningful difference in the charts.
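To make the two adjustments suggested above concrete, here is a minimal sketch. The function names and all of the numbers are hypothetical and only there for illustration; they are not real NHL data.

```python
def toi_share(player_es_toi, team_es_toi):
    """Fraction of the team's even strength minutes the player was on the ice for."""
    if team_es_toi <= 0:
        raise ValueError("team_es_toi must be positive")
    return player_es_toi / team_es_toi


def rescale_toi(player_es_toi, league_es_toi_from, league_es_toi_to):
    """Scale a player's ES TOI from one season onto another season's league-wide ES TOI level."""
    return player_es_toi * (league_es_toi_to / league_es_toi_from)


# Hypothetical per-game figures: 40 minutes of ES time available in 2007-08, 50 in 2013-14.
share_0708 = toi_share(player_es_toi=10.0, team_es_toi=40.0)
share_1314 = toi_share(player_es_toi=12.5, team_es_toi=50.0)

# The same 2007-08 player expressed on the 2013-14 scale.
adjusted_0708 = rescale_toi(10.0, league_es_toi_from=40.0, league_es_toi_to=50.0)
```

In this made-up case the player's raw minutes go up from 10.0 to 12.5, which would show up as a "right shift", but his share of the team's even strength time is unchanged, which is the distinction being argued for.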

Yost’s second article for TSN.ca was about Marc-Edouard Vlasic and how he should probably be getting more recognition for how good he really is. Now that is a sentiment I can generally support, but Yost’s supporting evidence for this is analytically unsound in my opinion. The first thing Yost does is identify a number of defensemen who are generally considered the league’s best that we should compare Vlasic to. This is a good start and Yost identified guys like Chara, Doughty, Karlsson, Pietrangelo, Subban, etc. What Yost did next is produce a bubble chart that plots even strength corsi% on the x-axis vs even strength goals % on the y-axis with bubble size representing scoring production. To be honest, I have no clue what the value of this chart is. Both corsi% and goal% are significantly team driven but there was no accounting for quality of team, goal% has a certain amount of luck and randomness associated with it which was not discussed, and I really have no idea what statistic was used for scoring production. The conclusion Yost drew from this chart was that Vlasic was right in the mix with some of the best defensemen in the league. The problem is I am certain I could find a number of other defensemen we generally consider mediocre that would be right there with Vlasic.

There are proper ways to do this kind of analysis and there is no way one can do this without taking into consideration quality of teammates. On my stats site I have teammate statistics (denoted by TM) and one can easily compare a player’s on-ice stats to those of his teammates when they are not playing with him. Doing this we get the following:

Player Name CF60 RelTM
ERIK KARLSSON 9.115
DUNCAN KEITH 8.597
ALEX PIETRANGELO 8.202
MARK GIORDANO 6.695
P.K. SUBBAN 6.152
MARC-EDOUARD VLASIC 5.87
SHEA WEBER 2.072
RYAN MCDONAGH 2.032
DREW DOUGHTY -0.448
ZDENO CHARA -0.55
RYAN SUTER -1.518

If we use CF60 as a proxy for offensive production we find the best offensive defensemen are Karlsson, Keith and Pietrangelo while the least offensive are Suter, Chara and Doughty. Vlasic is right in the middle and looks pretty good. One might be surprised at Doughty but the rest kind of make sense.
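For anyone who wants to see the idea behind a RelTM number in code, here is a rough sketch. It assumes the teammate baseline is a TOI-weighted average of each teammate's rate when apart from the player, weighted by the minutes they shared with him. That weighting, the function names and the numbers are assumptions for illustration and not necessarily how the figures in the table were produced.

```python
def teammate_baseline(teammates):
    """TOI-weighted average of teammates' rates when NOT on the ice with the player.

    teammates: list of (minutes_shared_with_player, teammate_rate_without_player)
    """
    total_minutes = sum(minutes for minutes, _ in teammates)
    if total_minutes <= 0:
        raise ValueError("need some shared ice time")
    return sum(minutes * rate for minutes, rate in teammates) / total_minutes


def rel_tm(player_rate, teammates):
    """Player's on-ice rate minus the baseline of his teammates without him."""
    return player_rate - teammate_baseline(teammates)


# Hypothetical defenseman with an on-ice CF60 of 60.0 and two regular partners.
partners = [(300.0, 52.0), (100.0, 56.0)]  # (minutes together, partner CF60 apart)
cf60_rel_tm = rel_tm(60.0, partners)
```

With these made-up inputs the baseline is 53.0 and the RelTM value is +7.0, meaning shot attempts for were generated faster with the player than his teammates managed without him.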

Now, let’s do the same for CA60.

Player Name CA60 RelTM
MARK GIORDANO -12.251
MARC-EDOUARD VLASIC -9.205
P.K. SUBBAN -4.586
ERIK KARLSSON -2.21
ZDENO CHARA -1.69
DREW DOUGHTY -1.585
ALEX PIETRANGELO -0.211
RYAN SUTER 0.953
DUNCAN KEITH 2.385
SHEA WEBER 4.34
RYAN MCDONAGH 4.468

For CA60 it is better to have a negative number as this indicates you are giving up fewer shot attempts than your teammates when they aren’t playing with you. Here Vlasic is second and looking pretty good.

Now we can combine these two stats by looking at CF% RelTM.

Player Name CF% RelTM
MARK GIORDANO 8.9%
MARC-EDOUARD VLASIC 6.6%
P.K. SUBBAN 4.8%
ERIK KARLSSON 4.6%
ALEX PIETRANGELO 3.7%
DUNCAN KEITH 2.3%
DREW DOUGHTY 0.7%
ZDENO CHARA 0.6%
SHEA WEBER -1.0%
RYAN SUTER -1.2%
RYAN MCDONAGH -1.2%

Out of this group, Vlasic is second best which is pretty good and is evidence that he probably deserves to be in the company of these guys. Now, with that said, this is just a cursory look and in no way a complete analysis. Not only are there limitations to just looking at corsi, but there are a lot of other factors that need to be taken into consideration as well (for example, Giordano is probably not that good and only looks good because his Flames teammates are not very good relative to the teammates of the other players on this list). Overall though, this is how I think one should start an analysis of Vlasic and whether he deserves more credit for the player he is. To be fair to Yost, he gets into this a little bit by looking at a time series of Vlasic’s Relative Corsi%, but in no way is this sufficient and he doesn’t compare it to any of the other defensemen he is comparing Vlasic to.

Overall I applaud TSN for wanting to jump on the analytics bandwagon and I am certain Yost has the potential to provide a better analytical view than his first few posts which, to be honest, left me a little underwhelmed if not disappointed.

On the flip side, I saw some good stuff written recently by @MimicoHero that I think is worthy of mention. A recent blog post of his looked at Ryan Johansen’s value to the Blue Jackets and he, in my opinion, did a pretty good job of accounting for usage (i.e. QoT, QoC, zone starts) and comparing Johansen to his peers. I like the tables he produced and how he looked at offense and defense separately. Now I’d probably want to weight QoT far more heavily in the usage metric he came up with but overall a very good methodology for comparing players on different teams playing in different circumstances.

 

Jun 18 2013
 

If you have been following the discussion between Eric T and me you will know that there has been a rigorous discussion/debate over where hockey analytics is at, where it is going, and the benefits of applying “regression to the mean” to shooting percentages when evaluating players. For those who haven’t and want to read the whole debate you can start here, then read this, followed by this and then this.

The original reason for my first post on the subject is that I rejected Eric T’s notion that we should “steer” people researching hockey analytics towards “modern hockey thought”, in essence because I don’t think we should ever be closed minded, especially when hockey analytics is pretty new and there is still a lot to learn. This then spread into a discussion of the benefits of regressing shooting percentages to the mean, which Eric T supported wholeheartedly, while I suggested that further research into isolating individual talent, even goal talent, through adjusting for QoT, QoC, usage, score effects, coaching styles, etc. can be equally beneficial and the focus need not be on regressing to the mean.

In Eric T’s last post on the subject he finally got around to actually implementing a regression methodology (though he didn’t post any player specifics so we can’t see where it is still failing miserably) in which he utilized time on ice to choose a mean to which a player’s shooting percentage should regress. This is certainly better than regressing to the league-wide mean which he initially proposed, but the benefits are still somewhat modest. The results for players who played 1000 minutes in the 3 years of 2007-10 and 1000 minutes in the 3 years from 2010-13 showed the predictive power of his regressed GF20 to predict future GF20 was 0.66, which was 0.05 higher than the 0.61 predictive power of raw GF20. So essentially his regression algorithm improved predictive power by 0.05 while there still remains 0.34 which is unexplained. The question I attempt to answer today is, for a player who has played 1000 minutes of ice time, what amount of his observed stats is true randomness and what amount is simply unaccounted for skill/situational variance.

When we look at 2007-10 GF20 and compare it to 2010-13 GF20 there are a lot of factors that can explain the differences: a change in quality of competition, a change in quality of team mates, a change in coaching style, natural career progression of the player, zone start usage, and possibly any number of other factors that we do not currently know about, as well as true randomness. To overcome all of these non-random factors that we do not yet know how to fully adjust for, and get a true measure of the random component of a player’s stats, we need two sets of data that have attributes (QoT, QoC, usage, etc.) as similar to each other as possible. The way I did this was to take each of the 6870 games that have been played over the past 6 seasons, split them into even and odd games, and calculate each player’s GF20 over each of those segments. This should, more or less, split a player’s 6 years evenly in half such that all those other factors are more or less equivalent across halves. The following table shows how good the even half is at predicting the odd half based on how many total minutes (across both halves) the player has played.

Total Minutes GF20 vs GF20
>500 0.79
>1000 0.85
>1500 0.88
>2000 0.89
>2500 0.88
>3000 0.88
>4000 0.89
>5000 0.89

For the group of players with more than 500 minutes of ice time (~250 minutes or more in each odd/even half) the upper bound on true randomness is 0.21 while the predictive power of GF20 is 0.79. With greater than 1000 minutes randomness drops to 0.15 and with greater than 1500 minutes and above the randomness is around 0.11-0.12. It’s interesting that raising the minimum above 1500 minutes (~750 in each even/odd half) doesn’t necessarily reduce the true randomness in GF20 any further, which seems a little counterintuitive.
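The even/odd split described above can be sketched in a few lines. This is an illustration under stated assumptions: the record layout and function names are hypothetical, the sample data is made up, and it uses a plain Pearson correlation as the comparison, since the post does not say exactly which statistic (r or r^2) the "predictive power" figures are.

```python
from collections import defaultdict
from math import sqrt


def split_half_gf20(records, min_total_minutes=0.0):
    """records: iterable of (game_number, player, on_ice_goals_for, toi_minutes).

    Returns {player: (gf20_in_even_games, gf20_in_odd_games)}.
    """
    goals = defaultdict(lambda: [0.0, 0.0])
    minutes = defaultdict(lambda: [0.0, 0.0])
    for game_number, player, goals_for, toi in records:
        half = game_number % 2  # 0 = even games, 1 = odd games
        goals[player][half] += goals_for
        minutes[player][half] += toi
    result = {}
    for player, (even_min, odd_min) in minutes.items():
        if even_min + odd_min < min_total_minutes or even_min == 0 or odd_min == 0:
            continue  # not enough ice time to compare the two halves
        even_goals, odd_goals = goals[player]
        result[player] = (even_goals / even_min * 20.0, odd_goals / odd_min * 20.0)
    return result


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Tiny made-up sample: three players, four games, 20 minutes each game.
sample = [
    (1, "A", 1, 20), (2, "A", 1, 20), (3, "A", 1, 20), (4, "A", 1, 20),
    (1, "B", 0, 20), (2, "B", 1, 20), (3, "B", 1, 20), (4, "B", 0, 20),
    (1, "C", 2, 20), (2, "C", 1, 20), (3, "C", 1, 20), (4, "C", 1, 20),
]
halves = split_half_gf20(sample)
even_vals = [even for even, _ in halves.values()]
odd_vals = [odd for _, odd in halves.values()]
r = pearson_r(even_vals, odd_vals)
```

The `min_total_minutes` argument plays the role of the minute cutoffs in the table: raising it drops low-minute players before the two halves are compared.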

Let’s take a look at the predictive power of fenwick shooting percentage in even games to predict fenwick shooting percentage in odd games.

Total Minutes FSh% vs FSh%
>500 0.54
>1000 0.64
>1500 0.71
>2000 0.73
>2500 0.72
>3000 0.73
>4000 0.72
>5000 0.72

Like GF20, the true randomness of fenwick shooting percentage seems to bottom out at 1500 minutes of ice time and there appears to be no benefit to increasing the minimum minutes played beyond that.

To summarize what we have learned we have the following which is for forwards with >1000 minutes in each of 2007-10 and 2010-13.

GF20 predictive power 3yr vs 3yr 0.61
True Randomness Estimate 0.11
Unaccounted for factors estimate 0.28
Eric T’s regression benefit 0.05

There is no denying that a regression algorithm can provide modest improvements, but this is only addressing 30% of what GF20 is failing to predict and it is highly doubtful that efforts to improve the regression algorithm further will result in anything more than marginal benefits. The real benefit will come from researching the other 70% we don’t know about. It is a much more difficult question to answer but the benefit could be far more significant than any regression technique.
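For anyone wanting to check where the 30% and 70% come from, the arithmetic is below. It simply follows the additive bookkeeping used in this post (treating predictive power, randomness and unaccounted for factors as pieces that sum to 1), which is itself a simplification.

```python
predictive_power = 0.61   # GF20 3yr vs 3yr, from the summary above
true_randomness = 0.11    # split-half estimate from the even/odd tables

unexplained = 1.0 - predictive_power           # what raw GF20 fails to predict
unaccounted = unexplained - true_randomness    # left over after removing randomness

random_share = true_randomness / unexplained   # the part regression can address
other_share = unaccounted / unexplained        # QoT, QoC, usage, coaching, etc.

print(f"unexplained: {unexplained:.2f}, unaccounted: {unaccounted:.2f}")
print(f"randomness share: {random_share:.0%}, everything else: {other_share:.0%}")
```

The shares work out to about 28% and 72%, which is where the rounded 30% and 70% figures come from.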

Addendum: After doing the above I thought, why not take this all the way and instead of doing even and odd games do even and odd seconds, so what happens in one second goes in one bin and what happens in the following second goes in the other bin. This should absolutely eliminate any differences in QoC, QoT, zone starts, score effects, etc. As you might expect, not a lot has changed but the predictive power of GF20 increases marginally, particularly when dealing with lower minute cutoffs.

Total Minutes   GF20 vs GF20   FSh% vs FSh%
>500            0.81           0.58
>1000           0.86           0.68
>1500           0.88           0.71
>2000           0.89           0.73
>2500           0.89           0.73
>3000           0.90           0.75
>4000           0.90           0.73
>5000           0.89           0.71

 

Mar 15 2013
 

A few people didn’t like that I suggested that Jay McClement was a bad player in yesterday’s Mikhail Grabovski post so I thought I would provide a visual representation of McClement’s  mediocrity in the form of 5v5 Zone Start adjusted CF% WOWY charts for each of the past 6 seasons (this season included).

Let’s start with this current season even though the sample size is relatively small and so the number of line mates with a reasonable number of minutes with McClement is relatively small.

McClementCFPctWOWY201213

In this chart, it is better for McClement to have the bubbles below and to the right of the diagonal line, indicating his teammates’ corsi for % improved when they were on the ice with McClement. As you can see, none did.

So, what about previous seasons?

Continue reading »

Feb 21 2013
 

Over the past few years I have had a few discussions with other Leaf fans about the relative merits of Francois Beauchemin. Many Leaf fans argue that he was a good 2-way defenseman who can play tough minutes and is the kind of defenseman the Leafs are still in need of. I on the other hand have never had quite as optimistic a view of Beauchemin and I don’t think he would make this team any better.

On some level I think a part of the difference in opinion is that many look at his corsi numbers, which aren’t too bad, but I prefer to look at his goal numbers, which have generally not been so good. So, let’s take a look at Beauchemin’s WOWY numbers and see if there is in fact a divergence between his corsi WOWY numbers and his goal WOWY numbers, using 2009-11 5v5 data and starting with CF% WOWY.

Beauchemin200911CFWOWY

I have included a diagonal line which is kind of a ‘neutral’ line where players perform equally well with and without Beauchemin. Anything to the right/below the line indicates the player played better with Beauchemin than without and anything to the left/above they played worse with Beauchemin. As you can see, the majority of players had a better CF% with Beauchemin than without. Now, let’s take a look at GF% WOWY.
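For anyone unfamiliar with how a point on a WOWY chart is built, here is a minimal sketch. It assumes, as the description above implies, that the x-axis is the teammate's CF% with the player and the y-axis is the teammate's CF% without him. The function names and shot attempt counts are hypothetical.

```python
def cf_pct(corsi_for, corsi_against):
    """Corsi for percentage: share of all shot attempts taken by the player's team."""
    return 100.0 * corsi_for / (corsi_for + corsi_against)


def wowy_point(with_cf, with_ca, without_cf, without_ca):
    """(x, y) = (teammate CF% with the player, teammate CF% without the player)."""
    return cf_pct(with_cf, with_ca), cf_pct(without_cf, without_ca)


def side_of_diagonal(x, y):
    """Which side of the 'neutral' x = y line a teammate's bubble falls on."""
    if x > y:
        return "right/below: better with the player"
    if x < y:
        return "left/above: worse with the player"
    return "on the line: no difference"


# Hypothetical teammate: 55 for / 45 against together, 48 for / 52 against apart.
point = wowy_point(55, 45, 48, 52)
verdict = side_of_diagonal(*point)
```

For "against" rates such as CA20 and GA20 the reading flips, since a smaller number is better there.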

Beauchemin200911GFWOWY

While a handful of players had better GF% with Beauchemin, the majority were a little worse off. There is a clear difference between Beauchemin’s CF% WOWY and his GF% WOWY. What is interesting is this difference can be observed in 2007-08, 2009-10, 2010-11, and 2011-12 (he was injured for much of 2008-09 so his WOWY data is not reliable due to smaller sample size). Looking at his 5-year WOWY charts you get a clear picture that Beauchemin seemingly has a skill for ‘driving play’ but not ‘driving goals’. Let’s dig a little further to see if we can determine what his ‘problem’ is by looking at his 2009-11 two year CF20, GF20, CA20 and GA20 WOWY’s.

CF20:

Beauchemin200911CF20WOWY

GF20:

Beauchemin200911GF20WOWY

As you can clearly see, Beauchemin appears to be much better at generating shots and shot attempts than he is at generating goals. The majority of players have a higher corsi for rate when with Beauchemin than when not with Beauchemin but the majority also have a lower goals for rate. What about ‘against’ rates?

CA20:

Beauchemin200911CA20WOWY

GA20:

Beauchemin200911GA20WOWY

For CA20 and GA20 it is better to be above/left of the diagonal line because unlike GF%/CF%/GF20/CF20 it is better to have a smaller number than a larger number. There doesn’t seem to be quite as much of a difference between CA20 and GA20 as with CF20 and GF20, so the difference between CF% and GF% is driven by the inability to convert shots and shot attempts into goals as opposed to the defensive side of the game. That said, there is no clear evidence that Beauchemin makes his teammates any better defensively.

There are two points I wanted to make with this post.

  1. Leaf fans probably shouldn’t be missing Beauchemin.
  2. For a lot of players a corsi evaluation of that player will give you a reasonable evaluation of that player but there are also many players where a corsi evaluation of that player will not tell the complete story. Some players can consistently see a divergence between their goal stats and their corsi stats and it is important to take that into consideration.

 

Oct 29 2012
 

The other week I wrote about breaking down IPP (Individual Point Percentage, which is individual points divided by number of goals scored while the player was on the ice) into IGP (Individual Goal Percentage) and IFAP (Individual First Assist Percentage). It seems IGP does a decent job of identifying the pure goal scorers and IFAP does a decent job of identifying the pure play makers. I have always been interested in team/line makeup and how to maximize a line’s performance so I decided to take a look at WOWY IPP comparisons for two pairs of extremely talented players who have at times played together and at times played on separate lines the past 5 years. These are Crosby/Malkin and Thornton/Marleau. Let’s start with Crosby/Malkin.

Situation               TOI       IGP     IFAP    IPP     G/60   FA/60   GF20
Crosby without Malkin   2527:07   35.7%   36.3%   84.7%   1.33   1.35    1.24
Crosby with Malkin      954:29    41.9%   30.2%   91.9%   2.26   1.63    1.80
Malkin without Crosby   3588:42   32.2%   38.3%   86.7%   0.97   1.15    1.00
Malkin with Crosby      954:29    27.9%   30.2%   75.6%   1.51   1.63    1.80

These two players have played significantly more ice time apart than with each other but still the comparison is interesting. When separated, Crosby’s IGP and IFAP are very close together, indicating he is relatively balanced between being a goal scorer and a playmaker, but when he is playing with Malkin he becomes a more important goal scorer as his IGP rises from 35.7% without Malkin to 41.9% with Malkin and his IFAP falls from 36.3% without Malkin to 30.2% with Malkin. Crosby got a point on 84.7% of all goals scored while he was on the ice without Malkin which is a very high number, but it rises to 91.9% when he is playing with Malkin which is a truly extraordinary number.

Malkin, strangely, sees both his IGP and his IFAP fall when playing with Crosby which means a smaller percentage of the goal production goes through Malkin when Crosby is on the ice. This makes sense since Crosby is in on nearly every goal scored when the two are on the ice together.  Interestingly, despite being in on a lower percentage of goals, Malkin did see his individual G/60 and individual FA/60 rise dramatically when playing with Crosby due to the fact that when those two are on the ice together they score goals at an exceptionally high rate.
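The percentage and rate columns in the table are tied together, and a short sketch makes that link explicit. The definitions follow the ones given at the top of this post (with points taken as goals plus all assists). The function names are my own for illustration and the goal and assist counts in the first example are made up. The second check uses the Crosby with Malkin row from the table.

```python
def individual_percentages(goals, first_assists, second_assists, on_ice_goals_for):
    """Return (IGP, IFAP, IPP) as fractions of the goals scored while on the ice."""
    if on_ice_goals_for <= 0:
        raise ValueError("need at least one on-ice goal for")
    igp = goals / on_ice_goals_for
    ifap = first_assists / on_ice_goals_for
    ipp = (goals + first_assists + second_assists) / on_ice_goals_for
    return igp, ifap, ipp


def share_from_rates(individual_per_60, on_ice_gf20):
    """IGP (or IFAP) recovered from rate stats: individual rate / on-ice goal rate.

    GF20 is per 20 minutes, so it is multiplied by 3 to put it on a per 60 basis.
    """
    return individual_per_60 / (on_ice_gf20 * 3.0)


# Made-up counts: 4 goals, 3 first assists, 1 second assist on 10 on-ice goals.
igp, ifap, ipp = individual_percentages(4, 3, 1, 10)

# Crosby with Malkin, from the table: G/60 = 2.26, FA/60 = 1.63, GF20 = 1.80.
crosby_igp = 100.0 * share_from_rates(2.26, 1.80)
crosby_ifap = 100.0 * share_from_rates(1.63, 1.80)
```

The two Crosby values land within rounding of the 41.9% and 30.2% in the table. This is also why Malkin's G/60 can rise with Crosby while his IGP falls: the rate depends on how fast the line scores, the percentage only on who is in on each goal.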

I am not sure what to conclude here other than if you desperately need to score a goal late in the game it would be awfully smart to play these two together. But, with that said, it may not be the most prudent use of resources during the course of the game because it seems to somewhat diminish Malkin’s ability to drive the play. Now, let’s take a look at Thornton/Marleau.

Situation                  TOI       IGP     IFAP    IPP     G/60   FA/60   GF20
Thornton without Marleau   2585:10   24.6%   35.2%   79.6%   0.75   1.07    1.01
Thornton with Marleau      2438:22   19.3%   37.8%   74.8%   0.64   1.25    1.11
Marleau without Thornton   2808:03   32.3%   24.2%   69.7%   0.74   0.56    0.77
Marleau with Thornton      2438:22   37.8%   13.3%   73.3%   1.25   0.44    1.11

This shows that Thornton and Marleau are very different players. Marleau is clearly much more of a goal scorer while Thornton is clearly much more of a play maker, and this is true regardless of whether they are playing together or apart. When playing with Marleau, Thornton sees his goal production drop from 0.75 G/60 to 0.64 G/60 but his FA/60 rises from 1.07 to 1.25. For Marleau, his G/60 rises significantly when playing with Thornton but his FA/60 falls a bit and his IFAP falls to an astonishingly low 13.3%. In short, Marleau’s goal production benefits a lot from playing with Thornton, while Marleau’s benefit to Thornton is a little less significant. I believe if we continued this analysis to Thornton’s other line mates we would find that Thornton’s play making skills are easily the most significant driving force of the Sharks offense.

Having done this IPP WOWY comparison for these two pairs of players we can make some interesting observations and we can get a better idea of which player is driving the play when they are playing together (and apart).  That said, I think more work needs to be done to determine whether IPP WOWY is a useful player evaluation tool in general, or just something that might be interesting to look at in certain situations.  I’m curious what others think, or if you have another pair of players you want me to look at let me know (for example, Spezza/Alfredsson might be interesting).

 

Jul 11 2012
 

I have been wondering about the benefits of using 5v5 close data instead of 5v5 when we do player analysis and player comparisons. The rationale for comparing players in 5v5 close situations is that we are comparing players under similar situations. When teams have a comfortable lead they go into a defensive shell resulting in fewer shots for but with a higher shooting percentage, and more shots against but with a lower shooting percentage. The opposite of course is true when a team is trailing. But what I have been thinking about recently is whether there is a quality of competition impact during close situations. My hypothesis is that teams that are really good will play more time with the score close against other good teams and less time with the score close against significantly weaker teams. Conversely, weak teams will play more minutes with the score close against other weak teams than against good teams.

If that is right, players on good teams will have a tougher QoC during 5v5 close situations than during overall 5v5 situations and players on weak teams will have weaker QoC during 5v5 close situations than during overall 5v5 situations. Let’s put that hypothesis to the test.

The first thing I did was to select one key player from each of the 30 teams to represent that team in the study. Mostly forwards were chosen but a few defensemen were chosen as well. From there I looked at the average of their opponents’ goals for percentage (goals for / [goals for + goals against]) over the past 3 seasons in zone start adjusted 5v5 situations as well as zone start adjusted 5v5 close situations and then compared the difference to the record of the player’s team over the past three seasons. The table below is what results.

Player         Team           GF% 5v5   GF% Close   Close – 5v5   3yr Pts   Avg. Pts
Doan           Phoenix        50.3%     50.6%       0.3%          303       101.0
Chara          Boston         50.7%     50.9%       0.2%          296       98.7
Toews          Chicago        50.4%     50.6%       0.2%          310       103.3
Datsyuk        Detroit        50.8%     51.0%       0.2%          308       102.7
Weber          Nashville      50.5%     50.7%       0.2%          303       101.0
Backes         St. Louis      50.8%     51.0%       0.2%          286       95.3
E. Staal       Carolina       50.4%     50.5%       0.1%          253       84.3
Ribeiro        Dallas         50.5%     50.6%       0.1%          272       90.7
Gaborik        NY Rangers     50.1%     50.2%       0.1%          289       96.3
Malkin         Pittsburgh     50.1%     50.2%       0.1%          315       105.0
Ovechkin       Washington     49.9%     50.0%       0.1%          320       106.7
Enstrom        Winnipeg       50.1%     50.2%       0.1%          247       82.3
Weiss          Florida        50.3%     50.3%       0.0%          243       81.0
Plekanec       Montreal       50.4%     50.4%       0.0%          262       87.3
Tavares        NY Islanders   50.3%     50.3%       0.0%          231       77.0
Hartnell       Philadelphia   50.1%     50.1%       0.0%          297       99.0
J. Thornton    San Jose       50.9%     50.9%       0.0%          314       104.7
Kessel         Toronto        50.1%     50.1%       0.0%          239       79.7
H. Sedin       Vancouver      50.0%     50.0%       0.0%          331       110.3
Nash           Columbus       50.9%     50.8%       -0.1%         225       75.0
J. Eberle      Edmonton       50.6%     50.5%       -0.1%         198       66.0
Kopitar        Los Angeles    50.6%     50.5%       -0.1%         294       98.0
M. Koivu       Minnesota      50.7%     50.6%       -0.1%         251       83.7
Parise         New Jersey     50.8%     50.7%       -0.1%         286       95.3
Getzlaf        Anaheim        51.0%     50.8%       -0.2%         268       89.3
Roy            Buffalo        50.3%     50.1%       -0.2%         285       95.0
Stastny        Colorado       50.3%     50.1%       -0.2%         251       83.7
Spezza         Ottawa         50.6%     50.4%       -0.2%         260       86.7
Stamkos        Tampa          50.2%     50.0%       -0.2%         267       89.0
Iginla         Calgary        50.5%     50.2%       -0.3%         274       91.3
Group avg.     -              50.4%     50.5%       >0            -         97.3
Group avg.     -              50.3%     50.3%       =0            -         91.3
Group avg.     -              50.6%     50.4%       <0            -         86.6

The list above is sorted by the difference between the opposition’s 5v5 close GF% and the opposition’s 5v5 GF%. The bottom three rows of the last column are what tell the story. These show the average point totals of the teams for players whose opposition 5v5 close GF% was greater than, equal to and less than the opponents’ 5v5 GF%. As you can see, the greater than group had a team average of 97.3 points, the equal to group had a team average of 91.3 points and the less than group had a team average of 86.6 points. This means that good teams have on average tougher 5v5 close opponents than straight 5v5 opponents and weak teams have tougher 5v5 opponents than 5v5 close opponents, which is exactly what we predicted. It is also not unexpected. Weak teams tend to play close games against similarly weak teams while strong teams play close games against similarly strong teams.
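The three group averages can be reproduced directly from the table. The sketch below is just one way to do the grouping (the function and variable names are mine); the data is the Close – 5v5 difference and 3 year point total for each player as listed above.

```python
# (player, Close - 5v5 opposition GF% difference, team points over 3 seasons)
ROWS = [
    ("Doan", 0.3, 303), ("Chara", 0.2, 296), ("Toews", 0.2, 310),
    ("Datsyuk", 0.2, 308), ("Weber", 0.2, 303), ("Backes", 0.2, 286),
    ("E. Staal", 0.1, 253), ("Ribeiro", 0.1, 272), ("Gaborik", 0.1, 289),
    ("Malkin", 0.1, 315), ("Ovechkin", 0.1, 320), ("Enstrom", 0.1, 247),
    ("Weiss", 0.0, 243), ("Plekanec", 0.0, 262), ("Tavares", 0.0, 231),
    ("Hartnell", 0.0, 297), ("J. Thornton", 0.0, 314), ("Kessel", 0.0, 239),
    ("H. Sedin", 0.0, 331), ("Nash", -0.1, 225), ("J. Eberle", -0.1, 198),
    ("Kopitar", -0.1, 294), ("M. Koivu", -0.1, 251), ("Parise", -0.1, 286),
    ("Getzlaf", -0.2, 268), ("Roy", -0.2, 285), ("Stastny", -0.2, 251),
    ("Spezza", -0.2, 260), ("Stamkos", -0.2, 267), ("Iginla", -0.3, 274),
]


def average_points_by_sign(rows):
    """Average per-season team points, grouped by the sign of the Close - 5v5 difference."""
    groups = {">0": [], "=0": [], "<0": []}
    for _player, diff, points_3yr in rows:
        key = ">0" if diff > 0 else "<0" if diff < 0 else "=0"
        groups[key].append(points_3yr / 3.0)
    return {key: sum(vals) / len(vals) for key, vals in groups.items() if vals}


averages = average_points_by_sign(ROWS)
for key in (">0", "=0", "<0"):
    print(key, round(averages[key], 1))
```

Keep in mind the groups are small (12, 7 and 11 teams), so a team or two moving between groups could shift these averages noticeably.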

Another important observation is how little deviation from 50% there is in each player’s opposition GF% metrics. The range for the above players is from 49.9% to 51.0%. That is an incredibly tight range and reconfirms to me the small importance QoC has on a player’s performance, especially when considering longer periods of time.

I also conducted the same study using fenwick for percentage as the QoC metric instead of goals for percentage but the results were less conclusive. The >0 group had an average of 93.2 team points in the standings, the =0 group had 93.4 team points in the standings and the <0 group had 83.25 team points in the standings. Furthermore there was even less variance in opposition FF% than GF% and only 12 teams had any difference between opposition 5v5 and opposition 5v5 close FF%. For me, this is further evidence that fenwick/corsi are not optimal measures of player value.

Finally, I looked at the difference in player performance between 5v5 and 5v5 close situations and found no trends among the different performance levels. For GF% almost every player had their 5v5 close GF% within 4% of their 5v5 GF% (r^2 between the two was 0.7346) and for FF% every player but Parise had their 5v5 close FF% within 1.7% of their 5v5 FF% (r^2 = 0.945). Furthermore, there was no consistency as to which players saw an improvement (or decrease) in their 5v5 close GF% or FF% so it seems it might be luck driven (particularly for GF%) or maybe coaching factors.

So what does this all mean? It means that in 5v5 close situations good teams have a bias towards tougher QoC than weak teams do. Does it have a significant effect on player performance? No, because the QoC metrics vary very little across players or from situation to situation (from my perspective QoC can be ignored the majority of the time). Does it mean that we should be using 5v5 close in our player analysis? I am still not sure. I think the benefits of doing so are still probably quite small, if there are any at all, as 5v5 close performance metrics mirror 5v5 performance metrics quite well and in the case of goal metrics the larger sample size of 5v5 data almost certainly outweighs any benefits of using 5v5 close data.

 

May 16 2012
 

I have written a few controversial pieces here at HockeyAnalysis.com (for example, my post on Luke Schenn back when Leaf fans thought Schenn was the best thing since sliced bread) and I suspect this might generate some controversy as well because of the conclusions made about Zetterberg’s weak defensive ability.  I also want to do this post to show how I believe we should be doing player evaluation because I believe that most people evaluate players in a poor way.

The first thing I believe is that we must evaluate players based on goals and not corsi/fenwick/shots because there is ample evidence that players can influence shooting percentage (for example here and here) and there is some evidence that players can influence save percentage (for example here). Because of this, conducting a corsi based analysis will not give you a complete view of a player’s ability, and I think you will see some of that with Zetterberg. Furthermore, I believe to get a more full and complete understanding of a player’s abilities we need to evaluate the player’s defensive ability and offensive ability separately, which is what I will do.

As we are dealing with goal data which can fluctuate from season to season it is best to conduct a multi-season analysis to observe the greater trends, not what could be somewhat luck driven single season results.  Let’s start by looking at Zetterberg’s goals for per 20 minutes (GF20) and goals against per 20 minutes (GA20) on-ice stats and see how they rank league wide and on the Red Wings.

Year            GF20    Team Rank   League Rank
2011-12         1.157   3 of 11     21 of 312
2010-11         0.931   6 of 12     99 of 321
2009-10         1.073   1 of 10     34 of 319
2008-09         0.860   7 of 12     135 of 318
2007-08         1.193   2 of 12     16 of 312
2009-12 (3yr)   1.054   -           27 of 295
2007-12 (5yr)   1.045   -           23 of 274

The data above is for 5v5 zone start adjusted ice time and the ranks are among forwards with 500 minutes of ice time for single seasons, 1500 minutes for the 3 year average and 2500 minutes for the 5 year average. Judging by the size of the league-wide rankings, we are taking roughly the top 9 or 10 forwards on each team so we are getting the majority of the players who get regular shifts in the NHL. As you can see, offensively Zetterberg performs quite well and while there is some year to year fluctuation, the overall trend is that he has awfully good on-ice offensive numbers. Now let’s take a look defensively.

Year            GA20    Team Rank   League Rank
2011-12         0.840   10 of 11    193 of 312
2010-11         0.969   11 of 12    278 of 321
2009-10         0.970   9 of 10     275 of 319
2008-09         0.795   7 of 12     158 of 318
2007-08         0.638   10 of 12    65 of 312
2009-12 (3yr)   0.924   -           261 of 295
2007-12 (5yr)   0.845   -           184 of 274

Defensively, he is not only generally among the worst on his team, he is generally speaking among the bottom half of the league, or worse.  Over the past 3 seasons he ranks 261st of 295 players in terms of GA20 which puts him in the bottom 12 percent which to many I think is surprisingly bad.
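As a quick reference for the two calculations used in this section, here is a small sketch: a per 20 minute rate (the form GF20 and GA20 take) and the conversion of a league rank into a "bottom X percent" figure. The function names and the goals and minutes in the first example are hypothetical. The rank example uses the 261 of 295 from the table.

```python
def per_20(goals, toi_minutes):
    """On-ice goals (for or against) per 20 minutes of ice time."""
    if toi_minutes <= 0:
        raise ValueError("toi_minutes must be positive")
    return goals / toi_minutes * 20.0


def bottom_share(rank, total):
    """Share of ranked players at or below this rank, where rank 1 is the best."""
    return (total - rank + 1) / total


# Hypothetical: 50 on-ice goals against in 1000 minutes of 5v5 ice time.
example_ga20 = per_20(50, 1000.0)

# Zetterberg's 3 year GA20 rank from the table above: 261 of 295.
zetterberg_bottom = bottom_share(261, 295)
print(f"{zetterberg_bottom:.1%}")
```

That works out to just under 12 percent, matching the bottom 12 percent figure quoted above.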

Continue reading »

Apr 12 2012
 

With the Maple Leafs season having ended early once again it is time to take an honest and unbiased look at what the team has and what the team needs to improve. This will be a multi-post endeavour that will start with this post, which will be a statistical evaluation of the Leafs forwards. Included in each player’s evaluation is a table of their 5v5 zone start adjusted HARO+, HARD+ and HART+ ratings over the past 5 seasons (where available) as well as 3 and 5 year ratings. These ratings provide an unbiased zone start, quality of teammate and quality of competition adjusted view of the player.

Joffrey Lupul

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 1.538 1.124 1.401 1.132 1.330 1.341 1.116
HARD+ 0.633 0.800 0.895 0.853 0.822 0.730 0.790
HART+ 1.085 0.962 1.148 0.993 1.076 1.036 0.953

I have heard a number of people suggest that we should trade Joffrey Lupul because his value is as high as it has ever been.  Well, that may be the case but if you were the Ottawa Senators would you trade Erik Karlsson because his value is as high as it ever has been?  No.  Joffrey Lupul’s value may be as high as it ever has been, or ever will be, but he is a really really good player and has been a really really good player for a number of years.  I should qualify that a bit and say offensive player because defensively he hasn’t ever been great, but neither are a lot of the top offensive players in the league.  I think it can be argued that Lupul is the Leafs’ best offensive forward and one who makes the players around him better (See my Lupul’s always been this good article).  Kessel’s numbers drop off significantly when Lupul hasn’t been on the ice with him.  Over the past 2 seasons Kessel’s GF20 has been 1.281 when on the ice with Lupul and 0.641 when not on the ice with Lupul.  Yeah, we shouldn’t be talking about trading Lupul, we should be talking about signing Lupul to a long term contract extension.
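To put the with/without split in perspective, here is a small Python sketch. GF20 is an on-ice goals-for rate per 20 minutes of ice time; the `per_20` helper and its sample inputs are hypothetical and only show how such a rate is formed, while the two GF20 values are the ones quoted above.

```python
def per_20(events: int, minutes: float) -> float:
    """Rate of on-ice events per 20 minutes of ice time."""
    return events / minutes * 20.0


# Hypothetical illustration only: 10 goals for in 200 minutes is a GF20 of 1.0.
example_rate = per_20(10, 200)

# Kessel's 5v5 GF20 over the past 2 seasons, as quoted in the text.
kessel_with_lupul = 1.281
kessel_without_lupul = 0.641

ratio = kessel_with_lupul / kessel_without_lupul
print(f"Kessel's on-ice goal rate with Lupul is {ratio:.2f}x his rate without")
```

In other words, Kessel's on-ice goal rate roughly doubles when Lupul is on the ice with him.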

Phil Kessel

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 1.301 0.994 1.201 1.340 1.087 1.145 1.019
HARD+ 0.717 0.780 0.908 1.264 0.775 0.799 0.866
HART+ 1.009 0.887 1.054 1.302 0.931 0.972 0.942

Phil Kessel gets a lot of accolades for his individual goal scoring numbers and deservedly so, very few players put together 30 goal seasons for four straight years.  But his overall contribution to the team, while still quite good, doesn’t match that of Joffrey Lupul.  Lupul’s overall offensive contribution to the team is better and he does a better job of making the players around him better.  Furthermore, it seems Lupul’s defensive numbers are better too.  Now I don’t want to suggest that Kessel is a bad player, he is not, but he isn’t the #1 reason why the first line did so well this season.  Lupul is.

Tyler Bozak

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 1.216 0.793 1.254 1.022
HARD+ 0.722 0.746 0.882 0.772
HART+ 0.969 0.770 1.068 0.897

There are a lot of differing opinions on Tyler Bozak.  Whenever I suggest the Leafs should trade him while his value is high, and that he is not and never will be a first line center, I get a few people suggesting that he is still young and improving and his point totals are on an upward trend (27 points to 32 to 47 this past season).  While all of that is true and he may very well be a good offensive player, he is dreadful defensively and that is the problem with Bozak.  With neither Kessel nor Lupul being quality defensive players the Leafs need a center who can bring a defensive presence to that line.  Kessel and Lupul can create a lot of offense on their own so offensive ability is almost secondary.  The best thing for that line would be to find a solid offensive forward with a strong defensive awareness and hopefully with a bit of size.  Bozak is not that guy.  He is also not as good as Mikhail Grabovski, who has the second line center job locked up long term, and without the defensive ability he can’t fit in on the third line either as it seems certain Carlyle will want that to be a checking line.  Bozak is a decent player, but there isn’t an opening on the Leafs roster for a player with his abilities and as such he should be used as trade bait to find a player who can fill the holes in the Leafs lineup better than he can.

Mikhail Grabovski

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 1.287 1.390 1.278 0.987 1.265 1.309 1.074
HARD+ 0.938 1.015 1.048 0.873 0.605 1.014 0.951
HART+ 1.113 1.202 1.163 0.930 0.935 1.161 1.012

One can easily argue that Grabovski is the Leafs best all-round forward.  He has had three straight very good seasons both offensively and defensively (though there was a bit of a drop off on the defensive side this year, that is probably – hopefully – an anomaly).  He is the perfect second line center.

Clarke MacArthur

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 1.086 1.251 1.014 0.896 1.194 1.110 0.949
HARD+ 1.041 0.971 0.810 0.874 0.907 0.946 0.943
HART+ 1.064 1.111 0.912 0.885 1.051 1.028 0.946

I have always had mixed opinions on Clarke MacArthur and I flip back and forth on whether we should keep him or whether we should use him as trade bait.  At this moment in time I am in the keep him camp as it seems he has enough offensive ability to easily be a second line winger and his defensive numbers have improved nicely over the past couple seasons (and you can’t say that about many Leaf players).

Nikolai Kulemin

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 1.180 1.343 0.987 1.168 1.161 1.021
HARD+ 1.024 0.999 1.171 0.735 1.085 0.989
HART+ 1.102 1.171 1.079 0.951 1.123 1.005

Kulemin’s individual offensive numbers dropped off the cliff this season when compared to last season, but in his four seasons in the NHL he has had 31, 36, 57 and 28 points so last season is probably more the anomaly than this season has been.  I like Kulemin and he plays a good 2-way game, but I just wonder if he is better suited to a 3rd line role.  It’s not so much that I don’t think he can be a good second line player, but rather that I think you could build a really nice checking line around him that can be depended on to shut down the opposing team’s top lines, but that can also score some goals.  If you can build a quality checking line that as a line is capable of scoring 40-50 goals you can gain a huge advantage over a lot of teams and if you can add a true 50+ point winger to Grabovski and MacArthur you improve the second line offensively as well.  Kulemin is an RFA and will probably want around $3M/year which is probably reasonable.  After a bit of an off year statistically he has lost some bargaining power so if Burke played hard he might be able to get him for $2.5M per year but for the sake of $500K/year make him happy with a 3 year $9M contract.

Tim Connolly

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 0.914 1.065 1.264 1.543 1.287 1.063 1.026
HARD+ 0.761 0.870 1.097 1.006 0.942 0.900 0.934
HART+ 0.838 0.968 1.180 1.274 1.115 0.981 0.980

The three seasons from 2007-08 to 2009-10 were very good seasons for Connolly, both offensively and defensively.  When the Leafs signed Connolly last summer I had hoped that he could return to that form after a slip in 2010-11 but unfortunately he regressed even further.  In some respects it may not be all Connolly’s fault as he has bounced around a lot, from center to wing, from third line to first line, and even occasionally on the second line.  I am not sure how fair it is to evaluate a player under those circumstances but we kind of have to.  I just wish we could have seen Connolly play a long stretch of games between Kessel and Lupul to see if he could be a nice 60 point center with some defensive awareness.  Unfortunately we didn’t get that chance so I think if Burke can move his contract he needs to do that and let another team give him top six duty.

David Steckel

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 0.683 0.583 0.979 0.711 0.766 0.730 0.625
HARD+ 0.860 1.097 1.347 1.104 1.063 1.077 1.086
HART+ 0.772 0.840 1.163 0.908 0.915 0.903 0.856

Despite the drop off in his defensive numbers, I kind of like the job that Steckel did this year on the third and fourth lines.  He was great on face offs and played a quality checking line center role.  His defensive numbers took a hit when he played briefly with Kessel and Lupul (1.658 GA20 in 48 minutes with Kessel vs 0.843 in 617 minutes apart from Kessel).  Once Randy Carlyle took over as coach Steckel saw his minutes increase significantly as he was bumped up into full time 3rd line center duty matching up against some of the opposition’s top players.  He doesn’t have the offensive ability if you are looking to build a 3rd line that can also score, but as a defensive checking center I am happy with him in that role.
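To see how much those 48 minutes with Kessel move Steckel's overall number, here is a rough Python sketch that combines the two splits quoted above as an ice-time-weighted average. The helper name is mine, and it assumes the two splits cover all of the minutes in question.

```python
def combined_rate(splits):
    """Ice-time-weighted average of per-20-minute rates.

    splits: iterable of (rate_per_20, minutes) pairs.
    """
    total_minutes = sum(minutes for _, minutes in splits)
    return sum(rate * minutes for rate, minutes in splits) / total_minutes


# Steckel's GA20 with Kessel and apart from Kessel, as quoted in the text.
steckel_splits = [(1.658, 48), (0.843, 617)]
overall = combined_rate(steckel_splits)
print(f"Combined GA20: {overall:.3f}")  # 0.902
```

Under that assumption, just 48 minutes alongside Kessel push his GA20 from 0.843 to about 0.90, roughly a 7 percent increase.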

Matt Lombardi

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 0.900 1.293 1.410 0.700 1.097 0.920
HARD+ 0.689 0.898 0.896 0.979 0.804 0.877
HART+ 0.794 1.096 1.153 0.840 0.950 0.898

Lombardi, like Connolly, got bounced around a fair bit but he really didn’t show much in any role he was given.  His best years were when he was given a job as an offensive center on the top 2 lines but like Bozak and Connolly he won’t find that role with the Leafs.  Unfortunately like Connolly his contract may make him difficult to move but if you can you gotta let him go.

Colby Armstrong

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 0.449 1.274 1.118 1.214 1.160 1.078 0.980
HARD+ 0.784 0.823 0.948 1.016 0.812 0.877 0.926
HART+ 0.616 1.048 1.033 1.115 0.986 0.978 0.953

If Armstrong could ever get healthy and stay healthy he might actually be a useful player.  He has shown some offensive ability in the past and he can be a physical energy player which is something the Leafs desperately need.  Unfortunately his health is a big if.  I am not against having him on the Leafs next season but I would lump him with Lombardi and Connolly.  If you can move him, you do.  The Leafs really need to shed at least 2 of those contracts in order to free up cap space to fill the holes elsewhere.

Mike Brown

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 0.488 0.900 0.712 0.421 0.269 0.682 0.521
HARD+ 0.878 1.202 1.169 0.896 0.975 1.094 1.035
HART+ 0.683 1.051 0.941 0.659 0.622 0.888 0.778

Mike Brown has very little offensive ability, but he is defensively aware and is a guy who will throw his body around and stand up for his teammates.  He wasn’t a Ron Wilson type of player but I think you might see his role expanded a little under Randy Carlyle.  I am perfectly happy seeing him on the fourth line again next year.

Joey Crabb

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 1.063 1.139 1.058 1.046
HARD+ 0.977 0.969 0.808 0.995
HART+ 1.020 1.054 0.933 1.020

Joey Crabb deserves a lot of credit for really earning himself a roster spot.  He is a hard worker who will chip in offensively and seems to be at least reasonably defensively aware and perfectly capable of playing on almost any line as an injury fill in as needed.  I am not sure he is the kind of guy I’d write in as the permanent second line winger or permanent third line winger, but rather I’d continue to use him as he has been used the past couple seasons – the ideal 13th forward that actually plays a lot as he is the primary injury fill in regardless of which line the injured forward plays on though one could see him as a third line regular too.  He is a UFA but can probably be easily re-signed.

Nazem Kadri

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 1.093 0.741 0.836
HARD+ 1.595 1.436 1.469
HART+ 1.344 1.088 1.152

It sometimes irks me how Kadri gets criticized by Leaf management for not playing a complete game when a) so many other players on the roster do not seem to be expected to play to that same standard and b) statistically speaking he doesn’t appear to be a liability defensively.  It is time for the Leafs to give Kadri a full time role in the NHL and see what he can do.  Second line duty

Matt Frattin

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 1.001
HARD+ 1.020
HART+ 1.011

Frattin had a good rookie season as a defensively aware player who can chip in offensively from time to time.  I think he deserves full time 3rd line duty next year.

Jay Rosehill

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 0.159 0.628 0.899 0.463
HARD+ 1.318 0.569 0.653 0.832
HART+ 0.739 0.598 0.776 0.647

Yeah, he’s not good.  I’ll give him credit for his willingness to drop the gloves when asked to, but he is really a second rate fighter who can’t really get the job done as a player.

Colton Orr

2011-12 2010-11 2009-10 2008-09 2007-08 2009-12 (3yr) 2007-12 (5yr)
HARO+ 0.507 0.749 0.405 0.225 0.688 0.398
HARD+ 1.375 1.129 0.815 1.018 1.234 1.054
HART+ 0.941 0.939 0.610 0.622 0.961 0.726

Colton Orr actually seems like he might not be a huge defensive liability, at least if you play him in a fourth line role under protected minutes.  We’ve probably seen the end of Orr on the Leafs but you never know.  I am pretty sure Burke will be looking for a heavy weight who can contribute to add to the roster, but failing that maybe Orr is the guy.  Wouldn’t shock me.

Where does that leave us?

So, with the player evaluations complete, where does that leave us as far as a roster goes?  Well, if I had my choice I would like to see the following lines next season:

Left Wing Center Right Wing
Lupul  ???? Kessel
MacArthur Grabovski Kadri
Kulemin  ???? Frattin
Brown Steckel Crabb

Hopefully, one way or another, Connolly, Lombardi, and Armstrong are gone but that may be a tall order.  I wouldn’t be surprised if one of them gets bought out (Lombardi most likely) and if healthy maybe Armstrong can find a role on the team but realistically at least 2 of those contracts need to be shed if the Leafs are going to have the cap space to fill the holes at #1C and #3C.

For the #1 center role I would be looking for at least an established 60 point 2-way center, ideally with good speed and at least a little size and strength.  The #1 center hole is most likely to be addressed via trade as I don’t see that type of player in the free agent pool.  Travis Zajac has had a tough year with injuries but if the Leafs could somehow pry him away from the Devils I’d be more than happy with him in the #1C role.  He isn’t a big time offensive player, but he has played a top line role with the Devils and knows how to play a defensively aware game, and with the emergence of Adam Henrique and the financial woes of Devils ownership they may be looking for a cheaper option than Zajac provides.  There is also the possibility of this year’s draft pick at some point becoming the #1C, but I think realistically we are at least a year or two away from that.

For the #3C I would be looking for a solid defensive center with good speed and decent size, and if he can contribute 30 or more points that would be ideal.  With Kulemin and Frattin on the wings, both with good size and speed and some offensive ability and solid defensive awareness, you could have a perfect third line to match up against opposing top lines.  Jarrett Stoll is a UFA who might fit the bill, as might Paul Gaustad (he’d definitely add the size Burke is looking for).  Samuel Pahlsson is also a UFA and Burke and Carlyle are both very familiar with him, but at age 34 he might be a bit older than they are looking for.

Notice that I have filled in all of the winger positions.  I am not against making a trade to improve at the wing positions by adding more size (most likely in MacArthur’s spot) but I’d only do so after filling in the holes at #1C and #3C.  Furthermore, I really hope it is not Rick Nash: I think it will cost too much to acquire him, I don’t believe he is as good as many think he is, and his contract is long and very large.

Joe Colborne and Carter Ashton are two more young players who may need more development time but may challenge for a job or at least be injury fill-in candidates.  I didn’t see much from Ashton’s 15 games late in the season to tell me that he is ready for a regular job in the NHL and Colborne has had an up and down season with the Marlies.

Well, that is the Leaf forwards for you.  In my next post I’ll take a look at the defense and goaltending.

 

Nov 22 2011
 

I hate to keep beating the “Shooting Percentage Matters” drum but it really dumbfounds me why so many people choose to ignore it, or believe it is only a small part of the game and not worth considering, and instead focus their attention on corsi/fenwick and corsi/fenwick derived stats as their primary evaluation tool.

It dumbfounds me that people don’t think players have an ability to control shooting percentage yet we all seem to agree that shooting percentage is affected by game score.  Rob Vollman wrote the following in a comment thread at Arctic Ice Hockey.

<blockquote>The score can affect the stats because teams behave differently when chasing or protecting a lead…</blockquote>

He isn’t specifically referring to shooting percentage, but shooting percentage varies based on game score and I think most people accept that.  So, while people freely accept that teams can play differently depending on score, they seemingly choose not to believe that players can play differently depending on their role or skillset.  Or rather, it isn’t that they don’t believe players can play differently (for example they realize there are defensive specialists); they just choose not to accept that a player’s style of play (in addition to their talents, which often dictate their style of play) will affect their stats, including shooting percentage.  An example, which I brought up at The Puck Stops Here, is Marian Gaborik vs Chris Drury.  Both Gaborik and Drury played the past 2 seasons on the NY Rangers but Gaborik played an offensive role and Drury generally played a more defensive/3rd line role.  Here are their on-ice offensive stats at 5v5 over the past 2 seasons.

Gaborik Drury Gaborik’s Edge
Team Fenwick For per 20min WOI 13.8 12.8 +8%
Team Sh% For WOI 10.26% 6.18% +66%
Team Goals For per 20 min WOI 1.031 .575 +79%

Shooting percentage took what was a slight edge for Gaborik in terms of offensive fenwick for and turned it into a huge advantage in goals for.  Part of that is Gaborik and his line mates’ better skill level and part of it is their aggressive offensive style of play, but regardless of why, we need to take shooting percentage into account or else we will undervalue Gaborik at the offensive end of the rink and overvalue Drury.
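The way the two edges compound can be checked with a few lines of Python using the numbers from the table above. This is only a sketch: the `edge` helper is my own name, and because the shooting percentage row may be based on shots on goal while the first row counts fenwick events, the product should only approximately match the goals row.

```python
def edge(a: float, b: float) -> float:
    """Relative edge of a over b, e.g. 0.08 means +8%."""
    return a / b - 1.0


# Gaborik vs Drury, 5v5 on-ice numbers from the table above.
ff20_edge = edge(13.8, 12.8)      # shot generation edge, about +8%
sh_edge = edge(10.26, 6.18)       # shooting percentage edge, about +66%
gf20_edge = edge(1.031, 0.575)    # goals for edge, about +79%

# Goals are roughly (shot attempts) x (shooting percentage), so edges multiply.
combined = (1 + ff20_edge) * (1 + sh_edge) - 1
print(f"FF20 {ff20_edge:+.0%}, Sh% {sh_edge:+.0%}, "
      f"combined {combined:+.0%}, actual GF20 {gf20_edge:+.0%}")
```

Almost all of the +79% goal edge comes from the shooting percentage term; the shot generation term on its own would only be worth about +8%.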

It isn’t just Gaborik and Drury whose offense is significantly impacted by shooting percentage.  It happens all the time.  I took a look at all players that had 2000 5v5 even strength on-ice offensive fenwick events over the past 4 seasons.  From there I calculated their expected on-ice goals scored based on their ice time using league-wide average  on-ice fenwick for per 20 minutes (FF20) and league-wide average fenwick shooting percentage (FSH%).

I next calculated an expected goals figure based on the league-wide FF20 and the player’s FSH%, as well as an expected goals figure based on the player’s FF20 and the league-wide average FSH%.  When we compare these expected goals to the expected goals based solely on the league-wide averages we can get an idea of whether a player’s on-ice goal production is driven mostly by FF20 or FSH% or some combination of the two.
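The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the exact code behind the tables: the league averages and the sample player are made-up numbers, and the function names are mine.

```python
# Hypothetical league-wide averages, for illustration only.
LEAGUE_FF20 = 13.0    # on-ice fenwick for per 20 minutes
LEAGUE_FSH = 0.060    # fenwick shooting percentage (goals / fenwick events)


def expected_goals(minutes: float, ff20: float, fsh: float) -> float:
    """Expected on-ice goals given ice time, fenwick rate and fenwick sh%."""
    return minutes / 20.0 * ff20 * fsh


def influence(minutes: float, player_ff20: float, player_fsh: float):
    """Return (% change from FF20, % change from FSH%) vs the league baseline."""
    baseline = expected_goals(minutes, LEAGUE_FF20, LEAGUE_FSH)
    from_ff20 = expected_goals(minutes, player_ff20, LEAGUE_FSH) / baseline - 1.0
    from_fsh = expected_goals(minutes, LEAGUE_FF20, player_fsh) / baseline - 1.0
    return from_ff20, from_fsh


# Hypothetical player: 4000 minutes, FF20 of 14.3, FSH% of 7.5%.
ff20_effect, fsh_effect = influence(4000, 14.3, 0.075)
print(f"from FF20: {ff20_effect:+.1%}, from FSH%: {fsh_effect:+.1%}")
```

Note that ice time cancels out of both ratios, so each percentage is simply the player's rate divided by the league rate, minus one.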

The following players had their on-ice 5v5 goal production influenced the most positively or most negatively due to their on-ice 5v5 FSH%.

Player Name %Increase from FSH%
MARIAN GABORIK 40.6%
SIDNEY CROSBY 36.3%
ALEX TANGUAY 33.1%
HENRIK SEDIN 32.8%
BOBBY RYAN 32.5%
EVGENI MALKIN 31.9%
DANIEL SEDIN 31.6%
ILYA KOVALCHUK 30.6%
NATHAN HORTON 29.6%
J.P. DUMONT 29.4%
GREGORY CAMPBELL -12.4%
RYAN CALLAHAN -13.9%
RADEK DVORAK -15.6%
CHRIS DRURY -16.8%
SEAN BERGENHEIM -19.4%
SCOTT GOMEZ -19.7%
MARTIN HANZAL -21.5%
MIKE GRIER -21.5%
DANIEL WINNIK -24.5%
TRAVIS MOEN -32.1%

And the following players had their on-ice 5v5 goal production influenced the most positively or most negatively due to their on-ice 5v5 FF20.

Player Name %Increase from FF20
HENRIK ZETTERBERG 24.7%
ALEX OVECHKIN 21.7%
PAVEL DATSYUK 20.6%
TOMAS HOLMSTROM 19.9%
NICKLAS BACKSTROM 19.8%
ERIC STAAL 19.7%
RYANE CLOWE 18.8%
ALEXANDER SEMIN 18.3%
SCOTT GOMEZ 18.0%
ZACH PARISE 17.9%
MARTY REASONER -6.5%
ANDREW COGLIANO -6.5%
ANTTI MIETTINEN -6.7%
KYLE BRODZIAK -7.3%
CHRIS KELLY -8.6%
ILYA KOVALCHUK -9.8%
JAY MCCLEMENT -10.4%
MICHAL HANDZUS -14.4%
JOHN MADDEN -14.5%
TRAVIS MOEN -15.6%

Some interesting notes:

  1.  The range in the influence of FSH% is significantly larger than the range of influence of FF20 indicating that shooting percentage is more important than shot generation in terms of scoring goals.
  2. The FSH% list is not random.  The list is stratified.  Offensive players at the top, non-offensive players at the bottom (plus Scott Gomez who gets offensive minutes, but sucks).  What you see above is not luck.  There is order to the list, not randomness.
  3. Speaking of Gomez, he sucks at on-ice FSH%, but has a very good FF20, though that is partly due to offensive zone start bias.
  4. Ilya Kovalchuk is the anti-Gomez.  He has a great FSH%, but is horrible at helping his team generate shots.
  5. The standard deviation of the FSH% influence is 14.5% while it is 8.3% for FF20 influence so it seems FSH% has a much greater influence on scoring goals than FF20.  This is not inconsistent with some of my observations in the past or observations of others.
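As a quick check on notes 1 and 5, this Python sketch compares the spread of the two lists using only the extremes shown in the tables above and the two standard deviations quoted in note 5. The full player list is not reproduced here, so the standard deviations themselves are taken as given rather than recomputed.

```python
# Top and bottom entries from the two tables above (percent influence).
fsh_best, fsh_worst = 40.6, -32.1      # Gaborik, Moen
ff20_best, ff20_worst = 24.7, -15.6    # Zetterberg, Moen

fsh_range = fsh_best - fsh_worst       # spread of FSH% influence
ff20_range = ff20_best - ff20_worst    # spread of FF20 influence

# Standard deviations as quoted in note 5.
fsh_sd, ff20_sd = 14.5, 8.3

print(f"range: FSH% {fsh_range:.1f} vs FF20 {ff20_range:.1f} points "
      f"({fsh_range / ff20_range:.2f}x)")
print(f"std dev: FSH% {fsh_sd} vs FF20 {ff20_sd} ({fsh_sd / ff20_sd:.2f}x)")
```

By either measure the FSH% influence is spread out roughly 1.7 to 1.8 times as widely as the FF20 influence.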

So, what does all this mean?  Shooting percentage matters, and matters a lot, and thus drawing conclusions based solely on a corsi analysis is flawed.  It isn’t that generating shots and opportunities isn’t important, but that being great at it doesn’t mean you are a great player (Gomez) and being bad at it doesn’t make you a bad player (Kovalchuk).  For this reason I really cringe when I see people making conclusions about players based on a corsi analysis.  A corsi analysis will only tell you how good a player is at one aspect of the game, but is not very good at telling you the player’s overall value to his team.  My goal is, and always will be, to try and evaluate a player’s overall value and this is why I really dislike corsi analysis.  It completely ignores a significant, maybe the most significant, aspect of the game.  Furthermore, I believe that offensive ability and defensive ability should be evaluated separately, which many who do corsi analysis don’t do or only partially or subjectively do.

I really don’t know how many different ways I can show that shooting percentage matters a lot but there are still a lot of people who believe players can’t drive or suppress shooting percentage or believe that shooting percentage is a small part of the game that is dwarfed by the randomness/luck associated with it (which is only true if sample size is not sufficiently large).  The fact is corsi analysis alone will never give you an evaluation of a player’s overall ability and effectiveness that is reliable enough to base multi-million dollar contract offers on.  Shooting percentage matters, and matters a lot.  Ignore it at your peril.

 

Oct 27 2011
 

There has been a fair bit of discussion going on regarding shot quality the past few weeks among the hockey stats nuts.  It started with this article about defense independent goalie rating (DIGR) in the Wall Street Journal and several others have chimed in on the discussion so it is my turn.

Gabe Desjardins has a post today talking about his hatred of shot quality and how it really isn’t a significant factor and is dominated by luck and randomness.  Now, generally speaking when others use the term shot quality they are mostly talking about things like shot distance/location, shot type, whether it was on a rebound, etc. because that is all data that is relatively easily available or easily calculated.  When I talk shot quality I mean the overall difficulty of the shot including factors that aren’t measurable such as the circumstances (i.e. 2 on 1, one timer on a cross ice pass, goalie getting screened, etc.).  Unfortunately my definition means that shot quality isn’t easily calculated but more on that later.

In Gabe’s hatred post he dismisses pretty much everything related to shot quality in one get-to-the-point paragraph.

 

Alan’s initial observation – the likelihood of a shot going in vs a shooter’s distance from the net – is a good one.  As are adjustments for shot type and rebounds.  But it turned out there wasn’t much else there.  Why?  The indispensable JLikens explained why – he put an upper bound on what we could hope to learn from “shot quality” and showed that save percentage was dominated by luck.  The similarly indispensable Vic Ferrari coined the stat “PDO” – simply the sum of shooting percentage and save percentage – and showed that it was almost entirely luck.  Vic also showed that individual shooting percentage also regressed very heavily toward a player’s career averages.  An exhaustive search of players whose shooting percentage vastly exceeded their expected shooting percentage given where they shot from turned up one winner: Ilya Kovalchuk…Who proceeded to shoot horribly for the worst-shooting team in recent memory last season.

So, what Gabe is suggesting is that players have little or no ability to generate goals aside from their ability to generate shots.  Those who follow me know that I disagree.  The problem with a lot of shot quality and shooting percentage studies is that sample sizes aren’t sufficient to draw conclusions at a high confidence level.  Ilya Kovalchuk may be the only one that we can say is a better shooter than the average NHLer with a high degree of confidence, but it doesn’t mean he is the only one who is an above average shooter.  It’s just that we can’t say that about the others at a statistically significant degree of confidence.

Part of the problem is that goals are very rare events.  A 30 goal scorer is a pretty good player but 30 events is an extremely small sample size to draw any conclusions over.  Making matters worse, of the hundreds of players in the NHL only a small portion of them reach the 30 goal plateau.  The majority would be in the 10-30 goal range and I don’t care how you do your study, you won’t be able to say much of anything at a high confidence level about a 15 goal scorer.
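To give a feel for how little a single season tells us, here is a rough Python sketch of the uncertainty around one player's shooting percentage. It uses the textbook normal approximation to a binomial proportion, which is my choice for illustration and not a method taken from any of the studies mentioned here, and the 15 goals on 150 shots line is a hypothetical player.

```python
import math


def shooting_pct_interval(goals: int, shots: int, z: float = 1.96):
    """Approximate 95% interval for true shooting percentage.

    Uses the normal approximation to the binomial, which is crude for
    small goal counts but good enough to show the scale of the problem.
    """
    p = goals / shots
    se = math.sqrt(p * (1 - p) / shots)
    return p - z * se, p + z * se


# Hypothetical 15 goal scorer who took 150 shots (10% shooter on paper).
low, high = shooting_pct_interval(15, 150)
print(f"observed 10.0%, plausible range {low:.1%} to {high:.1%}")
```

With an interval that runs from roughly 5% to 15%, one season cannot separate a poor shooter from an elite one, which is exactly why the sample size needs to grow.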

The thing is though, just because you cannot say something at a high confidence level doesn’t mean it doesn’t exist.  What we need to do is find ways of increasing the sample size to increase our confidence levels.  One way I have done that is to use 4 years of data and, instead of using individual shooting percentage, use on-ice shooting percentage (this is useful in identifying players who might be good passers and have the ability to improve their linemates’ shooting percentage).  Just take the list of forwards sorted by on-ice 5v5 shooting percentage over the past 4 seasons.  The top of that list is dominated by players we know to be good offensive players and the bottom of the list is dominated by third line defensive role players.  If shooting percentage were indeed random we would expect some Moen and Pahlsson types to be intermingled with the Sedins and Crosbys, but generally speaking they are not.

A year ago Tom Awad did a series of posts at Hockey Prospectus on “What Makes Good Players Good.”  In the first post of that series he grouped forwards according to their even strength ice time.  Coaches are going to play the good players more than the not so good players so this seems like a pretty legitimate way of stratifying the players.  Tom came up with four tiers with the first tier of players being identified as the good players.  The first tier of players contained 83 players.  It will be much easier to draw conclusions at a high confidence level about a group of 83 players than we can about single players.  Tom’s conclusions are the following:

The unmistakable conclusions from this table? Outshooting, out-qualitying and out-finishing all contribute to why Good Players dominate their opponents. Shot Quality only represents a small fraction of this advantage; outshooting and outfinishing are the largest contributors to good players’ +/-. This means that judging players uniquely by Corsi or Delta will be flawed: some good players are good puck controllers but poor finishers (Ryan Clowe, Scott Gomez), while others are good finishers but poor puck controllers (Ilya Kovalchuk, Nathan Horton). Needless to say, some will excel at both (Alexander Ovechkin, Daniel Sedin, Corey Perry). This is not to bash Corsi and Delta: puck possession remains a fundamental skill for winning hockey games. It’s just not the only skill.

In that paragraph “shot quality” and “out-qualitying” is used to reference a shot quality model that incorporates things like shot location, out-finishing is essentially shooting percentage, and outshooting is self-explanatory.  Tom’s conclusion is that the ability to generate shots from more difficult locations is a minor factor in being a better player but both being able to take more shots and being able to capitalize on those shots is of far greater importance.

In the final table in his post he identifies the variation in +/- due to the three factors.  This is a very telling table because it gives us an indication of how much each factors into scoring goals.  The following is the difference in +/- between the top tier of players and the bottom tier of players:

  • +/- due to Finishing:  0.42
  • +/- due to shot quality:  0.08
  • +/- due to out shooting:  0.30

In percentages, finishing ability accounted for 52.5% of the difference, out shooting 37.5% of the difference and shot quality 10% of the difference.  Just because we can’t identify individual player shooting ability at a high confidence level doesn’t mean it doesn’t exist.
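Those percentages come straight from the three numbers in the list above, as this short Python sketch shows (the dictionary keys are just labels I picked):

```python
# Difference in +/- between top and bottom tier, from the list above.
gaps = {
    "finishing": 0.42,
    "shot quality": 0.08,
    "out shooting": 0.30,
}

total_gap = sum(gaps.values())
shares = {factor: gap / total_gap for factor, gap in gaps.items()}

for factor, share in shares.items():
    print(f"{factor}: {share:.1%}")
```

The three gaps sum to 0.80, which gives the 52.5%, 10% and 37.5% shares.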

If we use the above as a guide, it is fair to suggest that scoring goals is ~40% shot generation and ~60% the ability to capitalize on those shots (either through shot location or better shooting percentages from those locations).  Shooting percentage matters and matters a lot.  It’s just a talent that is difficult to identify.

A while back I showed that goal rates are better than corsi rates in evaluating players.  In that study I showed that with just 1 season of data goal for rates will predict future goal for rates just as well as fenwick for rates can, and with 2 years of data goal for rates significantly surpass fenwick for rates in terms of predictability.  I also showed that defensively, fenwick against rates are very poor predictors of future goal against rates (to the point of uselessness) while goals against rates were far better predictors of future goal against rates, even at the single season level.

The Conclusion:  There simply is no reliable way of evaluating a player statistically at even a marginally high confidence level using just a single year of data.  Our choices are either performing a Corsi analysis and doing a good job at predicting 40% of the game or performing a goal based analysis and doing a poor job at predicting 100% of the game.  Either way we end up with a fairly unreliable player evaluation.  Using more data won’t improve a corsi based analysis because sample sizes aren’t the problem, but using more data can significantly improve a goal based analysis.  This is why I cringe when I see people performing a corsi based evaluation of players.  It’s just not, and never will be, a good way of evaluating players.