Jul 12, 2011
 

Over the past couple of weeks I have had several comment discussions about my recent posts on player evaluation and Norris and Hart trophy candidates, all centering on which is the better method for evaluating players: corsi-based or goal-based evaluation.  A lot of people, maybe the majority of the advanced hockey stats community, prefer corsi-based analysis, while I prefer goal-based analysis, and I hope to explain why in this post.  I have explained much of this previously, but hopefully this post puts it all into one simple, easy-to-understand package.

There are two main objectives for a player when the coach puts him on the ice:  1.  Help his team score a goal.  2.  Help his team stop the opposing team from scoring a goal.  Depending on the situation and the player, the coach may prioritize one of those over the other.  For example, a defensive player may be tasked primarily with shutting down the opposing team's offensive players, and scoring a goal is really a very minor objective.  Late in a game, when a team is down a goal, the opposite is true: the primary objective, if not the sole objective, is to score a goal.

I think we can all agree on the previous paragraph.  Goals are what matter in hockey, so right there we have the #1 reason why goals should be used in player evaluation.  The problem is, goals are a relatively rare event, and thus 'luck' can have a serious impact on our player analysis results due to the small sample size that goals provide.  This brought on the concept of corsi, which is nothing more than shot attempts used as a proxy for scoring chances.  The benefit of corsi is that shot attempts occur about 10 times as often as goals, which gives us a larger sample size for evaluating players.

I think everyone should agree with everything I have written above.  The stuff below is where people disagree (mostly with me).  (Note: I am going to shift here from corsi to fenwick.  Corsi is shots + missed shots + blocked shots, while fenwick is shots + missed shots.  They are very highly correlated and in most ways interchangeable, but I have done my work with fenwick, so that is what I discuss below.)  The problem is, fenwick does not correlate well with goals, especially when compared to shooting percentage.
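To make those definitions concrete, here is a tiny sketch of the counting (the event list is invented, purely for illustration):

```python
# Toy illustration of the corsi/fenwick definitions. Event names here are
# made up for illustration; real data would come from NHL play-by-play feeds.

def count_events(events):
    """Count corsi (shots + misses + blocks) and fenwick (shots + misses)."""
    shots = events.count("shot")    # shots on goal (includes goals)
    misses = events.count("miss")   # shot attempts that missed the net
    blocks = events.count("block")  # shot attempts blocked by a defender
    corsi = shots + misses + blocks
    fenwick = shots + misses        # fenwick excludes blocked attempts
    return corsi, fenwick

events = ["shot", "miss", "block", "shot", "block", "miss", "shot"]
print(count_events(events))  # (7, 5)
```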

Comparison        Forwards   Defensemen
FF20 vs GF20      0.465      0.315
FenSh% vs GF20    0.827      0.607
FA20 vs GA20      0.221      0.199
FenSv% vs GA20    0.631      0.674
FF% vs GF%        0.349      0.325
FenPDO vs GF%     0.641      0.549

The above table is calculated using players who played at least 500 minutes in each of the past 4 seasons, and it uses the full 4 seasons of data.  FF20 and FA20 are fenwick for and fenwick against per 20 minutes of ice time.  GF20 and GA20 are goals for and goals against per 20 minutes of ice time.  FenSh% is fenwick shooting percentage while on the ice (GF/FF) and FenSv% is fenwick save percentage while on the ice (1 - GA/FA).  FF% is FF/(FF+FA), GF% is GF/(GF+GA), and FenPDO is FenSh% + FenSv%.  The numbers are r^2.
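As a worked example, here is a minimal sketch of how these rate stats are calculated from a player's raw on-ice totals (the totals below are invented for illustration; this is not code from the site):

```python
# On-ice rate stats as defined above; toi is 5v5 time on ice in minutes.
def rate_stats(gf, ga, ff, fa, toi):
    per20 = 20.0 / toi
    fensh = gf / ff       # fenwick shooting % while on ice (GF/FF)
    fensv = 1 - ga / fa   # fenwick save % while on ice (1 - GA/FA)
    return {
        "GF20": gf * per20, "GA20": ga * per20,   # goals for/against per 20 min
        "FF20": ff * per20, "FA20": fa * per20,   # fenwick for/against per 20 min
        "FenSh%": fensh, "FenSv%": fensv,
        "FenPDO": fensh + fensv,
        "FF%": ff / (ff + fa), "GF%": gf / (gf + ga),
    }

# Invented totals for one player-season: ~1050 minutes of 5v5 ice time.
print(rate_stats(gf=45, ga=38, ff=700, fa=640, toi=1050))
```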

Clearly, from the above table, goal rates are far more highly correlated with shooting/save percentages than they are with fenwick rates.  Generally speaking, players with better shooting percentages score goals more frequently than players who shoot more frequently.  Similarly, players with better on-ice save percentages give up fewer goals than players who allow fewer shots against.  In summary, capitalizing on opportunities has a greater impact on scoring goals than generating opportunities.  I am not sure everyone agrees with this, but I don't see how you can come to any other conclusion.  So my question is: if shooting percentage is so important, why would we ignore it by looking only at corsi?

This is where people bring up the persistence argument.  They argue that corsi is far more persistent from year to year (i.e. someone with a high corsi one year will have a high corsi the following year) than save percentage or goal production.  They will argue that the influence of luck on the small sample size of goal scoring dominates any talent differences among players, and thus makes studying save percentages and goal production useless.  This may surprise some, but I agree 100%.  When the sample sizes are small enough, luck dominates differences in talent, and corsi will do a better job of player evaluation.  Where we disagree is on what counts as a small sample size.

You see, a lot of the studies of this don't even look at year-over-year stats; rather, they compare the first half of a single season to the second half, or even-numbered games to odd-numbered games, and many of them don't put very stringent restrictions on ice time.  At best, under these scenarios, you are comparing performances over 41 games and maybe 20 minutes of 5v5 ice time.  Under these scenarios I agree: using shooting percentages or goal production to evaluate players is useless because it is so luck driven.  But what if we increased the sample size?  What if we compared a full season to the following season, or 2 years to the following 2 years?

So that is what I did.  I looked at year-over-year and 2-year-over-2-year comparisons to see how well corsi stats and goal stats predict future goal stats.  I also restricted the study to players with at least 500 minutes of 5v5 ice time in each of the last 4 seasons (an average of about 6 minutes per game over 82 games played).  These are the r^2 results I got, split into forwards and defensemen.

Forwards:

Year(s) vs Year(s)    FF20 to GF20   GF20 to GF20   FA20 to GA20   GA20 to GA20   FF% to GF%   GF% to GF%
2007-08 to 2008-09    0.157          0.149          0.070          0.080          0.1243       0.0698
2008-09 to 2009-10    0.188          0.219          0.001          0.011          0.093        0.0648
2009-10 to 2010-11    0.267          0.241          0.014          0.045          0.123        0.1208
Average               0.204          0.203          0.029          0.046          0.113        0.085
2007-09 vs 2009-11    0.248          0.389          0.001          0.018          0.0879       0.0925

Defense:

Year(s) vs Year(s)    FF20 to GF20   GF20 to GF20   FA20 to GA20   GA20 to GA20   FF% to GF%   GF% to GF%
2007-08 to 2008-09    0.043          0.048          0.050          0.017          0.0233       0.0039
2008-09 to 2009-10    0.063          0.002          0.066          0.024          0.1674       0.0144
2009-10 to 2010-11    0.110          0.001          0.048          0.073          0.1148       0.0605
Average               0.072          0.017          0.055          0.038          0.102        0.026
2007-09 vs 2009-11    0.083          0.057          0.045          0.030          0.1132       0.0299
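For reference, here is a minimal sketch of how these year-over-year r^2 values can be computed (the arrays are placeholders with one value per qualifying player, not the actual data):

```python
import numpy as np

def r_squared(predictor, outcome):
    """Square of the Pearson correlation between predictor and outcome."""
    r = np.corrcoef(predictor, outcome)[0, 1]
    return r ** 2

# One value per qualifying player (>= 500 min of 5v5 ice time each season);
# the numbers below are placeholders, not real data.
ff20_year1 = np.array([14.1, 13.2, 12.8, 15.0, 13.6])  # fenwick for /20, year 1
gf20_year1 = np.array([0.95, 0.80, 0.78, 1.05, 0.83])  # goals for /20, year 1
gf20_year2 = np.array([0.90, 0.82, 0.75, 1.10, 0.80])  # goals for /20, year 2

print("FF20 -> GF20:", r_squared(ff20_year1, gf20_year2))
print("GF20 -> GF20:", r_squared(gf20_year1, gf20_year2))
```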

Here are some observations:

1.  For forwards, the year-over-year results are quite similar for fenwick-to-goals and goals-to-goals.  In other words, with 1 year of data there isn't a significant difference between using goal-based and corsi-based predictors of future offensive performance.  But when we increase to 2 years, goal-based stats are a far better predictor (and one can assume that 3 or even 4 years would be better still).

2.  Using either corsi-against or goals-against stats to predict next year's corsi-against or goals-against stats is completely useless, even in the 2-year vs 2-year scenario.  This in turn makes predicting the following season's corsi and goal ratios mostly useless.

3.  For defensemen, predicting the following season's offense or defense is pretty much useless, but what little predictability exists may be slightly better with corsi as the predictor.

What all this tells me is that individual forwards can drive offense but the majority of defensemen can't.  Furthermore, generally speaking, individual players can't influence team defense.  This isn't to say that no defenseman can influence offense or that no player can influence team defense, but the majority can't.  The majority are mostly indistinguishable from each other, with stats that are more a product of who they play with and against.

Now, if we go back to the 4-year 2007-11 data and look at the variation in GF20 and GA20 for forwards and defensemen with >3000 minutes of ice time, we should see much more variance in GF20 for forwards than in GA20 for forwards or in GF20 and GA20 for defensemen.  Not surprisingly, we do, as shown in the following chart.

The above chart shows the GF20 and GA20 for forwards and defensemen, sorted from highest to lowest to better show the distribution of observations.  For those more numerically inclined, here are the means and standard deviations.

       GF20 (Forwards)   GA20 (Forwards)   GF20 (Defense)   GA20 (Defense)
Mean   0.844             0.775             0.776            0.764
Stdev  0.142             0.084             0.081            0.080

There is a lot more variance in a forward's offensive performance than in a forward's defensive performance or in a defenseman's offensive or defensive performance.  This confirms the conclusions drawn above in the year-vs-year predictability study, and it is consistent with another study I did a while back.

For interest's sake, let's look at the same chart and table for the fenwick observations.

       FF20 (Forwards)   FA20 (Forwards)   FF20 (Defense)   FA20 (Defense)
Mean   13.88             13.41             13.45            13.41
Stdev  1.13              0.87              0.88             0.85

As with the goal rates, there is somewhat greater variance in fenwick-for rates for forwards, while fenwick against for forwards and fenwick for and against for defensemen all have more or less the same mean and standard deviation.
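For completeness, a minimal sketch of how the summary rows of these tables can be computed (the values are placeholders, not the real >3000 minute sample):

```python
import numpy as np

# One GF20 value per qualifying player (>3000 minutes); placeholder values.
gf20_forwards = np.array([1.10, 0.98, 0.90, 0.84, 0.78, 0.70, 0.62])

print("sorted high to low:", np.sort(gf20_forwards)[::-1])  # as in the chart
print("mean: ", gf20_forwards.mean().round(3))
print("stdev:", gf20_forwards.std(ddof=1).round(3))  # sample standard deviation
```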

Ok, so what does this all mean?  Here are a few things to keep in mind.

The More the Better:  More data is better.  2 years of data is certainly better than 1, and 3 or even 4 years is most likely better still.

Goal data is better than Corsi data:  Once we have more than a year's worth of data, goal-based stats are certainly the better indicator of a player's talent level.

Forwards drive Offense:  The biggest variation in talent is in the offensive ability of forwards.  Few defensemen have a significant impact on their team's offensive performance.

Team Defense:  Generally speaking, defense is a team thing more than an individual thing.  Few individual players have a significant impact, positive or negative, on the defensive side of the game.

An individual's goals for and against stats are only part of the story, though.  It is also important to take into account quality of teammates and quality of opponents.  I have attempted to do this by looking at the goals for and against rates of a player's teammates and opponents when they are not on the ice at the same time as the player being evaluated, so the player himself doesn't influence his own quality-of-teammates and quality-of-competition values.  I then use an iterative technique to standardize player ratings across the league, producing a set of hockey analysis ratings which can be seen at stats.hockeyanalysis.com.

HARO+ – Hockey Analysis Rating – Offense

HARD+ – Hockey Analysis Rating – Defense

HART+ – Hockey Analysis Rating – Total (average of HARO+ and HARD+)

The + indicates the rating was derived from the iterative process.  You can also find HARO, HARD and HART ratings at stats.hockeyanalysis.com, which adjust for quality of teammates and competition but not through an iterative process and thus, in theory anyway, are not as good.  For these ratings, a rating of 1.00 indicates that when the player is on the ice his performance should be about league average.  For example, a player with a HARO+ of 1.00 should contribute goals at approximately the league average rate.  Anything above 1.00 is better than average and anything below 1.00 is worse.  Because good players get more ice time and thus pull up the league average goal rates, far more players are below 1.00 than above.
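To give a flavour of the iterative process, here is a heavily simplified sketch.  It illustrates the general idea of re-estimating ratings against opponent strength until they settle; it is not the exact implementation behind the + ratings, and all the numbers are invented:

```python
import statistics

# Heavily simplified illustration of an iterative opponent adjustment.
# Idea: a player's offensive rating is his on-ice GF20 relative to league
# average, inflated for facing good defenses; his defensive rating is the
# mirror image. Ratings are re-estimated until they settle.

LEAGUE_AVG = 0.80  # assumed league-average 5v5 goals per 20 minutes

players = {
    # name: (on-ice GF20, on-ice GA20, usual opponents) -- all invented
    "A": (1.00, 0.70, ["B", "C"]),
    "B": (0.70, 0.90, ["A", "C"]),
    "C": (0.80, 0.80, ["A", "B"]),
}

haro = {p: 1.0 for p in players}  # offensive ratings, start at average
hard = {p: 1.0 for p in players}  # defensive ratings, start at average

for _ in range(100):
    new_haro, new_hard = {}, {}
    for p, (gf20, ga20, opps) in players.items():
        opp_d = statistics.mean(hard[o] for o in opps)  # opponents' defense
        opp_o = statistics.mean(haro[o] for o in opps)  # opponents' offense
        new_haro[p] = (gf20 / LEAGUE_AVG) * opp_d  # scoring on good D counts more
        new_hard[p] = (LEAGUE_AVG / ga20) * opp_o  # stifling good O counts more
    # renormalize so the average rating stays at 1.00 (a simplification)
    mo = statistics.mean(new_haro.values())
    md = statistics.mean(new_hard.values())
    haro = {p: v / mo for p, v in new_haro.items()}
    hard = {p: v / md for p, v in new_hard.items()}

for p in players:
    print(p, round(haro[p], 3), round(hard[p], 3), round((haro[p] + hard[p]) / 2, 3))
```

A real implementation would work from actual head-to-head ice time with every teammate and opponent; the toy above just averages a fixed opponent list, but the principle of iterating to a stable set of league-standardized ratings is the same.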

So, how do I use all this information to evaluate players?  If I were asked to evaluate a player, this is the process I follow, and I'll use recent UFA signing Brad Richards as an example.

1.  First, I take a look at the 5v5 ratings for all players at his position over the past 3 or 4 seasons.  For Brad Richards, that means looking at all forwards with >2000 minutes of 5v5 ice time over the past 4 seasons, sorting by HART+, and seeing where Richards ranks to get a general idea of his performance.  Richards ranks 196th of 310 forwards with a HART+ of 0.931, so not so good.

2.  I then look at his HARO+ and HARD+ ratings to see whether he is stronger at one particular aspect of the game.  Richards' 4-year HARO+ rating is 1.03 and his 4-year HARD+ rating is 0.833, so he is definitely a stronger offensive player than a defensive one.  I'll sometimes sort by HARO+ or HARD+ to see where he ranks there as well: he is 87th in HARO+ and 301st in HARD+, so he is OK offensively and one of the worst defensively.

3.  Next, I look at Brad Richards' year-by-year stats on his player card at stats.hockeyanalysis.com.  I look at how his single-season (or 2-year) HARO+ and HARD+ ratings have varied over the past 4 seasons to get an idea of his consistency.  Richards' HARO+ ratings from 2007-08 to 2010-11 were 0.882, 1.134, 1.267 and 1.300, so his 2007-08 season was clearly an anomaly and he is probably a better offensive player than his 4-year HARO+ indicates.  His 3-year HARO+ rating (2008-09 to 2010-11) is 1.203, which is quite good.  His HARD+ ratings over the same 4 seasons were 0.784, 0.738, 0.744 and 1.051.  Last season looks like an anomaly on the defensive side of the game, so unless he has suddenly learned (or chosen) to play defense, one can expect his defensive numbers to fall dramatically next season.  Overall, it is probably safe to conclude that Richards is a good to very good offensive player and a very weak defensive player.

4.  Depending on the player, I may then look up his 4-year 5v4 power play HARO+ rating or his 4v5 penalty kill HARD+ rating to gauge his performance on the PP and PK.  Richards' PP HARO+ rating is 1.182, which is good, and his 4v5 PK HARD+ rating is 0.678, which is weak.  It is also important to look at a player's PP or PK ice time to see how relevant these ratings are.

5.  Every now and again I'll drill down further and look at a player's 5v5 game-tied ratings, or his up-a-goal or down-a-goal ratings (all of which can be found at stats.hockeyanalysis.com).  For example, Brad Richards' 4-year game-tied ratings are 1.099 for HARO+ and 0.808 for HARD+, for a HART+ of 0.954, fairly consistent with his overall 5v5 ratings.

6.  Often I'll take a closer look at a player's quality of competition to see how tough his minutes are.  I do this by sorting the 4-year data by OppGF20 or OppGA20 to see whether the player faced exceptionally offensive or exceptionally defensive opponents.  Richards ranks 239th in OppGF20, so he didn't play against especially talented offensive players, but 23rd in OppGA20, so he faced very good defensive opponents.

And that, generally speaking, is how I evaluate players.  For players with fewer years in the league I use as much data as there is.  For players with fewer than 2 years in the league (and certainly those with only 1), I take the ratings with a grain of salt, knowing they can be unreliable.  For young players I pay special attention to any upward trend in their seasonal ratings as an indication that they are improving, and for older players I watch for downward trends that suggest their careers are on the decline.
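Here is a crude sketch of that trend check, using Richards' HARO+ numbers quoted above (the 5% thresholds are arbitrary, purely for illustration):

```python
# Crude trend flag: compare a player's newest seasonal rating to his oldest.
# Ratings are listed oldest to newest; the 5% tolerance is arbitrary.

def trend(seasonal_ratings, tol=0.05):
    first, last = seasonal_ratings[0], seasonal_ratings[-1]
    if last > first * (1 + tol):
        return "improving"
    if last < first * (1 - tol):
        return "declining"
    return "flat"

print(trend([0.882, 1.134, 1.267, 1.300]))  # Richards' HARO+ -> "improving"
```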

For interest's sake, I also have corsi and fenwick ratings at stats.hockeyanalysis.com, so you can see how a player performs according to his corsi or fenwick stats.  Brad Richards' FenHARO+ is 0.913, which would make him a weak offensive player, and his FenHARD+ rating is 0.993, which would make him about an average defensive player.  That is a very different story than the goal-based ratings told us.  I know which I have more faith in.

And that is how I evaluate players.

  5 Responses to “How I Evaluate Players (and Why)”

  1.  

    It looks to me that for forwards it was only GA that had a better r^2 with goals than with Fenwick. The GF was essentially equal .203 vs .204. Am I missing something? You proved that goals are just as good at predicting goals as Fenwick is in a 4 yr sample but not that they are better at predicting it. Of course extending the sample even further may make goals a better predictor but at 4yrs it is only ‘just as good’ not better.

    •  

      It looks to me that for forwards it was only GA that had a better r^2 with goals than with Fenwick. The GF was essentially equal .203 vs .204.

      Using one year vs one year, yes, they are essentially equal. But using 2-year 2007-09 data to predict 2-year 2009-11 data, GF20 is far better than FF20 at predicting future GF20 (0.389 vs 0.248). One can presume that at 3 years it is even better (though I don’t have 6 years of data handy to test that), and maybe 4 years better still (though at some point career progression would begin to affect things).

  2.  

    Hi David
    Great blog. I love your very comprehensive stats site.

    I am a little confused about the results of 2 of your studies, which appear contradictory to me. In a table above you show an average year-over-year correlation of only 0.046 for forwards’ GA20 to GA20. This of course implies virtually no predictive power. However, your blog entry “Goal rates better than Corsi…” dated May 30th has a defensive table that shows the average GA20 to GA20 correlation to be 0.462. This implies significantly more predictive power. I believe that in both studies you are using forwards with at least 500 mins EV 5vs5 ice time per year over the last 4 years. I would be very appreciative if you could clarify the difference.

    Thanks
    Rick

    •  

      Hi Rick. Thanks for the comment. I took a quick look and the short answer is I don’t know. It could be a mistake. I’ll have to go back and check my work again and I’ll let you know. Give me a day or two.

  3.  

    I’ve briefly looked over your system before and I think it has some merit; if done correctly it *might* be able to fairly and accurately evaluate a player with a smaller data set (i.e. fewer games), because your method assigns credit or blame directly to the players involved. My statistical method simply credits or blames everyone on the ice, regardless of whether they were directly involved in the play. The expectation is that, over time, players who actually drive results will slowly rise to the top, or fall to the bottom.

    The problem I have with your assign-credit-or-blame method is twofold.

    1. The amount of blame to assign is subjective. You may be able to assign blame for a goal against to a couple players directly involved in the play, but shouldn’t the player who turned the puck over in the neutral zone allowing the opposition to enter the offensive zone in the first place take some blame too, even if the goal took place 20 seconds later? Not all goals result only from the 3-5 seconds of play immediately before the goal was scored.

    2. Generally speaking I have little faith in humans being reliable recording devices, especially when multiple humans are involved. It is one thing for Roger Nielson to record scoring chances for his own players for every game he watches but it is another thing to expect 30 different people watching games for 30 different teams to record scoring chances in exactly the same manner over the course of 82 games. Just look at the hits/giveaway/takeaway stats on nhl.com. They are so flawed that they are all but useless. Without reliable data for all 30 teams it is difficult to take into account quality of teammates and quality of competition.

    What I like about using a goal based method is there is no bias involved whatsoever. A goal is a goal is a goal. We know it is a goal because it went in the net, and we even verified it by video replay. I believe some people have even speculated that shot data is not 100% bias free, but goals are. Furthermore, that guy who turned the puck over at center ice gets partial blame. If turning the puck over is a frequent problem for him, it will eventually show up in his stats. The downside of course is it takes probably 2 years to get somewhat reliable results, and 3 or 4 years is probably best.
