Persistence and Predictability
There seems to be some confusion about my post on corsi vs shooting percentage vs shooting rate the other day, so let me clear it up in as straightforward a way as I can.
“Hawerchuk” over at BehindTheNetHockey.com writes the following:
“I’m not totally sure what he’s getting at. People use Fenwick because it’s persistent, and PDO because it’s not. Over the course of a single season, observed shooting and save percentage drive results, but they are not persistent.”
Dirk Hoag over at OnTheForecheck.com writes:
“Here’s an example of when NOT to use correlation as a tool in statistical analysis (when the variables in question are linked by definition). David makes a bad blunder here, by looking at scoring leaders, seeing a bunch of high shooting percentages, and concluding that shooting percentage is the true “talent”. The problem is that shooting percentage swings wildly from season to season, whereas shooting rates are much more consistent.”
The great advantage corsi/fenwick has over goals as an evaluator of talent is the larger sample size associated with it. The larger the sample size, the more confidence we can have in any conclusions we draw from it and the less chance that ‘luck’ messes things up. Year over year shooting percentage fluctuates a lot, but that doesn’t necessarily mean that it isn’t a talent or doesn’t have persistence; it could mean that the sample size of one year is too small. The four year shooting percentage leader board seems to identify all the top offensive players, so it can’t be completely random. So what happens if we increase the sample size? Here are correlations of on-ice fenwick shooting percentages in 5v5 even strength situations for forwards:
|Year(s) vs Year(s)||Correlation|
|200708 vs 200809||0.249|
|200809 vs 200910||0.268|
|200910 vs 201011||0.281|
|200709 vs 200911 (2yr)||0.497|
As you can see, there isn’t a lot of persistence year over year, but for 2 years vs 2 years we are starting to see some persistence. It is still not at the level of corsi/fenwick, but it is certainly not nonexistent either. And because shooting percentage correlates more strongly with goal scoring, fenwick shooting percentage is on par with fenwick as a predictor of future goal scoring performance when we have 2 seasons of data, as I pointed out in my last post.
For the record, year over year correlation for fenwick for rate (FenF20) is approximately 0.60 depending on years used, and the 2 year vs 2 year correlation is 0.66.
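To illustrate why a low year over year correlation does not rule out talent, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: each player gets a fixed "true" on-ice shooting percentage drawn from a normal distribution (mean 6%, standard deviation 0.6%), a season is taken to be 600 on-ice unblocked shots, and every shot is an independent coin flip. None of these numbers are fitted to the data above. The point is only that the same underlying talent yields a higher correlation between two samples when each sample is larger.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def observed_pct(true_pct, shots, rng):
    """Observed shooting percentage over `shots` independent attempts."""
    goals = sum(rng.random() < true_pct for _ in range(shots))
    return goals / shots


def split_correlation(n_players, shots_per_sample, rng,
                      mean_talent=0.06, sd_talent=0.006):
    """Correlation between two independent samples of the same players.

    mean_talent and sd_talent are illustrative assumptions, not estimates.
    """
    # Each simulated player keeps the same true talent in both samples.
    talents = [min(max(rng.gauss(mean_talent, sd_talent), 0.0), 1.0)
               for _ in range(n_players)]
    first = [observed_pct(t, shots_per_sample, rng) for t in talents]
    second = [observed_pct(t, shots_per_sample, rng) for t in talents]
    return pearson(first, second)


if __name__ == "__main__":
    rng = random.Random(0)
    one_season = split_correlation(800, 600, rng)     # assumed shots per season
    two_seasons = split_correlation(800, 1200, rng)   # twice the sample
    print(f"1yr vs 1yr: {one_season:.3f}")
    print(f"2yr vs 2yr: {two_seasons:.3f}")
```

In this toy model the talent never changes between samples, so any gap between the two printed correlations comes purely from sample size.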
But as I pointed out in my previous post, you would probably never use shooting percentage as a predictor, because you may as well use goal rate instead, which has the same sample size limitations as shooting percentage but also factors in fenwick rate. Year over year correlation of GF20 (goals for per 20 minutes) is approximately 0.45 depending on years used, and the 2 year vs 2 year correlation is 0.619. So GF20 has persistence, and because it is the very thing we are trying to predict (it has a 100% correlation with itself), it is as reliable a predictor of future goal scoring rates as fenwick rate (or more so) with just one year of data, and a better predictor when using 2 years of data. Let me repost the pertinent table of correlations:
|Year(s) vs Year(s)||FenF20 to GF20||GF20 to GF20|
|200708 vs 200809||0.396||0.386|
|200809 vs 200910||0.434||0.468|
|200910 vs 201011||0.516||0.491|
|200709 vs 200911 (2yr)||0.498||0.619|
|200709 vs 200910 (2yr vs 1yr)||0.479||0.527|
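For anyone wanting to reproduce this kind of table, here is a rough sketch of the calculation. The player names and on-ice totals are made up purely for illustration, and the helper names (`per20`, `predictor_table`) are my own, not from any stats package. It assumes each player has on-ice 5v5 goals for, fenwick events for (unblocked shot attempts), and minutes played in each period. It also shows why goal rate "factors in" fenwick rate: GF20 is just FenF20 multiplied by fenwick shooting percentage.

```python
def per20(events, toi_minutes):
    """Events per 20 minutes of ice time."""
    return 20.0 * events / toi_minutes


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def predictor_table(first, second):
    """Correlate period-one FenF20 and GF20 with period-two GF20."""
    players = sorted(first.keys() & second.keys())  # players in both periods
    fenf20_first = [per20(first[p][1], first[p][2]) for p in players]
    gf20_first = [per20(first[p][0], first[p][2]) for p in players]
    gf20_second = [per20(second[p][0], second[p][2]) for p in players]
    return {
        "FenF20 to GF20": pearson(fenf20_first, gf20_second),
        "GF20 to GF20": pearson(gf20_first, gf20_second),
    }


# Hypothetical on-ice 5v5 totals: (goals for, fenwick for, minutes played).
period_one = {
    "Player A": (62, 880, 1150.0),
    "Player B": (41, 700, 1020.0),
    "Player C": (35, 640, 980.0),
    "Player D": (50, 690, 1060.0),
}
period_two = {
    "Player A": (58, 860, 1120.0),
    "Player B": (44, 720, 1040.0),
    "Player C": (30, 650, 990.0),
    "Player D": (47, 700, 1050.0),
}

if __name__ == "__main__":
    goals, fenwick, minutes = period_one["Player A"]
    # GF20 = FenF20 * fenwick shooting percentage (goals / fenwick events).
    print(per20(goals, minutes), per20(fenwick, minutes) * goals / fenwick)
    for label, r in predictor_table(period_one, period_two).items():
        print(f"{label}: {r:.3f}")
```

With only four invented players the correlations themselves mean nothing; a real run would use every forward above some minimum ice time threshold.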
The conclusion is: when dealing with less than a year’s worth of data, fenwick/corsi is probably the better metric to identify talent and predict future performance; with one year’s worth of data the two are about on par with each other; and with anything more than a year, goals for rate is the better metric.
Note: This is only true for forwards. The same observations do not hold for defensemen, where we see very little persistence or predictability in any of these metrics, I presume because the majority of them don’t drive offense to any significant degree.