How are HockeyAnalysis ratings (HARO, HARD, HART) calculated?
Every now and again someone asks me how I calculate the HARO, HARD and HART ratings that you can find on stats.hockeyanalysis.com, and it is at that point I realize that I don’t have an up-to-date description of how they are calculated. So today I endeavor to write one.
First, let me define HARO, HARD and HART.
HARO – Hockey Analysis Rating Offense
HARD – Hockey Analysis Rating Defense
HART – Hockey Analysis Rating Total
So my goal when creating them was to create an offensive, a defensive and an overall total rating for each and every player. Now, here is a step-by-step guide as to how they are calculated.
Calculate WOWY’s and AYNAY’s
The first step is to calculate WOWY’s (With Or Without You) and AYNAY’s (Against You or Not Against You). You can find goal and corsi WOWY’s and AYNAY’s on stats.hockeyanalysis.com for every player for 5v5, 5v5 ZS adjusted and 5v5 close zone start adjusted situations. I calculate them for every situation you see on stats.hockeyanalysis.com, and for shots and fenwick as well, but those don’t get posted because they amount to a massive amount of data.
(Distraction: 800 players playing against 800 other players means 640,000 data points for each of TOI, GF20, GA20, SF20, SA20, FF20, FA20, CF20 and CA20 when players are playing against each other and apart from each other per season and situation, or about 17.28 million data points for AYNAY’s for a single season per situation. Now consider that when I do my 5 year ratings there are more like 1600 players, generating more than 60 million data points.)
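The arithmetic behind those totals can be checked with a few lines of Python. This is only a sketch: the factor of three (together, player apart from opponent, opponent apart from player) is an assumption inferred from the 17.28 million figure, and the function name is made up for illustration.

```python
# Back-of-the-envelope count of AYNAY data points per season and situation.
METRICS = ["TOI", "GF20", "GA20", "SF20", "SA20", "FF20", "FA20", "CF20", "CA20"]

# Assumption: each metric is stored for three states per player pair
# (against each other, player apart from opponent, opponent apart from player).
STATES = 3


def aynay_data_points(num_players: int) -> int:
    """Number of stored values for every player-versus-player pairing."""
    pairs = num_players * num_players
    return pairs * len(METRICS) * STATES


print(aynay_data_points(800))   # 17280000
print(aynay_data_points(1600))  # 69120000
```

With 1600 players the count roughly quadruples, which is where the "more than 60 million" figure comes from.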
Calculate TMGF20, TMGA20, OppGF20, OppGA20
What we need the WOWY’s for is to calculate TMGF20 (a TOI with weighted average GF20 of the player’s teammates when his teammates are not playing with him), TMGA20 (a TOI with weighted average GA20 of the player’s teammates when his teammates are not playing with him), OppGF20 (a TOI against weighted average GF20 of the player’s opponents when his opponents are not playing against him) and OppGA20 (a TOI against weighted average GA20 of the player’s opponents when his opponents are not playing against him).
So, let’s take a look at Alexander Steen’s 5v5 WOWY’s for 2011-12 to look at how TMGF20 is calculated. The columns we are interested in are the Teammate when apart TOI and GF20 columns which I will call TWA_TOI and TWA_GF20. TMGF20 is simply a TWA_TOI (teammate while apart time on ice) weighted average of TWA_GF20. This gives us a good indication of how Steen’s teammates perform offensively when they are not playing with Steen.
TMGA20 is calculated the same way but using TWA_GA20 instead of TWA_GF20. OppGF20 is calculated in a similar manner except using OWA_GF20 (Opponent while apart GF20) and OWA_TOI while OppGA20 uses OWA_GA20.
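As a rough illustration of the weighted average described above, here is a minimal Python sketch. The function name and the teammate numbers are made up for the example; they are not Steen’s actual WOWY data.

```python
from typing import Sequence


def toi_weighted_average(toi: Sequence[float], rate: Sequence[float]) -> float:
    """Average of `rate` where each entry is weighted by its time on ice."""
    total_toi = sum(toi)
    if total_toi == 0:
        raise ValueError("total TOI is zero, weighted average is undefined")
    return sum(t * r for t, r in zip(toi, rate)) / total_toi


# Hypothetical "while apart" numbers for three teammates (illustration only).
twa_toi = [600.0, 300.0, 100.0]   # minutes each teammate played apart from the player
twa_gf20 = [0.80, 0.70, 1.00]     # each teammate's GF20 during those minutes

tmgf20 = toi_weighted_average(twa_toi, twa_gf20)
print(round(tmgf20, 3))  # 0.79
```

TMGA20, OppGF20 and OppGA20 would reuse the same function with TWA_GA20, or with the OWA_TOI and OWA_GF20/OWA_GA20 columns.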
The reason I use the “while not playing with/against” data is that I don’t want the talent level of the player we are evaluating to influence his own QoT and QoC metrics (which is essentially what TMGF20, TMGA20, OppGF20 and OppGA20 are).
Calculate first iteration of HARO and HARD
The first iteration of HARO and HARD is simple. I first calculate an expected GF20 and an expected GA20 based on the player’s teammates and opposition.
ExpGF20 = (TMGF20 + OppGA20)/2
ExpGA20 = (TMGA20 + OppGF20)/2
Then I calculate HARO and HARD as a percentage improvement:
HARO(1st iteration) = 100*(GF20-ExpGF20) / ExpGF20
HARD(1st iteration) = 100*(ExpGA20 - GA20) / ExpGA20
So, a HARO of 20 would mean that when the player is on the ice, the goal rate of his team is 20% higher than one would expect based on how his teammates and opponents performed during the time when the player is not on the ice with/against them. Similarly, a HARD of 20 would mean the goals against rate of his team is 20% better (lower) than expected.
(Note: The OppGF20 and OppGA20 that get used are from the complementary situation. For 5v5 this means the opposition situation is also 5v5, but when calculating a rating for 5v5 leading the opposition situation is 5v5 trailing, so OppGF20 would be OppGF20 calculated from 5v5 trailing data.)
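The first iteration formulas translate directly into code. The sketch below uses a hypothetical function name and made-up input rates chosen so that both ratings come out to 20.

```python
from typing import Tuple


def first_iteration(gf20: float, ga20: float,
                    tmgf20: float, tmga20: float,
                    oppgf20: float, oppga20: float) -> Tuple[float, float]:
    """Return (HARO, HARD) as percentage improvements over expectation."""
    exp_gf20 = (tmgf20 + oppga20) / 2   # expected goals for per 20 minutes
    exp_ga20 = (tmga20 + oppgf20) / 2   # expected goals against per 20 minutes
    haro = 100 * (gf20 - exp_gf20) / exp_gf20
    hard = 100 * (exp_ga20 - ga20) / exp_ga20
    return haro, hard


# Made-up example: teammates and opponents are all at 0.80 per 20 minutes.
haro, hard = first_iteration(gf20=0.96, ga20=0.64,
                             tmgf20=0.80, tmga20=0.80,
                             oppgf20=0.80, oppga20=0.80)
print(round(haro, 1), round(hard, 1))  # 20.0 20.0
```

Scoring at 0.96 against an expectation of 0.80 is a 20% improvement, and allowing 0.64 against an expectation of 0.80 is likewise 20% better than expected.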
Now for a second iteration
The first iteration used GF20 and GA20 stats, which is a good start, but after the first iteration we have teammate and opponent corrected evaluations of every player, which means we have better data about the quality of teammates and opponents the player has. This is where things get a little more complicated: I need to calculate a QoT and QoC metric based on the first iteration HARO and HARD values, and then convert that into a GF20 and GA20 equivalent number that I can compare the player’s GF20 and GA20 to.
To do this I calculate a TMHARO rating, which is a TWA_TOI weighted average of first iteration HARO. TMHARD, OppHARO and OppHARD are calculated in a similar manner. Now I need to convert these to GF20 and GA20 based stats, so I do that by multiplying by league average GF20 (LAGF20) and league average GA20 (LAGA20), and from here I can calculate expected GF20 and expected GA20.
ExpGF20(2nd iteration) = (TMHARO*LAGF20 + OppHARD*LAGA20)/2
ExpGA20(2nd iteration) = (TMHARD*LAGA20 + OppHARO*LAGF20)/2
From there we can get a second iteration of HARO and HARD.
HARO(2nd iteration) = 100*(GF20-ExpGF20) / ExpGF20
HARD(2nd iteration) = 100*(ExpGA20 - GA20) / ExpGA20
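Here is one way the second iteration could look in code. The formulas above do not spell out how a percentage rating is scaled into a rate, so this sketch assumes a rating r maps to LAGF20*(1 + r/100) on offense and LAGA20*(1 - r/100) on defense (a positive HARD means fewer goals against). That mapping, the function names and the numbers are assumptions for illustration only.

```python
from typing import Tuple


def offense_rate(haro: float, lagf20: float) -> float:
    """Assumed conversion of an offensive rating (percent) to a GF20 equivalent."""
    return lagf20 * (1 + haro / 100)


def defense_rate(hard: float, laga20: float) -> float:
    """Assumed conversion of a defensive rating (percent) to a GA20 equivalent."""
    return laga20 * (1 - hard / 100)


def second_iteration(gf20: float, ga20: float,
                     tmharo: float, tmhard: float,
                     oppharo: float, opphard: float,
                     lagf20: float, laga20: float) -> Tuple[float, float]:
    """Return (HARO, HARD) using rating-based QoT and QoC instead of raw rates."""
    # Expected goals for: teammates' offense versus opponents' defense.
    exp_gf20 = (offense_rate(tmharo, lagf20) + defense_rate(opphard, laga20)) / 2
    # Expected goals against: teammates' defense versus opponents' offense.
    exp_ga20 = (defense_rate(tmhard, laga20) + offense_rate(oppharo, lagf20)) / 2
    haro = 100 * (gf20 - exp_gf20) / exp_gf20
    hard = 100 * (exp_ga20 - ga20) / exp_ga20
    return haro, hard


# Made-up example: average teammates and opponents, league average 0.80.
haro2, hard2 = second_iteration(gf20=0.96, ga20=0.64,
                                tmharo=0.0, tmhard=0.0,
                                oppharo=0.0, opphard=0.0,
                                lagf20=0.80, laga20=0.80)
print(round(haro2, 1), round(hard2, 1))  # 20.0 20.0
```

Each further iteration would recompute TMHARO, TMHARD, OppHARO and OppHARD from the previous iteration’s ratings and call the same function again.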
Now we iterate again and again…
Now we repeat the above step over and over again, using the previous iteration’s HARO and HARD values at every step.
Now calculate HART
Once we have done enough iterations we can calculate HART from the final iteration’s HARO and HARD values.
HART = (HARO + HARD) /2
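As a quick worked example with made-up ratings, the averaging looks like this in Python (the function name is hypothetical):

```python
def hart(haro: float, hard: float) -> float:
    """Total rating: the plain average of the offensive and defensive ratings."""
    return (haro + hard) / 2


# A player who is 20% better than expected offensively but 10% worse defensively.
print(hart(20.0, -10.0))  # 5.0
```

Because it is a simple average, a strong offensive rating can be partly cancelled out by a weak defensive one.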
Now do the same for Shot, Fenwick and Corsi data
The above is for goal ratings but I have Shot, Fenwick and Corsi ratings as well and these can be calculated in the exact same way except using SF20, SA20, FF20, FA20, CF20 and CA20.
What about goalies?
Goalies are a little unique in that they only really play the defensive side of the game. For this reason I do not include goalies in calculating TMGF20 and OppGF20. For shot, fenwick and corsi I do not include the goalies on the defensive side of things either as I assume a goalie will not influence shots against (though this may not be entirely true as some goalies may be better at controlling rebounds and thus secondary shots but I’ll assume this is a minimal effect if it does exist). The result of this is goalies do have a HARD rating but no HARO, or shot/fenwick/corsi based HARD or HARO rating.
I hope this helps explain how my hockey analysis ratings are calculated, but if you have any follow-up questions feel free to ask them in the comments.