May 01, 2013
 

I brought this issue up on Twitter today because it got me thinking. Many people in hockey analytics dismiss face off winning % as a skill of much value, but many of the same people also claim that zone starts can have a significant impact on a player’s statistics. I haven’t really delved into the statistics to investigate this, but here is what I am wondering. Consider the following two players:

Player 1: Team wins 50% of face offs when he is on the ice and he starts in the offensive zone 55% of the time.

Player 2: Team wins 55% of face offs when he is on the ice but he has neutral zone starts.

Given 1000 zone face offs the following will occur:

Outcome | Player 1 | Player 2
Win face off in offensive zone | 275 | 275
Lose face off in offensive zone | 275 | 225
Win face off in defensive zone | 225 | 275
Lose face off in defensive zone | 225 | 225

Both of these players will win the same number of offensive zone face offs and lose the same number of defensive zone face offs, which are the situations that intuitively should have the greatest impact on a player’s statistics. The only difference is that Player 1 loses 50 more offensive zone face offs while Player 2 wins 50 more defensive zone face offs. So, if Player 1 is going to be more significantly impacted by his zone starts than Player 2 is by his face off win %, losing face offs in the offensive zone must still have a significant positive impact on a player’s statistics and winning face offs in the defensive zone must still have a significant negative impact. If this is not the case, then being able to win face offs should be more or less equivalent in importance to zone starts (and this is without considering any benefit of winning neutral zone face offs).
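The counts in the table can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function name `faceoff_outcomes` and its parameters are made up for illustration, and it assumes the win % is the same in both zones.

```python
def faceoff_outcomes(total, oz_share, win_pct):
    """Split `total` zone face offs into win/loss counts for each zone.

    Assumes the same win percentage applies in both zones.
    """
    oz = total * oz_share          # face offs taken in the offensive zone
    dz = total - oz                # the rest are defensive zone face offs
    return {
        "oz_win": round(oz * win_pct),
        "oz_loss": round(oz * (1 - win_pct)),
        "dz_win": round(dz * win_pct),
        "dz_loss": round(dz * (1 - win_pct)),
    }


# Player 1: 50% face off win %, 55% offensive zone starts
player1 = faceoff_outcomes(1000, oz_share=0.55, win_pct=0.50)
# Player 2: 55% face off win %, neutral zone starts
player2 = faceoff_outcomes(1000, oz_share=0.50, win_pct=0.55)

for key in player1:
    print(key, player1[key], player2[key])
```

The two players only differ in the `oz_loss` and `dz_win` rows, which is the whole point of the comparison.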

Now, I realize that there is greater variance in zone start deployment than in face off winning percentage. But if a 55% face off win % is roughly equal to a 55% offensive zone start deployment, and a 55% face off win % has relatively little impact on a player’s statistics, then a 70% zone start deployment (20 points above neutral instead of 5) would have about four times that relatively little impact, which is still probably relatively little.

I hope to be able to investigate this further, but on the surface it seems that if face off win % is of relatively little importance, that supports my claim that zone starts have relatively little impact on a player’s statistics.

 

Apr 05, 2013
 

I often get asked questions about hockey analytics, hockey fancy stats, how to use them, what they mean, etc. There are plenty of good places to find definitions of various hockey stats, but sometimes what is more important than a definition is some guidance on how to use them. So, with that said, here are several tips that I have for people using advanced hockey stats.

Don’t over value Quality of Competition

I don’t know how often I’ll point out one player’s poor stats or another player’s good stats and immediately get the response “Yeah, but he always plays against the opponents’ best players” or “Yeah, but he doesn’t play against the opposition’s best players”. Most people who say that kind of thing have no real idea how much quality of opponent will affect a player’s statistics. The truth is it is not nearly as much as you might think. Despite some coaches desperately trying to employ line matching techniques, the variation in quality of competition metrics is dwarfed by the variation in quality of teammates, individual talent, and on-ice results. An analysis of Pavel Datsyuk and Valtteri Filppula showed that if Filppula had Datsyuk’s quality of competition his CorsiFor% would drop from 51.05% to 50.90% and his GoalsFor% would drop from 55.65% to 55.02%. In the grand scheme of things, these are relatively minor factors.

Don’t over value Zone Starts either

Like quality of competition, many people will use zone starts to justify a player’s good/poor statistics. The truth is zone starts are not a significant factor either. I have found that the effect of zone starts is largely eliminated about 10 seconds after a face off, and others have found the same. I account for zone starts in statistics by eliminating the 10 seconds after an offensive or defensive zone face off, and I have found doing this has relatively little effect on a player’s stats. Henrik Sedin is maybe the most extreme case of a player getting primarily offensive zone starts, and all those zone starts took him from a 55.2% fenwick% player to a 53.8% fenwick% player when zone starts are factored out. So in the most extreme case there is only about a 1.4 percentage point impact on a player’s fenwick%, and the majority of players are nowhere close to the zone start bias of Henrik Sedin. For the majority of players you are probably talking about something under 0.5 percentage points of impact on their fenwick%. As for individual stats, over the last 3 seasons H. Sedin had 34 goals and 172 points in 5v5 situations and just 2 goals and 14 points came within 10 seconds of a zone face off, or about 5 points a year. If instead of 70% offensive zone face off deployment he had 50% offensive zone face off deployment, then instead of having 14 points during the 10 second zone face off time he may have had 10. That’s a 4 point differential over 3 years for a guy who scored 172 points. In simple terms, about 2.3% of H. Sedin’s 5v5 points can be attributed to his offensive zone start bias.
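For anyone who wants to check the Sedin arithmetic, here is a rough sketch. The one modelling assumption, which the estimate above also relies on, is that points scored in the 10 seconds after a zone face off scale in proportion to the share of offensive zone starts; the variable names are mine.

```python
total_points = 172       # H. Sedin 5v5 points over 3 seasons
faceoff_points = 14      # points within 10 seconds of a zone face off
actual_oz_share = 0.70   # his offensive zone face off deployment
neutral_oz_share = 0.50  # hypothetical neutral deployment

# Assumption: face off window points scale linearly with offensive zone share.
neutral_points = faceoff_points * neutral_oz_share / actual_oz_share
points_from_bias = faceoff_points - neutral_points
share_of_total = points_from_bias / total_points

# Fenwick% with and without the 10 seconds after zone face offs
fenwick_shift = round(55.2 - 53.8, 1)

print(round(neutral_points), round(points_from_bias))
print(f"{share_of_total:.1%} of 5v5 points, {fenwick_shift} fenwick% points")
```

Swapping in another player’s totals and zone start share gives the same kind of back-of-the-envelope estimate.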

A corollary of this is that if zone starts don’t matter much, a player’s face off winning percentage probably doesn’t matter much either, which is consistent with other studies. It’s a nice skill to have, but not worth a lot.

Do not ignore Quality of Teammates

I have just told you to pretty much ignore quality of competition and zone starts, so what about quality of teammates? Well, to put it simply, do not ignore them. Quality of teammates matters and matters a lot. Sticking with the Vancouver Canucks, let’s use Alex Burrows as an example. Burrows mostly plays with the Sedin twins but has played on Kesler’s line a bit too. Over the past 3 seasons he has played about 77.9% of his ice time with H. Sedin, about 12.3% of his ice time with Ryan Kesler, and the remainder with Malhotra and others. Burrows’ offensive production is significantly better when playing with H. Sedin, as 88.7% of his goals and 87.2% of his points came during the 77.9% of his ice time he played with H. Sedin. If Burrows played 100% of his ice time with H. Sedin and produced at the same rate he would have scored 6 (9.7%) more goals and 13 (11%) more 5v5 points over the past 3 seasons. This is far more significant than the 2.3% boost H. Sedin saw from all his offensive zone starts, and I am not certain my Burrows example is the most extreme example in the NHL. How many more points would an average 3rd liner get if he played mostly with H. Sedin instead of with another average 3rd liner? Who you play with matters a lot. You can’t look at Tyler Bozak’s decent point totals and conclude he is a decent player without considering he plays a lot with Kessel and Lupul, two very good offensive players.

Opportunity is not talent

Along the same lines as the Quality of Teammates discussion, we must be careful not to confuse opportunity and results. Over the past 2 seasons Corey Perry has the second most goals of any forward in the NHL, trailing only Steven Stamkos. That might seem impressive, but it is a little less so when you consider Perry also had the 4th most 5v5 minutes during that time and the 11th most 5v4 minutes. Perry is a good goal scorer, but a lot of his goals come from opportunity (ice time) as much as individual talent. Among forwards with at least 1500 minutes of 5v5 ice time the past 2 seasons, Perry ranks just 30th in goals per 60 minutes of ice time. That’s still good, but far less impressive than second only to Steven Stamkos, and he is actually well behind teammate Bobby Ryan (6th) in this metric. Perry is a very good player, but he benefits more than others from getting a lot of ice time and PP ice time. Perry’s goal production is in large part talent, but it is also somewhat opportunity driven and we need to keep this in perspective.

Don’t ignore the percentages (shooting and save)

The percentages matter, particularly shooting percentages. I have shown that players can sustain elevated on-ice shooting percentages, I have shown that players can have an impact on their line mates’ shooting percentages, and Tom Awad has shown that a significant portion of the difference between good players and bad players is finishing ability (shooting percentage). There is even evidence that goal based metrics (which incorporate the percentages) are a better predictor of post season success than fenwick based metrics. What corsi/fenwick metrics have going for them is more reliability over small sample sizes, but once you approach a full season’s worth of data that benefit is largely gone and you get more benefit from having the percentages factored into the equation. If you want to get a better understanding of what considering the percentages can do for you, try a Malkin vs Gomez comparison or a Crosby vs Tyler Kennedy comparison over the past several years. Gomez and Kennedy actually look like relatively decent comparables if you just consider shot based metrics, but both are terrible percentage players while Malkin and Crosby are excellent percentage players, and it is the percentages that make Malkin and Crosby so special. This is an extreme example, but the percentages should not be ignored if you want a true representation of a player’s abilities.

More is definitely better

One of the reasons many people have jumped on the shot attempt/corsi/fenwick band wagon is that shot attempts are more frequent events than goals and thus give you more reliable metrics. This is true over small sample sizes but, as explained above, the percentages matter too and should not be ignored. Luckily, for most players we have ample data to get past the sample size issues. There is no reason to evaluate a player based on half a season’s data if that player has been in the league for several years. Look at 2, 3, 4 years of data. Look for trends. Is the player consistently a high corsi player? Is the player consistently a high shooting percentage player? Is the player improving? Declining? I have shown on numerous occasions that goals are a better predictor of future goal rates than corsi/fenwick starting at about one year of data, but multiple years are definitely better. Any conclusion about a player’s talent level using a single season of data or less (regardless of whether it is corsi or goal based) is subject to a significant level of uncertainty. We have multiple years of data for the majority of players, so use it. I even aggregate multiple years into one data set for you on stats.hockeyanalysis.com so it isn’t even time consuming. The data is there, use it. More is definitely better.

WOWY’s are where it is at

In my mind WOWY’s are the best tool for advanced player evaluation. WOWY stands for with or without you and looks at how a player performs while on the ice with a team mate and while on the ice without that team mate. What WOWY’s can tell you is whether a particular player is a core player driving team success or a player along for the ride. Players that consistently make their team mates’ statistics better when they are on the ice with them are the players you want on your team. Anze Kopitar is an example of a player who consistently makes his teammates better. Jack Johnson is an example of a player that does not, particularly when looking at goal based metrics. Then there are a large number of players that are good players that neither drive your team’s success nor hold it back, or as I like to say, complementary players. Ideally you build your team around a core of players like Kopitar that will drive success, fill it in with a group of complementary players, and quickly rid yourself of players like Jack Johnson that act as drags on the team.
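To make the idea concrete, here is a minimal sketch of a goal based WOWY split. The goal counts are invented purely for illustration (they are not any real player’s numbers) and the function names are hypothetical; it is not the code behind stats.hockeyanalysis.com.

```python
def gf_pct(goals_for, goals_against):
    """On-ice goals for percentage: GF / (GF + GA), as a percentage."""
    return 100.0 * goals_for / (goals_for + goals_against)


def wowy(together, player_apart, teammate_apart):
    """Each argument is an on-ice (goals_for, goals_against) tuple."""
    return {
        "together": gf_pct(*together),
        "player_without_teammate": gf_pct(*player_apart),
        "teammate_without_player": gf_pct(*teammate_apart),
    }


# Hypothetical on-ice goal counts for a player and one team mate
split = wowy(together=(30, 20), player_apart=(20, 20), teammate_apart=(18, 22))

# How much better the team mate does with the player than without him
teammate_lift = split["together"] - split["teammate_without_player"]
print(split, teammate_lift)
```

A player who shows a positive lift like this across most of his team mates, year after year, is the kind of player driving results rather than riding along.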

 

Apr 05, 2013
 

Yesterday HabsEyesOnThePrize.com had a post on the importance of fenwick come playoff time over the past 5 seasons. It is definitely worth a look so go check it out. In the post they look at FF% in 5v5close situations and see how well it translates into post season success. I wanted to take this a step further and take a look at PDO and GF% in 5v5close situations to see if they translate into post season success as well. Here is what I found:

Group | N | Avg | Playoff Avg | Cup Winners | Lost Cup Finals | Lost Third Round | Lost Second Round | Lost First Round | Missed Playoffs
GF% > 55 | 19 | 2.68 | 2.83 | 5 | 1 | 2 | 6 | 4 | 1
GF% 50-55 | 59 | 1.22 | 1.64 | 0 | 2 | 6 | 10 | 26 | 15
GF% 45-50 | 52 | 0.62 | 1.78 | 0 | 2 | 2 | 4 | 10 | 34
GF% < 45 | 20 | 0.00 | - | 0 | 0 | 0 | 0 | 0 | 20
FF% > 53 | 23 | 2.35 | 2.35 | 3 | 2 | 4 | 5 | 9 | 0
FF% 50-53 | 55 | 1.15 | 1.70 | 2 | 2 | 1 | 10 | 22 | 18
FF% 47-50 | 46 | 0.52 | 1.85 | 0 | 0 | 4 | 3 | 6 | 33
FF% < 47 | 26 | 0.54 | 2.00 | 0 | 1 | 1 | 2 | 3 | 19
PDO > 1010 | 27 | 1.63 | 2.20 | 2 | 2 | 2 | 6 | 8 | 7
PDO 1000-1010 | 42 | 1.17 | 1.75 | 1 | 0 | 5 | 7 | 15 | 14
PDO 990-1000 | 47 | 0.91 | 1.95 | 2 | 1 | 3 | 4 | 12 | 25
PDO < 990 | 34 | 0.56 | 1.90 | 0 | 2 | 0 | 3 | 5 | 24

I have grouped GF%, FF% and PDO into four categories each (the very good, the good, the mediocre and the bad) and I have looked at how many teams made it to each round of the playoffs from each group. If we say that winning the cup is worth 5 points, getting to the finals is worth 4, getting to the 3rd round is worth 3, getting to the second round is worth 2, making the playoffs is worth 1, and missing the playoffs is worth 0, then the Avg column is the average point total for the teams in that grouping. The Playoff Avg is the average point total for only those teams that made the playoffs.
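The Avg and Playoff Avg columns follow directly from the round counts and the point scheme just described. A small sketch (the dictionary keys and function name are my own labels) that reproduces a few rows of the table:

```python
# Points for the furthest stage a team reached
POINTS = {"cup": 5, "lost_final": 4, "lost_r3": 3,
          "lost_r2": 2, "lost_r1": 1, "missed": 0}


def group_averages(counts):
    """Return (Avg, Playoff Avg) for a dict of team counts per outcome."""
    total_points = sum(POINTS[outcome] * n for outcome, n in counts.items())
    teams = sum(counts.values())
    playoff_teams = teams - counts["missed"]
    avg = round(total_points / teams, 2)
    # Playoff Avg is undefined when no team in the group made the playoffs
    playoff_avg = round(total_points / playoff_teams, 2) if playoff_teams else None
    return avg, playoff_avg


gf_over_55 = {"cup": 5, "lost_final": 1, "lost_r3": 2,
              "lost_r2": 6, "lost_r1": 4, "missed": 1}
ff_over_53 = {"cup": 3, "lost_final": 2, "lost_r3": 4,
              "lost_r2": 5, "lost_r1": 9, "missed": 0}
gf_under_45 = {"cup": 0, "lost_final": 0, "lost_r3": 0,
               "lost_r2": 0, "lost_r1": 0, "missed": 20}

print(group_averages(gf_over_55))   # GF% > 55 row
print(group_averages(ff_over_53))   # FF% > 53 row
print(group_averages(gf_under_45))  # GF% < 45 row
```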

As HabsEyesOnThePrize.com found, 5v5close FF% is definitely an important factor in making the playoffs and enjoying success in the playoffs. That said, GF% seems to be slightly more significant. All 5 Stanley Cup winners came from the GF% > 55 group while only 3 cup winners came from the FF% > 53 group, and both Avg and Playoff Avg are higher in the GF% > 55 group than in the FF% > 53 group. PDO only seems marginally important, though teams that have a very good PDO do have a slightly better chance to go deeper into the playoffs. Generally speaking though, if you are trying to predict a Stanley Cup winner, 5v5close GF% is probably a better metric than 5v5close FF% and certainly better than PDO. Now, considering this is a significantly shorter season than usual, this may not hold this year as luck may be a bit more of a factor in GF% than usual, but historically this has been the case.

So, who should we look at for playoff success this season? Well, there are currently 9 teams with a 5v5close GF% > 55. Those are Anaheim, Boston, Pittsburgh, Los Angeles, Montreal, Chicago, San Jose, Toronto and Vancouver. No other teams are above 52.3%, so that list is unlikely to get any new additions before season’s end, though some teams could certainly fall out of it. Now, if we also only consider teams that have a 5v5close FF% > 50%, then Toronto and Anaheim drop off the list, leaving you with Boston, Pittsburgh, Los Angeles, Montreal, Chicago, San Jose and Vancouver as your Stanley Cup favourites, but we all pretty much knew that already, didn’t we?

 

Mar 20, 2013
 

I generally think that the majority of people give too much importance to quality of competition (QoC) and its impact on a player’s statistics, but if we are going to use QoC metrics let’s at least try to use the best ones available. In this post I will take a look at some QoC metrics that are available on stats.hockeyanalysis.com and explain why they might be better than those typically in use.

OppGF20, OppGA20, OppGF%

These three stats are the average GF20 (on ice goals for per 20 minutes), GA20 (on ice goals against per 20 minutes) and GF% (on ice GF / [on ice GF + on ice GA]) of all the opposition players that a player lined up against, weighted by ice time against. In fact, these stats go a bit further in that they remove the ice time the opposing players played against the player, so that a player won’t influence his own QoC (not nearly as important as it is for QoT but still a good thing to do). So, essentially these three stats are the goal scoring ability of the opposition players, the goal defending ability of the opposition players, and the overall value of the opposition players. Note that opposition goalies are not included in the calculation of OppGF20 as it is assumed that goalies have no influence on scoring goals.
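As an illustration of the weighting described above, here is a minimal sketch of an OppGF20 calculation. The field names and the two opponents’ numbers are invented for the example, and this is not the actual code behind stats.hockeyanalysis.com; it only shows the ice time weighting and the removal of head-to-head ice time.

```python
def opp_gf20(matchups):
    """Ice time weighted average of opponents' GF20.

    Each matchup is a dict with (all hypothetical field names):
      toi_against  - minutes the opponent skater played against our player
      opp_toi      - the opponent's total minutes
      opp_gf       - the opponent's total on-ice goals for
      gf_against   - the opponent's on-ice goals for while facing our player
    Opposing goalies are assumed to be left out of `matchups`.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for m in matchups:
        # Remove the head-to-head minutes so the player does not
        # influence his own quality of competition.
        rest_toi = m["opp_toi"] - m["toi_against"]
        rest_gf = m["opp_gf"] - m["gf_against"]
        gf20 = 20 * rest_gf / rest_toi
        weighted_sum += gf20 * m["toi_against"]
        total_weight += m["toi_against"]
    return weighted_sum / total_weight


# Two made-up opponents
matchups = [
    {"toi_against": 100, "opp_toi": 1100, "opp_gf": 45, "gf_against": 5},
    {"toi_against": 50, "opp_toi": 850, "opp_gf": 30, "gf_against": 2},
]
print(round(opp_gf20(matchups), 3))
```

OppGA20 would work the same way with goals against, and OppGF% could be built from the same adjusted totals.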

The benefits of using these stats are that they are easy to understand and are in a unit (goals per 20 minutes of ice time) that is easily understood. OppGF20 is essentially how many goals we expect the player’s opponents would score on average per 20 minutes of ice time. The drawback of this stat is that if good players play against good players and bad players play against bad players, a good player and a bad player may have similar statistics even though the good player is the better player because he did it against better quality opponents. There is no consideration for the context of the opponents’ statistics and that may matter.

Let’s take a look at the top 10 forwards in OppGF20 last season.

Player | Team | OppGF20
Patrick Dwyer | Carolina | 0.811
Brandon Sutter | Carolina | 0.811
Travis Moen | Montreal | 0.811
Carl Hagelin | NY Rangers | 0.806
Marcel Goc | Florida | 0.804
Tomas Plekanec | Montreal | 0.804
Brooks Laich | Washington | 0.800
Ryan Callahan | NY Rangers | 0.799
Patrik Elias | New Jersey | 0.798
Alexei Ponikarovsky | New Jersey | 0.795

You will notice that every single player is from the eastern conference. The reason for this is that the eastern conference is the more offensive conference. Taking a look at the top 10 players in OppGA20 (the players whose opponents allowed the fewest goals) will show the opposite.

Player | Team | OppGA20
Marcus Kruger | Chicago | 0.719
Jamal Mayers | Chicago | 0.720
Mark Letestu | Columbus | 0.721
Andrew Brunette | Chicago | 0.723
Andrew Cogliano | Anaheim | 0.723
Viktor Stalberg | Chicago | 0.724
Matt Halischuk | Nashville | 0.724
Kyle Chipchura | Phoenix | 0.724
Matt Beleskey | Anaheim | 0.724
Cory Emmerton | Detroit | 0.724

Now, what happens when we look at OppGF%?

Player | Team | OppGF%
Mike Fisher | Nashville | 51.6%
Martin Havlat | San Jose | 51.4%
Vaclav Prospal | Columbus | 51.3%
Mike Cammalleri | Calgary | 51.3%
Martin Erat | Nashville | 51.3%
Sergei Kostitsyn | Nashville | 51.3%
Dave Bolland | Chicago | 51.2%
Rick Nash | Columbus | 51.2%
Travis Moen | Montreal | 51.0%
Patrick Marleau | San Jose | 51.0%

These are predominantly western conference players with a couple of eastern conference players mixed in. The reason for this western conference bias is that the western conference was the better conference and thus it makes sense that the QoC would be tougher for western conference players.

OppFF20, OppFA20, OppFF%

These are exactly the same stats as the goal based stats above but instead of using goals for/against/percentage they use fenwick for/against/percentage (fenwick is shots + shots that missed the net). I won’t go into details but you can find the top players in OppFF20 here, in OppFA20 here, and OppFF% here. You will find a lot of similarities to the OppGF20, OppGA20 and OppGF% lists, but if you ask me which I think is the better QoC metric I’d lean towards the goal based ones. The reason for this is that the small sample size issues we see with goal statistics are not going to be nearly as significant in the QoC metrics, because over all opponents luck will average out (for every unlucky opponent you are likely to have a lucky one to cancel out the effects). That said, if you are doing a fenwick based analysis it probably makes more sense to use a fenwick based QoC metric.

HARO QoC, HARD QoC, HART QoC

As stated above, one of the flaws of the above QoC metrics is that there is no consideration for the context of the opponents’ statistics. One of the ways around this is to use the HockeyAnalysis.com HARO (offense), HARD (defense) and HART (Total/Overall) ratings in calculating QoC. These are player ratings that take into account both quality of teammates and quality of competition (here is a brief explanation of what these ratings are). The HARO QoC, HARD QoC and HART QoC metrics are simply the average HARO, HARD and HART ratings of a player’s opponents.

Here are the top 10 forwards in HARO QoC last year:

Player | Team | HARO QoC
Patrick Dwyer | Carolina | 6.0
Brandon Sutter | Carolina | 5.9
Travis Moen | Montreal | 5.8
Tomas Plekanec | Montreal | 5.8
Marcel Goc | Florida | 5.6
Carl Hagelin | NY Rangers | 5.5
Ryan Callahan | NY Rangers | 5.3
Brooks Laich | Washington | 5.3
Michael Grabner | NY Islanders | 5.2
Patrik Elias | New Jersey | 5.2

There are a lot of similarities to the OppGF20 list, with the eastern conference dominating. There are a few changes, but not too many, which really is not that big of a surprise to me. There is very little evidence that QoC has a significant impact on a player’s statistics, so accounting for the opponents’ own QoC will not significantly change the opponents’ stats and thus will not significantly change a player’s QoC. That said, I believe these should produce slightly better QoC ratings. Also note that a 6.0 HARO QoC indicates that the opposing players are expected to produce a 6.0% boost over the league average GF20.

Here are the top 10 forwards in HARD QoC last year:

Player | Team | HARD QoC
Jamal Mayers | Chicago | 6.0
Marcus Kruger | Chicago | 5.9
Mark Letestu | Columbus | 5.8
Tim Jackman | Calgary | 5.3
Colin Fraser | Los Angeles | 5.2
Cory Emmerton | Detroit | 5.2
Matt Beleskey | Anaheim | 5.2
Kyle Chipchura | Phoenix | 5.1
Andrew Brunette | Chicago | 5.1
Colton Gillies | Columbus | 5.0

And now the top 10 forwards in HART QoC last year:

Player | Team | HART QoC
Dave Bolland | Chicago | 3.2
Martin Havlat | San Jose | 3.0
Mark Letestu | Columbus | 2.5
Jeff Carter | Los Angeles | 2.5
Derick Brassard | Columbus | 2.5
Rick Nash | Columbus | 2.4
Mike Fisher | Nashville | 2.4
Vaclav Prospal | Columbus | 2.2
Ryan Getzlaf | Anaheim | 2.2
Viktor Stalberg | Chicago | 2.1

Shots and Corsi based QoC

You can also find similar QoC stats using shots as the base stat or using corsi (shots + shots that missed the net + shots that were blocked) on stats.hockeyanalysis.com but they are all the same as above so I’ll not go into them in any detail.

CorsiRel QoC

The most commonly used QoC metric currently seems to be CorsiRel QoC (found on behindthenet.ca), but in my opinion this is not so much a QoC metric as a ‘usage’ metric. CorsiRel is a statistic that compares the team’s corsi differential when the player is on the ice to the team’s corsi differential when the player is not on the ice. CorsiRel QoC is the average CorsiRel of all of a player’s opponents.

The problem with CorsiRel is that good players on a bad team with little depth can put up really high CorsiRel stats compared to similarly good players on a good team with good depth, because essentially it is comparing a player relative to his teammates. The more good teammates you have, the more difficult it is to put up a good CorsiRel. So, on any given team the players with a good CorsiRel are the best players on that team, but you can’t compare the CorsiRel of players on different teams because the quality of the teams could be different.
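A quick sketch shows why depth matters so much here. It assumes CorsiRel is computed as the on-ice corsi differential per 60 minutes minus the off-ice corsi differential per 60 minutes (the per 60 normalisation is my assumption, not something taken from behindthenet.ca), and the shot attempt totals are invented for illustration.

```python
def corsi_rel(on_for, on_against, on_minutes, off_for, off_against, off_minutes):
    """On-ice corsi differential per 60 minus off-ice differential per 60."""
    on_rate = 60 * (on_for - on_against) / on_minutes
    off_rate = 60 * (off_for - off_against) / off_minutes
    return on_rate - off_rate


# The same hypothetical on-ice results (+6.0 corsi per 60) on two teams.
# Thin team: the rest of the roster is out-attempted when he is off the ice.
thin_team = corsi_rel(600, 500, 1000, off_for=1400, off_against=1500, off_minutes=3000)
# Deep team: the rest of the roster out-attempts opponents without him.
deep_team = corsi_rel(600, 500, 1000, off_for=1600, off_against=1500, off_minutes=3000)

print(thin_team, deep_team)
```

Identical on-ice results produce a CorsiRel twice as large on the thin team, which is why the number does not travel well across teams.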

CorsiRel QoC is essentially the average CorsiRel of all of a player’s opponents, but because CorsiRel is flawed, CorsiRel QoC ends up being flawed too. For players on the same team, the player with the highest CorsiRel QoC plays against the toughest competition, so in this sense it tells us who is getting the toughest minutes on the team, but again CorsiRel QoC is not really that useful when comparing players across teams. For these reasons I consider CorsiRel QoC more of a tool to see the usage of a player compared to his teammates; it is not, in my opinion, a true QoC metric.

I may be biased, but in my opinion there is no reason to use CorsiRel QoC anymore. Whether you use OppGF20, OppGA20, OppGF%, HARO QoC, HARD QoC, HART QoC, or any of their shot/fenwick/corsi variants, they should all produce better QoC measures that are comparable across teams (the lack of which is the major drawback of CorsiRel QoC).

 

Feb 11, 2013
 

When I updated stats.hockeyanalysis.com this season I added new metrics for Quality of Teammates (QoT) and Quality of Competition (QoC). The QoC metrics are essentially the average Hockey Analysis Rating (HARO for offense, HARD for defense and HART for overall) of the opponents that the player plays against. What is interesting about these ratings, as compared to those found elsewhere, is that I split the QoC rating up into offensive and defensive metrics. Thus, there is a QoC HARO rating for measuring the offensive quality of competition, a QoC HARD for measuring the defensive quality of competition, and a QoC HART for overall quality of competition (basically the average of QoC HARO and QoC HARD). The resulting metrics are above 1.00 for above average competition, below 1.00 for below average competition, and exactly 1.00 for average competition.

Let’s start with defensemen and take a look at the defensemen who have the highest QoC HARO during 5v5close situations over the previous 2 seasons. This should identify the defensemen who have faced the best offensive players, and here are the top 15.

Player Name | HARO QOC
GIRARDI, DAN | 1.036
CHARA, ZDENO | 1.036
GARRISON, JASON | 1.035
MCDONAGH, RYAN | 1.034
WEAVER, MIKE | 1.033
GORGES, JOSH | 1.031
ALZNER, KARL | 1.029
GLEASON, TIM | 1.026
SEABROOK, BRENT | 1.025
BOYCHUK, JOHNNY | 1.025
SUBBAN, P.K. | 1.025
PHANEUF, DION | 1.025
CARLSON, JOHN | 1.022
HAMONIC, TRAVIS | 1.021
LIDSTROM, NICKLAS | 1.021

That’s actually a pretty decent representation of defensive defensemen though there is a bias towards the eastern conference in large part because the eastern conference has more offense (the top 4 teams in goals for last year were eastern conference teams while 9 of the 11 lowest scoring teams were from the western conference).

Now, let’s take a look at the forwards with the toughest offensive competition.

Player Name | HARO QOC
SUTTER, BRANDON | 1.032
PERRON, DAVID | 1.032
CALLAHAN, RYAN | 1.031
FISHER, MIKE | 1.030
SYKORA, PETR | 1.029
BOLLAND, DAVE | 1.028
ZAJAC, TRAVIS | 1.028
ELIAS, PATRIK | 1.028
BERGERON, PATRICE | 1.027
HAGELIN, CARL | 1.027
ZUBRUS, DAINIUS | 1.027
PLEKANEC, TOMAS | 1.027
WEISS, STEPHEN | 1.026
RECCHI, MARK | 1.026
ERAT, MARTIN | 1.025

Not a lot of surprises there. They are mostly third line, defense first players (IMO Brandon Sutter is the best defensive center in the NHL and this is just more evidence of why) or quality 2-way players, though as you go further down the list you start to see more offensive players showing up like Alfredsson and Spezza, which is probably evidence of a coach wanting to line match top line against top line instead of a checking line against top line.

Where things get interesting is looking at who is 300th on the list of forwards in HARO QoC. It’s none other than Manny Malhotra of massive defensive zone start bias fame. Malhotra’s HARO QoC is just 0.980 while the Canucks center who is assigned mostly offensive zone starts, Henrik Sedin, has a HARO QoC of 0.994, which isn’t especially difficult but is somewhat higher than Malhotra’s. So, despite all those defensive zone starts by Malhotra (presumably because he is considered a better defensive player), Henrik Sedin plays against tougher offensive opponents. How can this be? Despite Malhotra’s significant defensive zone start bias, his five most frequent 5v5close opponent forwards over the previous 2 seasons are David Jones, Matt Stajan, Tim Jackman, Jordan Eberle and Matt Cullen. Aside from Eberle those guys don’t really scare you much. It seems Malhotra was facing Edmonton’s top line but not Calgary’s, Minnesota’s or Colorado’s. Henrik Sedin’s top 5 opposition forwards are Dave Bolland, Dany Heatley, Curtis Glencross, Olli Jokinen and Jarome Iginla. Beyond that you have Backes, O’Reilly, Bickell, Thornton, Zetterberg, and Getzlaf. Despite the massive offensive zone start bias, it seems the majority of teams are still line matching power vs power with the Sedins. The conclusion is that defensive zone starts do not automatically imply playing against quality offensive players. It can be argued that despite the defensive zone starts Manny Malhotra plays relatively easy minutes.

Using a rigid zone start system like the Vancouver Canucks do actually makes it easier for opposing teams to line match on the road as they know who you are likely to be putting on the ice depending on where the face off is. If the San Jose Sharks want to avoid a Thornton against Malhotra matchup, just don’t start Thornton in the offensive zone. Here are all the forwards with >750 5v5close minutes and at least 40% of the face offs they were on the ice for being in the defensive zone along with their HARO QoC.

Player Name | HARO QOC
Manny Malhotra | 0.980
Jerred Smithson | 0.977
Max Lapierre | 0.970
Adam Burish | 0.982
Steve Ott | 0.993
Jay McClement | 0.983
Sammy Pahlsson | 1.014
Brian Boyle | 1.010
Dave Bolland | 1.028
Kyle Brodziak | 1.002
Matt Cullen | 0.998
Paul Gaustad | 0.993

Only 4 of the 12 heavy defensive zone start forwards faced opposition that was above average in terms of quality while the majority of them rank quite poorly.

It is also interesting to see who plays against the best defensive forwards.  One might assume it is elite offensive first line players but as we saw above, teams seemed to want to avoid matching up top offensive players against Manny Malhotra. So, let’s take a look.

Player Name | HARD QOC
FRASER, COLIN | 1.044
BOLL, JARED | 1.043
MAYERS, JAMAL | 1.037
JACKMAN, TIM | 1.035
MACKENZIE, DEREK | 1.032
ABDELKADER, JUSTIN | 1.031
CLIFFORD, KYLE | 1.031
EAGER, BEN | 1.029
BELESKEY, MATT | 1.028
MILLER, DREW | 1.028
KOSTOPOULOS, TOM | 1.027
MCLEOD, CODY | 1.025
NICHOL, SCOTT | 1.024
WINCHESTER, BRAD | 1.023
PAILLE, DANIEL | 1.021

Pretty much only tough guys and 3rd/4th liners on that list. Teams are deliberately using the above players in situations that avoid them facing top offensive players, and as a result they are facing other teams’ third and fourth lines and thus more defensive type players.

The one conclusion we can draw from this analysis is that quality of competition is driven by line matching techniques more so than zone starts.

 

Jan 25, 2013
 

The last few days I have been looking at the percentage of a team’s ice time for a given situation that a particular player is on the ice for. So for instance, what percentage of the Leafs’ 5v5 even strength ice time was Joffrey Lupul on the ice for in games in which Joffrey Lupul played? When I write a new program to calculate these numbers I need to do some testing to make sure the results are correct. The first test is always the standard sniff test. When the program runs I look at the output and ask myself “does the output make sense?”. When I first looked at the output the other day one of the numbers surprised me so much that I had to do some double checking to make sure it made sense. That number was the percentage of his team’s power play ice time that Ilya Kovalchuk was on the ice for. That number was 87.25%.

That’s insane, I thought, so off to NHL.com to check and see if it could be at all possible. I first checked and noticed that the Devils had 439:59 of PP ice time last year, including 420:36 of 5v4 ice time. Next I checked how much PP ice time Kovalchuk had last year and saw that he had 379:08 of PP time. I do not know his exact 5v4 PP ice time numbers, but 379:08 is about 86% of 439:59, so my calculation of Kovalchuk being on the ice for 87.25% of his team’s PP ice time is perfectly within reason.
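The sanity check itself is simple enough to script. A minimal sketch (the helper name `to_seconds` is mine) that converts the NHL.com mm:ss totals to seconds, takes the ratio, and also translates the 87.25% figure into time on ice during a two minute minor:

```python
def to_seconds(mmss):
    """Convert an 'MMM:SS' ice time string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Kovalchuk's PP ice time as a share of the Devils' total PP ice time
pp_share = to_seconds("379:08") / to_seconds("439:59")
print(f"{pp_share:.1%}")

# What an 87.25% share means over a 2 minute (120 second) penalty
on_ice = 0.8725 * 120
print(divmod(int(on_ice), 60))  # (minutes, seconds)
```

Note that the first ratio uses all PP time while the 87.25% figure is 5v4 only and limited to games he played, so the two are not expected to match exactly.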

To me this seems like a crazy high number. It means that for every 2 minute penalty Kovalchuk is on the ice for 1:44 of it. That just makes me say “WOW!” but Kovalchuk is not alone in getting big PP minutes. Here are the players who have played more than 70% of their team’s 5v4 PP minutes (in games they played in) over the past 5 seasons.

Player | 5v4 TOI%
Ilya Kovalchuk | 87.25%
Alex Ovechkin | 83.08%
Mike Green | 76.86%
Mark Streit | 75.35%
Sergei Gonchar | 74.76%
Evgeni Malkin | 73.83%
Sidney Crosby | 73.01%
Dan Boyle | 72.78%

I knew some players played a lot of PP ice time, but that still astonishes me. Oh, and for the record, in addition to being on the ice for 87.25% of his teams 5v4 PP ice time, Kovalchuk was on the ice for 89.66% of his teams 5v4 PP goals.

On the other end of things, over the last 5 years Willie Mitchell has played a whopping 59.2% of his team’s 4v5 PK ice time, which might actually be more impressive considering how much more demanding playing on the PK is.

 

Jan 232013
 

One of the challenges in hockey analytics, or any type of data analysis, is how to best visualize data in a way that is exceptionally informative and yet really simple to understand. I have been working on a few things and came up with something that I think might be a useful tool to understand how a player gets utilized by his coach.

Let’s start with some background. We can get an idea of how a player is utilized by looking at when the player gets used and how frequently he gets used.  Offensive players get more ice time on the power play and more ice time when their team is trailing and needs a goal. Defensive players get more ice time on the PK and when they are protecting a lead. This all makes sense, but the issue is that some teams spend more time on the PP or PK than others, while bad teams end up trailing more than good teams and leading less. This means doing a straight time on ice comparison between players on different teams doesn’t always accurately depict the usage of the player. If a player on the Red Wings plays the same number of minutes with the lead as a player on the Blue Jackets it doesn’t mean the players are used in the same way.  The Blue Jackets will lead a game significantly less often than the Red Wings, so in the hypothetical example above the Blue Jackets are depending on their player for a higher percentage of their time with a lead than the Red Wings are on theirs.

To get around this I looked at percentages. If Player A played 500 minutes with a lead and his team played a total of 2000 minutes with a lead during games in which Player A played, then Player A’s ice time with a lead percentage would be 25%. In games in which Player A played he was used in 25% of the team’s time leading. I can calculate these percentages for any situation from 5v5 to 4v5 or 5v4 special teams to leading and trailing situations. The challenge is to visualize the data in a clear and understandable way. To do this I use radar charts. Let’s look at a couple of examples so you get an idea, and we’ll use players that have extreme and opposite usages: Daniel Sedin and Manny Malhotra.
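The percentage itself is simple arithmetic. Here is a small Python sketch of it, assuming the player and team minutes have already been totalled per situation over the games the player appeared in. The function name `toi_share` and the dictionary layout are hypothetical, and only the "leading" numbers come from the Player A example above; the other rows are made up for illustration.

```python
def toi_share(player_minutes: float, team_minutes: float) -> float:
    """Fraction of the team's ice time in a situation that the player was on for.

    team_minutes must only count games in which the player actually played.
    """
    if team_minutes <= 0:
        raise ValueError("team_minutes must be positive")
    return player_minutes / team_minutes


# (player minutes, team minutes) per situation.
# "leading" is the Player A example; the rest are made-up illustration values.
player_a = {
    "leading": (500.0, 2000.0),
    "trailing": (700.0, 2100.0),
    "5v4": (150.0, 400.0),
}

for situation, (player_min, team_min) in player_a.items():
    print(f"{situation:>8}: {toi_share(player_min, team_min):.1%}")
```

Each of these shares would become one spoke of a radar chart.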

For those not up to speed on my terminology f10 is zone start adjusted ice time which ignores the 10 seconds after a face off in either the offensive or defensive zone.

The charts above are largely driven by PP and PK ice time but players that are used more often in offensive roles will have their charts bulge to the top and top right while those in more defensive roles will have their charts bulge more to the bottom and bottom left. Also, the larger the ‘polygon’ the more ice time the player gets and the more he is relied on. In the examples above, Sedin is clearly used more often in offensive situations and clearly gets more ice time.

Let’s now look at a player who is used in a more balanced way, Zdeno Chara.

That is a chart that is representative of a big ice time player who plays in all situations. We can then take it a step further and compare players such as the following.

In normal 5v5 situations Gardiner was depended on about as much as Phaneuf, but Phaneuf was relied on a lot more on special teams and a bit more when protecting a lead. Of course, you can also compare across teams with these charts:

Phaneuf and Chara were depended on almost equally in all situations except on the PP where Phaneuf was used far more frequently.

I am not sure where I will go with these charts but I think I’ll look at them from time to time as I am sure they will be of use in certain situations and I have a few ideas as to how to expand on them to make them even more interesting/useful.

 

Nov 082012
 

Eric T. over at NHL Numbers had a post last week summarizing the current state of our statistical knowledge with respect to accounting for zone start differences.  If you haven’t read it, definitely go read it, not only because it is a good read but because it concludes that how the majority of people have been doing it is wrong.

Overall, no two estimates are in direct agreement, but the analyses that are known to derive from looking directly at the outcomes immediately following a faceoff converge in the range of 0.25 to 0.4 Corsi shots per faceoff — one-third to one-half of the figure in widespread use. It is very likely that we have been overestimating the importance of faceoffs; they still represent a significant correction on shot differential, but perhaps not as large as has been previously assumed.

In the article Eric refers to my observation that eliminating the 10 seconds after a zone start effectively removes any effect that the zone start had on the game.  From there he combined my zone start adjusted data found at stats.hockeyanalysis.com with zone start data from behindthenet.ca and came up with an estimate that a zone start is worth 0.35 corsi.  He did this by subtracting the 10 second zone start adjusted corsi from standard 5v5 corsi and then running a regression against the extra offensive zone starts the player had.  In the comments I discussed some further analysis I did on this using my own data (i.e. not the stuff on behindthenet.ca) and came up with similar, though slightly different, numbers.  In any event I figured the content of that comment was worthy of its own post here.

So, when I did the correlation between extra offensive zone starts and the difference between 5v5 and 5v5 10 second zone start adjusted corsi I got the following (using all players with >1000 minutes of ice time over the last 5 seasons):

My calculations come up with a slope of 0.3043, which is a little below Eric’s. I don’t know the exact methodology he used, and that might explain the difference (i.e. I am not sure whether Eric used the complete 5 years of data or individual seasons).
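For anyone who wants to try this kind of fit on their own numbers, here is a minimal least-squares sketch in plain Python. It is an illustration under stated assumptions, not the exact calculation behind the slope above: I am assuming "extra offensive zone starts" means offensive minus defensive zone starts, that the response is standard 5v5 corsi minus the 10 second zone start adjusted corsi, and the sample data is made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up illustration data, one entry per player:
# extra offensive zone starts (assumed: OZ starts minus DZ starts) ...
extra_oz_starts = [-200, -50, 0, 120, 300]
# ... and standard 5v5 corsi minus 10 second zone start adjusted corsi.
corsi_difference = [-55, -20, 3, 40, 88]

slope, intercept, r_squared = fit_line(extra_oz_starts, corsi_difference)
print(f"slope={slope:.4f} intercept={intercept:.2f} r^2={r_squared:.2f}")
```

The slope is the estimated corsi value of one extra offensive zone start.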

What is interesting is that when I explored things further, I noticed that the results varied across positions, but varied very little across talent levels.  Here are some more correlations for different positions and ice time restrictions.

Position Slope r^2
All Players >1000 min. 0.30 0.55
Skaters >1000 min. 0.28 0.52
Forwards >1000 min. 0.26 0.50
Defensemen >1000 min. 0.33 0.57
Goalies >1000 min. 0.44 0.73
Forwards >500 min. 0.26 0.50
Forwards >2500 min. 0.26 0.52
Forwards 500-2500 min. 0.26 0.39

Two observations:

1.  The slope for forwards is less than the slope for defensemen which is (quite a bit) less than the slope for goalies.

2.  There is no variation in slope no matter what restrictions we put on a forward’s ice time.

There isn’t really much to say regarding the second observation except that it is nice to see consistency but the first observation is quite interesting.  Goalies, who have no impact on corsi, see the greatest zone start influences on corsi of any position.  It is a little odd but I think it addresses one of the concerns that Eric had pointed out in his article:

The next step would be to remove the last vestige of sampling bias from our analysis. The approaches that focus on the period immediately after the faceoff reduce the impact of teams’ tendency to use their best forwards in the offensive zone, but certainly do not remove it altogether.

I think that is exactly what we are witnessing here, but maybe more importantly teams put out their best defensive players and their best face off guys for defensive zone face offs. If David Steckel, who is an excellent face off guy, is getting all the defensive zone face offs, it is naturally going to suppress the corsi events immediately after the defensive zone face off because he is going to win the draw more often than not.  There is probably more line matching done for zone face offs than during regular play, so the line matching suppresses some of the zone start impact.  It is more difficult to line match when changing lines on the fly, so during regular play a good coach can more easily get favourable line matches. The result is that during normal 5v5 play offensive players might see a boost to their corsi (because they can exploit good matchups) while during offensive zone face offs they see their corsi suppressed because they will almost always be facing good defensive players and top face off guys.  Thus, the boost to corsi based on a zone start is not as extreme as it should be for offensive players.  The opposite is true for defensive players.

Defensemen are less often line matched so we see their corsi boost due to an offensive zone face off a little higher than that of forwards, but it isn’t nearly as high as that of goalies because there are defensemen that are primarily used in offensive situations and others that are primarily used in defensive situations.

Goalies though, tell us the real effect because they are always on the ice and they are not subject to any line matching.  In the table above you will notice that goalies have a significantly higher slope and an impressively high r^2.  I feel I have to post the chart of the correlation because it really is a nice chart to look at.

I have looked at a lot of correlations and charts in hockey stats but very few of them are as nice with as high a correlation as the chart above.

I believe that this is telling us that an offensive zone start is worth 0.44 corsi, but only when a player is playing against similarly defensively capable players as he would during regular 5v5 play, which I speculate above is not necessarily (or likely) the case.  The 0.44 adjustment really only applies to an idealized situation that doesn’t normally occur for any players other than goalies.  So where does that leave us?  Should we use a zone start adjustment of 0.44 corsi for all players, or should we use something like 0.33 for defensemen and 0.26 for forwards?  The answer isn’t so simple.  One could argue that we should apply 0.44 to all players and then make some sort of QoC adjustment and that would make some sense.  But if we are not intending to apply a QoC adjustment, does that mean we should use 0.33 and 0.26?  Maybe, but that is a little inconsistent because it would mean you are using a QoC adjustment only for the zone start adjustment of a player’s stats, and not for all his stats.  The answer for me is what I have been doing the past little while, which is to not even attempt to adjust a player’s stats based on zone start differences and rather simply ignore the portion of play that is subject to being influenced by zone starts: the 10 seconds after a zone start face off.  To me it seems like the simplest and easiest thing to do.
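To make the idea concrete, here is a rough Python sketch of dropping events that fall within 10 seconds of an offensive or defensive zone face off. This is not my production code: the event tuple layout, the zone labels, the choice to treat a neutral zone face off as ending the window, and the handling of an event at exactly 10 seconds are all assumptions made for illustration. A full version would also drop the matching ice time, which this sketch ignores.

```python
def drop_post_faceoff_events(events, window=10):
    """Remove non-faceoff events within `window` seconds of a zone face off.

    events: iterable of (time_in_seconds, kind, zone) tuples, where kind is
    e.g. "faceoff" or "shot" and zone is "off", "def" or "neu" (assumed labels).
    """
    kept = []
    last_zone_faceoff = None
    for time, kind, zone in sorted(events, key=lambda event: event[0]):
        if kind == "faceoff":
            # Only offensive/defensive zone draws open an exclusion window.
            # Assumption: a neutral zone draw closes any window still open.
            last_zone_faceoff = time if zone in ("off", "def") else None
            continue
        if last_zone_faceoff is not None and time - last_zone_faceoff < window:
            continue  # still inside the zone start window, so ignore it
        kept.append((time, kind, zone))
    return kept


# Made-up shift: the shot 4 seconds after the offensive zone draw is dropped.
sample = [
    (0, "faceoff", "off"),
    (4, "shot", "off"),
    (12, "shot", "off"),
    (30, "faceoff", "neu"),
    (33, "shot", "def"),
]
print(drop_post_faceoff_events(sample))
```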

 

Oct 302012
 

Offensive players generally get all of the attention but defensive players are often just as valuable to a team.  Ask any NHL fan who the top offensive centers in the league are and they will quickly rattle off a few names from Crosby to Stamkos to Getzlaf to Malkin, etc.  Ask a fan to list the top defensive centers and the task becomes a little more difficult.  So, I decided to look into defensive centers a little further.

What makes a valuable defensive center?  Well, they should play against tough competition, they should give up fewer goals than expected, and they should be trusted to play a lot on the penalty kill.  So, with that in mind, I decided to set the following parameters in my defensive center search.

1.  I limited myself to players who have played >2000 minutes of 5v5 zone start adjusted ice time over the past three seasons.

2.  I only considered players who had an average opposition goals for per 20 minutes of ice time above 0.800 (i.e. only consider players who played against tough offensive opponents, must have OppGF20>0.800).

3. I then eliminated all forwards with a goals against per 20 minutes of ice time >0.800 (i.e. eliminate players who didn’t get good defensive results, must have GA20<0.800).

4.  I then took each player’s on ice goals against rate and divided it by his line mates’ goals against rate to ensure that he is performing better than his line mates and making his line mates better defensively (GA20/TMGA20 < 1.00).

5.  I then eliminated any players who didn’t have >300 minutes of 4v5 PK ice time over the past 3 seasons.
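The five filters above can be sketched in a few lines of Python. The thresholds are the ones listed; the `Center` record, its field names and the sample players are hypothetical stand-ins, not my actual data or code.

```python
from dataclasses import dataclass


@dataclass
class Center:
    name: str
    toi_5v5: float   # 5v5 zone start adjusted minutes, past three seasons
    opp_gf20: float  # average opposition goals for per 20 minutes
    ga20: float      # on ice goals against per 20 minutes
    tm_ga20: float   # line mates' goals against per 20 minutes
    pk_toi: float    # 4v5 PK minutes, past three seasons

    @property
    def ga_ratio(self) -> float:
        """GA20/TMGA20: below 1.00 means better defensively than line mates."""
        return self.ga20 / self.tm_ga20


def defensive_centers(players):
    """Apply criteria 1 to 5 and sort by GA20/TMGA20, best first."""
    qualified = [
        p for p in players
        if p.toi_5v5 > 2000        # 1. enough 5v5 ice time
        and p.opp_gf20 > 0.800     # 2. tough offensive opposition
        and p.ga20 < 0.800         # 3. good defensive results
        and p.ga_ratio < 1.00      # 4. better than his line mates
        and p.pk_toi > 300         # 5. trusted on the penalty kill
    ]
    return sorted(qualified, key=lambda p: p.ga_ratio)


# Made-up players for illustration only.
sample = [
    Center("Center A", 2500, 0.82, 0.70, 0.80, 400),
    Center("Center B", 2600, 0.81, 0.65, 0.80, 350),
    Center("Center C", 2400, 0.83, 0.72, 0.85, 150),  # not enough PK time
    Center("Center D", 2100, 0.78, 0.60, 0.75, 500),  # weak opposition
    Center("Center E", 2200, 0.82, 0.76, 0.74, 320),  # worse than line mates
]
for player in defensive_centers(sample):
    print(f"{player.name}: GA20/TMGA20 = {player.ga_ratio:.3f}")
```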

After doing this I got the following list of players sorted by GA20/TMGA20, or in English, sorted by how much better defensively they were than their line mates.

  1. Brandon Sutter
  2. Samuel Pahlsson
  3. Mikko Koivu
  4. Frans Nielsen
  5. Travis Zajac
  6. Martin Hanzal
  7. Mike Richards
  8. Brooks Laich
  9. Jordan Staal
  10. Joe Pavelski

Honorable Mentions:  Logan Couture, Pavel Datsyuk, Mikhail Grabovski and Alexander Steen missed the cut due to not having enough PK minutes.  Couture would have been slotted second behind Sutter, Datsyuk between Pahlsson and Koivu, and Grabovski and Steen immediately after Hanzal.  Plekanec, Kopitar, Bergeron and Legwand met the PK ice time criteria and would come in after Pavelski except that their line mates had a better GA20 when not playing with them so they were cut from the list.

All in all I am pretty happy with the defensive forward list above.  They all make sense and the only real surprise on the list might be Frans Nielsen but that is mostly because I don’t pay attention to the Islanders (who does really?) and thus haven’t really paid much attention to him.  For a player on the lowly Islanders to meet these criteria it probably means he is a pretty good defensive player.

It is interesting to see Sutter and Jordan Staal both make this list as they were traded for each other this past summer.  When I compared these two players after the trade went down I suggested that Sutter is one of the best defensive forwards in the NHL and this certainly backs that up.

What do you think?  Am I missing someone from this list of elite defensive centers?

 

Oct 192012
 

There seems to be a lot of pessimism after the NHL walked out on negotiations with the NHLPA yesterday but the reality is that the NHL and NHLPA have come a long way.  The initial NHL offer to the NHLPA was that the players would get a 43% share of revenue.  The initial offer from the NHLPA to the NHL was that the players would subsidize a larger revenue sharing pool for 3 seasons through a reduction in their share of revenue but then bounce back to a 57% share in year 4.  As of yesterday, both the owners and players now agree that in the long term they should split revenues 50/50.  The disagreement is that the owners want the 50/50 share immediately while the players want to phase it in, in part to ensure existing contracts are honored in full (which is a bit of a bargaining/propaganda ploy because contract values were never guaranteed and were always tied to revenue and the CBA).

James Mirtle of The Globe and Mail has a good run down on the difference between the first two player proposals relative to the owners’ proposal.  In essence, the players’ proposals net the players an additional $500M (approximately) over the next several years before the 50/50 level is reached.  This is not insignificant but it only accounts for approximately 2.2% of the $22.5B in projected revenue over the term of the CBA, assuming 5% revenue growth per year.

The owners had a “make whole” agreement in their proposal which was designed to appease the players by honoring existing contracts but it was a bit of a marketing/propaganda ploy as well because essentially what it did was take salary from players a couple of years from now to make up the shortfall in the first two years of the CBA.  The owners’ proposal called this a “Deferred Compensation benefit” but in reality it was a “deferred claw back penalty.”

The solution to this mess, I believe, is for the owners to step up and volunteer to pay the make whole amount, which they estimated as being up to $149M in 2012-13 and up to $62M in 2013-14 for a total of up to $211M (nicely somewhat close to half of the extra money the players want).  As stated above, projected revenue over the 6 year term of the contract is $22.5B.  I propose the owners take responsibility for the make whole portion of their proposal and pay the deferred salary in the amount of 1% of overall revenue until the up to $211M is paid in full.  This will essentially peg the players’ share at 51% and the owners’ share at 49% until the $211M in deferred salary is paid in full, at which time it drops to a 50/50 split.  This seems like a perfectly reasonable compromise to me.  Now let’s get it done and get back to playing hockey.
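For the curious, here is a back-of-the-envelope Python sketch of how long a 1% payment would take to cover the make whole amount. The year-by-year revenue is an assumption on my part: it is back-solved so that 6 years of 5% growth sum to the $22.5B projection, which is not a figure from either proposal.

```python
TOTAL_REVENUE = 22_500.0    # $M, projected over the term of the CBA
YEARS = 6                   # term of the contract
GROWTH = 0.05               # projected revenue growth per year
MAKE_WHOLE = 149.0 + 62.0   # $M, the two "up to" make whole amounts

# Assumption: first-year revenue is back-solved so the 6 years sum to $22.5B.
growth_factors = [(1 + GROWTH) ** year for year in range(YEARS)]
first_year = TOTAL_REVENUE / sum(growth_factors)
revenues = [first_year * factor for factor in growth_factors]


def years_to_repay(revenues, owed, rate=0.01):
    """Number of seasons of paying `rate` of revenue needed to cover `owed`.

    Returns None if the amount is not covered within the given seasons.
    """
    paid = 0.0
    for year, revenue in enumerate(revenues, start=1):
        paid += rate * revenue
        if paid >= owed:
            return year
    return None


print(f"Assumed first-year revenue: ${first_year:,.0f}M")
print(f"Make whole total: ${MAKE_WHOLE:,.0f}M")
print(f"Seasons at 1% of revenue to repay: {years_to_repay(revenues, MAKE_WHOLE)}")
```

Under those assumptions 1% of revenue over the full term comes to about $225M, so the "up to" $211M would only be fully paid off in the final season.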