Apr 01, 2013
 

I have been on a bit of a mission recently to push the idea that quality of competition (and zone starts) is not a huge factor in one’s statistics and that most people overvalue its importance. I don’t know how often I hear arguments like “but he plays all the tough minutes” as an excuse for why a player has poor statistics, and pretty much every time I do I cringe because almost certainly the person making the argument has no clue how much those tough minutes impact a player’s statistics.

While thinking of how to do this study, and which players to look at, I was listening to a podcast and the name Pavel Datsyuk was brought up, so I decided I would take a look at him because, in addition to being mentioned in a podcast, he is a really good 2-way player who plays against pretty tough quality of competition. For this study I looked at 2010-12 two year data and Datsyuk has the 10th highest HART QoC during that time in 5v5 zone start adjusted situations.

The next step was to look at how Datsyuk performed against various types of opposition. To do this I took all of the opposing forwards Datsyuk had played at least 10 minutes of 5v5 ZS adjusted ice time against (you can find these players here), grouped them according to their HARO, HARD, CorHARO and CorHARD ratings, and looked at how Datsyuk’s on-ice stats looked against each group.

OppHARO TOI% GA20
>1.1 46.84% 0.918
0.9-1.1 34.37% 0.626
<0.9 18.79% 0.391

Let’s go through a quick explanation of the above table. I have grouped Datsyuk’s opponents by their HARO ratings into three groups: those with a HARO >1.1, those with a HARO between 0.9 and 1.1 and those with a HARO rating below 0.9. These groups represent strong offensive players, average offensive players and weak offensive players. Datsyuk played 46.84% of his ice time against the strong offensive player group, 34.37% against the average offensive player group and 18.79% against the weak offensive player group. The GA20 column is Datsyuk’s goals against rate, or essentially the goals for rate of Datsyuk’s opponents when playing against Datsyuk. As you can see, the strong offensive players do significantly better than the average offensive players who in turn do significantly better than the weak offensive players.
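The grouping described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names, the input layout (one record per opponent with his rating, head-to-head minutes and goals against) and the sample numbers are all hypothetical and are not the actual stats.hockeyanalysis.com data or code.

```python
from collections import defaultdict

def bucket(rating, low=0.9, high=1.1):
    """Assign an opponent to a group using the same cut-offs as the table."""
    if rating > high:
        return ">1.1"
    if rating < low:
        return "<0.9"
    return "0.9-1.1"

def split_by_opponent_rating(matchups):
    """matchups: iterable of (opp_rating, toi_minutes, goals_against).

    Returns {group: (toi_share, ga20)}, where ga20 is goals against
    per 20 minutes of ice time against that group.
    """
    toi = defaultdict(float)
    ga = defaultdict(float)
    for rating, minutes, goals_against in matchups:
        group = bucket(rating)
        toi[group] += minutes
        ga[group] += goals_against
    total = sum(toi.values())
    return {g: (toi[g] / total, ga[g] / toi[g] * 20.0) for g in toi}

# Hypothetical head-to-head records, not real data.
example = [
    (1.25, 40.0, 2),
    (1.15, 20.0, 1),
    (1.00, 30.0, 1),
    (0.80, 10.0, 0),
]
result = split_by_opponent_rating(example)
for group, (share, ga20) in result.items():
    print(f"{group:8s} TOI% {share:6.2%}  GA20 {ga20:.3f}")
```

Summing head-to-head minutes over opponents counts the same seconds more than once when several of them are on the ice together, so the shares here are shares of opponent-minutes. That is a simplification I chose for the sketch.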

Now, let’s look at how Datsyuk does offensively based on the defensive ability of his opponents.

OppHARD TOI% GF20
>1.1 35.39% 1.171
0.9-1.1 35.36% 0.994
<0.9 29.25% 1.004

Interestingly, the defensive quality of Datsyuk’s opponents did not have a significant impact on Datsyuk’s ability to generate offense which is kind of an odd result.

Here are the same tables but for corsi stats.

OppCorHARO TOI% CA20
>1.1 15.59% 15.44
0.9-1.1 77.79% 13.78
<0.9 6.63% 10.84

 

OppCorHARD TOI% CF20
>1.1 18.39% 15.89
0.9-1.1 68.81% 18.49
<0.9 12.80% 22.69

I realize that I should have tightened up the ratings splits to get a more even distribution in TOI% but I think we see the effect of QoC fine. When looking at corsi we do see that CF20 varies across defensive quality of opponent which we didn’t see with GF20.

From the tables above, we do see that quality of opponent can have a significant impact on a player’s statistics. When you are playing against good offensive opponents you are bound to give up a lot more goals than you will against weaker offensive opponents. The question that remains is whether players can and do play a significantly greater amount of time against good opponents compared to other players. To take a look at this I looked at the same tables above but for Valtteri Filppula, a player who rarely gets to play with Datsyuk and so in theory could have a significantly different set of opponents than Datsyuk. Here are the same tables above for Filppula.

OppHARO TOI% GA20
>1.1 42.52% 1.096
0.9-1.1 35.35% 0.716
<0.9 22.12% 0.838

 

OppHARD TOI% GF20
>1.1 32.79% 0.841
0.9-1.1 35.53% 1.197
<0.9 31.68% 1.370

 

OppCorHARO TOI% CA20
>1.1 12.88% 19.03
0.9-1.1 78.20% 16.16
<0.9 8.92% 14.40

 

OppCorHARD TOI% CF20
>1.1 20.89% 15.48
0.9-1.1 64.94% 17.16
<0.9 14.17% 19.09

Nothing too exciting or unexpected in those tables. What is more important is how the ice times differ from Datsyuk’s across groups and how those differences might affect Filppula’s statistics.

We see that Datsyuk plays a little bit more against good offensive players and a little bit less against weak offensive players and he also plays a little bit more against good defensive players and a little bit less against weak defensive players. If we assume that Filppula played Datsyuk’s distribution of ice time across the groups and that Datsyuk’s within group QoC ratings were the same as Filppula’s, we can calculate what Filppula’s stats would be against similar QoC.

Stat Actual w/ Datsyuk TOI
GF20 1.135 1.122
GA20 0.905 0.917
GF% 55.65% 55.02%
CF20 17.08 17.09
CA20 16.37 16.49
CF% 51.05% 50.90%

As you can see, that is not a huge difference. If we gave Filppula the same QoC as Datsyuk instead of being a 55.65% GF% player he’d be a 55.02% GF% player. That is hardly enough to worry about and the difference in CF% is even less.
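The re-weighting behind the "w/ Datsyuk TOI" column can be reproduced from the tables above. The sketch below takes Filppula’s per-group GF20 and GA20 and weights them by Datsyuk’s TOI% for the same groups; the function name `reweight` is my own and the approach is my reading of the calculation described, not the original code.

```python
def reweight(rates, toi_shares):
    """TOI-weighted average of per-group rates."""
    total = sum(toi_shares)
    return sum(r * w for r, w in zip(rates, toi_shares)) / total

# Groups are ordered >1.1, 0.9-1.1, <0.9 and copied from the tables above.
filppula_gf20 = [0.841, 1.197, 1.370]        # Filppula GF20 by opponent HARD group
filppula_ga20 = [1.096, 0.716, 0.838]        # Filppula GA20 by opponent HARO group
datsyuk_toi_vs_hard = [35.39, 35.36, 29.25]  # Datsyuk TOI% by opponent HARD group
datsyuk_toi_vs_haro = [46.84, 34.37, 18.79]  # Datsyuk TOI% by opponent HARO group

gf20 = reweight(filppula_gf20, datsyuk_toi_vs_hard)
ga20 = reweight(filppula_ga20, datsyuk_toi_vs_haro)
gf_pct = 100.0 * gf20 / (gf20 + ga20)
print(f"GF20 {gf20:.3f}  GA20 {ga20:.3f}  GF% {gf_pct:.2f}%")
```

Swapping in Filppula’s own TOI% values gives the "Actual" column to within rounding of the published group rates.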

From this and any other study I have looked at, I have found very little evidence that QoC has a significant impact on a player’s statistics. The argument that a player can have bad stats because he plays the ‘tough minutes’ is, in my opinion, a bogus argument. Player usage can have a small impact on a player’s statistics but it is not anything to be concerned with for the vast majority of players and it will never make a good player have bad statistics or a bad player have good statistics. Player usage charts (such as those found here or those found here) are interesting and pretty neat and do give you an idea of how a coach uses his players, but they are not a tool for justifying a player’s good, or poor, performance. ‘Tough minutes’ exist, but they are not all that important over the long haul.

 

 

Mar 20, 2013
 

I generally think that the majority of people give too much importance to quality of competition (QoC) and its impact on a player’s statistics, but if we are going to use QoC metrics let’s at least try to use the best ones available. In this post I will take a look at some QoC metrics that are available on stats.hockeyanalysis.com and explain why they might be better than those typically in use.

OppGF20, OppGA20, OppGF%

These three stats are the average GF20 (on ice goals for per 20 minutes), GA20 (on ice goals against per 20 minutes) and GF% (on ice GF / [on ice GF + on ice GA]) of all the opposition players that a player lined up against, weighted by ice time against. In fact, these stats go a bit further in that they remove the ice time the opponent players played against the player so that a player won’t influence his own QoC (not nearly as important as QoT but still a good thing to do). So, essentially these three stats are the goal scoring ability of the opposition players, the goal defending ability of the opposition players, and the overall value of the opposition players. Note that opposition goalies are not included in the calculation of OppGF20 as it is assumed the goalies have no influence on scoring goals.

The benefits of using these stats are that they are easy to understand and are in a unit (goals per 20 minutes of ice time) that is easily understood. OppGF20 is essentially how many goals we expect the player’s opponents would score on average per 20 minutes of ice time. The drawback of these stats is that if good players play against good players and bad players play against bad players, a good player and a bad player may have similar statistics but the good player is the better player because he did it against better quality opponents. There is no consideration for the context of the opponents’ statistics and that may matter.
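As a rough illustration of the definition above, here is how an OppGF20 style number could be computed: each opponent’s GF20 is taken from his ice time away from the player, and those rates are then averaged using ice time against the player as the weight. The dictionary keys and the sample numbers are hypothetical, and this is a sketch of the idea rather than the site’s actual implementation.

```python
def opp_gf20(opponents):
    """TOI-weighted average of opponents' GF20, excluding head-to-head time.

    Each opponent is a dict with (hypothetical) keys:
      toi    - opponent's total minutes
      gf     - opponent's total on-ice goals for
      toi_vs - minutes played against our player (used as the weight)
      gf_vs  - on-ice goals for scored while playing against our player
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for opp in opponents:
        toi_away = opp["toi"] - opp["toi_vs"]
        gf_away = opp["gf"] - opp["gf_vs"]
        if toi_away <= 0:
            continue  # no ice time away from our player, so no independent rating
        gf20_away = gf_away / toi_away * 20.0
        weighted_sum += gf20_away * opp["toi_vs"]
        weight_total += opp["toi_vs"]
    return weighted_sum / weight_total

# Hypothetical opponents, not real data.
sample = [
    {"toi": 1000.0, "gf": 45, "toi_vs": 100.0, "gf_vs": 5},
    {"toi": 800.0, "gf": 30, "toi_vs": 50.0, "gf_vs": 0},
]
value = opp_gf20(sample)
print(f"OppGF20 = {value:.3f}")
```

The same pattern would give OppGA20 by swapping goals for with goals against.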

Let’s take a look at the top 10 forwards in OppGF20 last season.

Player Team OppGF20
Patrick Dwyer Carolina 0.811
Brandon Sutter Carolina 0.811
Travis Moen Montreal 0.811
Carl Hagelin NY Rangers 0.806
Marcel Goc Florida 0.804
Tomas Plekanec Montreal 0.804
Brooks Laich Washington 0.800
Ryan Callahan NY Rangers 0.799
Patrik Elias New Jersey 0.798
Alexei Ponikarovsky New Jersey 0.795

You will notice that every single player is from the eastern conference. The reason for this is that the eastern conference is a more offensive conference. Taking a look at the top 10 players in OppGA20 will show the opposite.

Player Team OppGA20
Marcus Kruger Chicago 0.719
Jamal Mayers Chicago 0.720
Mark Letestu Columbus 0.721
Andrew Brunette Chicago 0.723
Andrew Cogliano Anaheim 0.723
Viktor Stalberg Chicago 0.724
Matt Halischuk Nashville 0.724
Kyle Chipchura Phoenix 0.724
Matt Beleskey Anaheim 0.724
Cory Emmerton Detroit 0.724

Now, what happens when we look at OppGF%?

Player Team OppGF%
Mike Fisher Nashville 51.6%
Martin Havlat San Jose 51.4%
Vaclav Prospal Columbus 51.3%
Mike Cammalleri Calgary 51.3%
Martin Erat Nashville 51.3%
Sergei Kostitsyn Nashville 51.3%
Dave Bolland Chicago 51.2%
Rick Nash Columbus 51.2%
Travis Moen Montreal 51.0%
Patrick Marleau San Jose 51.0%

There are predominantly western conference teams with a couple of eastern conference players mixed in. The reason for this western conference bias is that the western conference was the better conference and thus it makes sense that the QoC would be tougher for western conference players.

OppFF20, OppFA20, OppFF%

These are exactly the same stats as the goal based stats above but instead of using goals for/against/percentage they use fenwick for/against/percentage (fenwick is shots + shots that missed the net). I won’t go into details but you can find the top players in OppFF20 here, in OppFA20 here, and OppFF% here. You will find a lot of similarities to the OppGF20, OppGA20 and OppGF% lists but if you ask me which I think is a better QoC metric I’d lean towards the goal based ones. The reason for this is that the small sample size issues we see with goal statistics are not going to be nearly as significant in the QoC metrics because over all opponents luck will average out (for every unlucky opponent you are likely to have a lucky one to cancel out the effects). That said, if you are doing a fenwick based analysis it probably makes more sense to use a fenwick based QoC metric.

HARO QoC, HARD QoC, HART QoC

As stated above, one of the flaws of the above QoC metrics is that there is no consideration for the context of the opponents’ statistics. One of the ways around this is to use the HockeyAnalysis.com HARO (offense), HARD (defense) and HART (Total/Overall) ratings in calculating QoC. These are player ratings that take into account both quality of teammates and quality of competition (here is a brief explanation of what these ratings are). The HARO QoC, HARD QoC and HART QoC metrics are simply the average HARO, HARD and HART ratings of a player’s opponents.

Here are the top 10 forwards in HARO QoC last year:

Player Team HARO QoC
Patrick Dwyer Carolina 6.0
Brandon Sutter Carolina 5.9
Travis Moen Montreal 5.8
Tomas Plekanec Montreal 5.8
Marcel Goc Florida 5.6
Carl Hagelin NY Rangers 5.5
Ryan Callahan NY Rangers 5.3
Brooks Laich Washington 5.3
Michael Grabner NY Islanders 5.2
Patrik Elias New Jersey 5.2

There are a lot of similarities to the OppGF20 list with the eastern conference dominating. There are a few changes, but not too many, which really is not that big of a surprise to me. There is very little evidence that QoC has a significant impact on a player’s statistics, so accounting for the opponents’ own QoC will not have a significant impact on the opponents’ stats and thus not a significant impact on a player’s QoC. That said, I believe these should produce slightly better QoC ratings. Also note that a 6.0 HARO QoC indicates that the opponent players are expected to produce a 6.0% boost on the league average GF20.

Here are the top 10 forwards in HARD QoC last year:

Player Team HARD QoC
Jamal Mayers Chicago 6.0
Marcus Kruger Chicago 5.9
Mark Letestu Columbus 5.8
Tim Jackman Calgary 5.3
Colin Fraser Los Angeles 5.2
Cory Emmerton Detroit 5.2
Matt Beleskey Anaheim 5.2
Kyle Chipchura Phoenix 5.1
Andrew Brunette Chicago 5.1
Colton Gillies Columbus 5.0

And now the top 10 forwards in HART QoC last year:

Player Team HART QoC
Dave Bolland Chicago 3.2
Martin Havlat San Jose 3.0
Mark Letestu Columbus 2.5
Jeff Carter Los Angeles 2.5
Derick Brassard Columbus 2.5
Rick Nash Columbus 2.4
Mike Fisher Nashville 2.4
Vaclav Prospal Columbus 2.2
Ryan Getzlaf Anaheim 2.2
Viktor Stalberg Chicago 2.1

Shots and Corsi based QoC

You can also find similar QoC stats using shots as the base stat or using corsi (shots + shots that missed the net + shots that were blocked) on stats.hockeyanalysis.com but they are all the same as above so I’ll not go into them in any detail.

CorsiRel QoC

The most commonly used QoC metric currently seems to be CorsiRel QoC (found on behindthenet.ca) but in my opinion this is not so much a QoC metric as a ‘usage’ metric. CorsiRel is a statistic that compares the team’s corsi differential when the player is on the ice to the team’s corsi differential when the player is not on the ice. CorsiRel QoC is the average CorsiRel of all the player’s opponents.

The problem with CorsiRel is that good players on a bad team with little depth can put up really high CorsiRel stats compared to similarly good players on a good team with good depth because essentially it is comparing a player relative to his teammates. The more good teammates you have, the more difficult it is to put up a good CorsiRel. So, on any given team the players with a good CorsiRel are the best players on that team, but you can’t compare CorsiRel for players on different teams because the quality of the teams could be different.

CorsiRel QoC is essentially the average CorsiRel of all the player’s opponents but because CorsiRel is flawed, CorsiRel QoC ends up being flawed too. For players on the same team, the player with the highest CorsiRel QoC plays against the toughest competition so in this sense it tells us who is getting the toughest minutes on the team, but again CorsiRel QoC is not really that useful when comparing players across teams. For these reasons I consider CorsiRel QoC more of a tool to see the usage of a player compared to his teammates, and not in my opinion a true QoC metric.

I may be biased, but in my opinion there is no reason to use CorsiRel QoC anymore. Whether you use OppGF20, OppGA20, OppGF%, HARO QoC, HARD QoC, HART QoC, or any of their shot/fenwick/corsi variants, they should all produce better QoC measures that are comparable across teams (the lack of which is the major drawback of CorsiRel QoC).
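To make the definition concrete, here is a small sketch of CorsiRel and CorsiRel QoC. I am assuming the differentials are expressed per 60 minutes and that the opponent average is weighted by ice time against; neither detail is spelled out above, and the function names and numbers are purely illustrative.

```python
def corsi_rel(on_cf, on_ca, on_toi, off_cf, off_ca, off_toi):
    """On-ice corsi differential per 60 minus off-ice corsi differential per 60.

    The per-60 scaling is an assumption made for this sketch.
    """
    on_rate = (on_cf - on_ca) / on_toi * 60.0
    off_rate = (off_cf - off_ca) / off_toi * 60.0
    return on_rate - off_rate

def corsi_rel_qoc(opponents):
    """opponents: list of (opponent_corsi_rel, toi_against).

    Returns the TOI-weighted mean of the opponents' CorsiRel (weighting assumed).
    """
    total = sum(toi for _, toi in opponents)
    return sum(rel * toi for rel, toi in opponents) / total

# Hypothetical player: +100 corsi in 1000 on-ice minutes, team is -100 in 3000 minutes without him.
player_rel = corsi_rel(600, 500, 1000.0, 1500, 1600, 3000.0)
qoc = corsi_rel_qoc([(8.0, 30.0), (-4.0, 10.0)])
print(f"CorsiRel {player_rel:.1f}, CorsiRel QoC {qoc:.1f}")
```

Because the off-ice term depends entirely on the rest of the roster, the same on-ice numbers give a different CorsiRel on a deeper team.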

 

Mar 14, 2013
 

I often see people using zone starts and/or quality of competition as a way to justify any player’s unexpectedly poor or unexpectedly good play. Player X has a bad goal or corsi ratio because he plays all the tough minutes (i.e. the defensive zone starts and against the opposition’s best lines). I am pretty certain that quality of competition is vastly overemphasized (everyone plays against everyone to some extent) and is vastly overshadowed by individual skill and quality of teammates, and I think the same is true of zone starts.

Eric Tulsky at NHLNumbers.com posted a good review of the research into the zone start effects on corsi statistics and I recommend people give that a read. I want to look into the issue a little further though. Most of the attempts to identify the impact of zone starts on a player’s stats have been inferred by looking at the league-wide correlations or by actually counting how many shots are taken after a zone face off. Both of these have their faults. As Eric Tulsky pointed out, taking a correlation of every player’s corsi with their zone start stats doesn’t take into account that it is the top line players that usually get the offensive zone starts and thus this likely overestimates the impact as these players do take more shots regardless of their zone start. Eric Tulsky also took the time to count the number of fenwick events that occur between an offensive zone face off and the time the puck leaves the offensive zone and estimated that to be 0.31. This would imply that every extra offensive zone start a player takes is worth 0.31 fenwick events. Of course, this doesn’t take into account that the best offensive players in the league typically get more offensive zone starts but it also doesn’t consider what happens after the puck leaves the zone. If the puck leaves the zone under the opposing team’s control there is probably a negative fenwick effect for the next several seconds of play, reducing the 0.31 number further.

I want to get beyond these issues by taking a look at how zone starts affect individual players. I have previously argued that after 10 seconds of an offensive/defensive zone face off the majority of the benefit (or penalty) of an offensive (or defensive) zone face off has worn off. I wanted to take it a bit further to be sure that there is no residual effect and chose to conduct this analysis using a 45 second cut off. So, any time within 45 seconds of an offensive or defensive zone face off with no other stoppages in play will be eliminated in my face off adjusted data. This should eliminate pretty much every second of every shift that started with an offensive or defensive zone face off, leaving just the play that occurred after a neutral zone face off or on the fly changes. I am going to call this ice time F45 ice time and it will represent ice time that is not in any way affected by zone starts. With this in mind, I will take a look at the differences between straight 5v5 stats and the F45 stats and the differences will give me an indication of how significantly zone starts impact a player’s stats.

To do this I will look at both corsi for and corsi against stats on a per 20 minutes of ice time basis. It should be noted that corsi rates are about 7.5% higher during the F45 play (goal rates are ~15% higher!) so I will reduce the F45 corsi rates by 7.5% to account for this and conduct a fair comparison (previous zone start studies may have been impacted by this as well). Now, let’s take a look at eight players (Manny Malhotra, Dave Bolland, Brian Boyle, Jay McClement, Tanner Glass, Brandon Sutter, Adam Hall, and Taylor Pyatt) with an excess of defensive zone starts.
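The F45 filter can be sketched as follows. I am assuming the play-by-play has been reduced to a sorted list of face offs (time and zone) and a sorted list of shot attempt times; since every stoppage restarts with a face off, "no other stoppages" is treated as "the most recent face off was the offensive or defensive zone one". The data layout and names are hypothetical.

```python
from bisect import bisect_right

def f45_events(event_times, faceoffs, cutoff=45.0):
    """Keep events that are NOT within `cutoff` seconds of the most recent
    offensive ("O") or defensive ("D") zone face off.

    event_times: sorted game-clock seconds of shot attempts.
    faceoffs: sorted list of (time_seconds, zone), zone in {"O", "D", "N"}.
    """
    faceoff_times = [t for t, _ in faceoffs]
    kept = []
    for t in event_times:
        i = bisect_right(faceoff_times, t) - 1  # most recent face off at or before t
        if i >= 0:
            fo_time, zone = faceoffs[i]
            if zone in ("O", "D") and t - fo_time < cutoff:
                continue  # still inside the zone start window, so drop it
        kept.append(t)
    return kept

# Hypothetical sequence, not real play-by-play.
faceoffs = [(0, "N"), (100, "O"), (200, "D"), (230, "N")]
events = [10, 120, 150, 210, 235, 250]
kept = f45_events(events, faceoffs)
print(kept)
```

The ice time denominator would need the same treatment, removing the seconds inside each window, before any per 20 minute rates are computed.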

Player OZ% DZ% OZ%-DZ% FF20 FA20 FF%
Malhotra 12.2 54.6 -42.4 -3.09% 1.09% -1.0%
Bolland 19.8 40.5 -20.7 8.94% -5.25% 3.5%
B. Boyle 21.0 40.2 -19.2 2.87% 8.74% 0.3%
McClement 24.8 41.9 -17.1 -0.31% 1.34% -0.4%
Glass 20.5 37.1 -16.6 4.39% -6.00% 2.6%
Sutter 23.1 36.6 -13.5 -2.67% 2.32% -1.2%
Hall 20.7 33.9 -13.2 -4.06% 4.59% -2.2%
Pyatt 24.0 36.4 -12.4 0.38% -0.25% 0.2%
Average 20.8 40.2 -19.4 0.81% 0.82% 0.23%

The FF20 and FA20 columns show the % change from 5v5 play to F45 play and the FF% column shows the change in FF% (F45 FF% minus 5v5 FF%). The averages are a straight average, not weighted for ice time or zone starts. For players that have a significant defensive zone bias we would expect their F45 play to exhibit an increase in FF20 and a decrease in FA20 resulting in an increase in FF%. In bold are the circumstances where this in fact did happen. As you can see, this isn’t the majority of the time. It is actually kind of surprising that these heavily defensive zone start biased players didn’t see a significant and systematic improvement in their fenwick rates.

Now, let’s take a look at eight players (Henrik Sedin, Patrick Kane, Marian Gaborik, Justin Abdelkader, Kyle Wellwood, Thomas Vanek, John Tavares, Jason Arnott) who had a heavy offensive zone start bias.
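For reference, the % change columns can be thought of as below. I am assuming the 7.5% reduction is applied by multiplying the F45 rate by 0.925 (dividing by 1.075 would be the other reasonable reading) and that the same factor is used for the fenwick rates in the table. The input rates are made up.

```python
F45_INFLATION = 0.075  # from the text: shot attempt rates run about 7.5% higher in F45 play

def adjusted_f45_rate(f45_rate, inflation=F45_INFLATION):
    """Reduce an F45 rate by the stated percentage (multiplicative reading, assumed)."""
    return f45_rate * (1.0 - inflation)

def pct_change(rate_5v5, f45_rate):
    """Percent change going from the 5v5 rate to the adjusted F45 rate."""
    return (adjusted_f45_rate(f45_rate) - rate_5v5) / rate_5v5 * 100.0

# Hypothetical rates per 20 minutes, not taken from the table.
change = pct_change(14.0, 16.0)
print(f"{change:+.2f}%")
```

A positive value in the FF20 column therefore means the player generated more once zone start time was removed, and a positive value in the FA20 column means he gave up more.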

Player OZ% DZ% OZ%-DZ% FF20 FA20 FF%
H. Sedin 49.3 16.2 33.1 -3.72% 1.81% -1.4%
P. Kane 41.4 20.3 21.1 5.94% 4.66% 0.3%
Gaborik 39.0 22.8 16.2 0.60% 2.32% -0.4%
Abdelkader 37.5 26.0 11.5 3.93% 3.49% 0.1%
K. Wellwood 36.9 27.6 9.3 4.54% -2.32% 1.7%
Vanek 36.2 27.2 9.0 -3.39% 1.06% -1.1%
Tavares 35.8 27.2 8.6 -2.39% 1.83% -1.0%
Arnott 36.4 28.0 8.4 -3.41% 1.81% -1.3%
Average 39.1 24.4 14.7 0.26% 1.83% -0.39%

For offensive zone start biased players we would expect to see their FF20 decrease, FA20 increase and FF% decrease when we remove their zone start bias. This is mostly true for FA20 (only Wellwood deviated from expectations) but less true for FF20 and FF% and overall the adjustments were relatively minor. Henrik Sedin had the greatest negative impact to his FF% but it only took him from a 55.2% fenwick player to a 53.8% fenwick player which is still pretty good. This could very well be an upper bound on the benefit of excessive offensive zone starts.

Eric Tulsky also presented a paper at the recent Sloan Sports Analytics Conference in which he suggested that a successful zone entry via carrying the puck in is worth upwards of 0.60 fenwick and upwards of 0.28 fenwick on a dump in. As pointed out earlier, Eric Tulsky counted 0.31 fenwick between an offensive zone face off and the puck clearing the zone, so if the other team is clearing the zone with control of the puck, it is certainly possible that they will generate almost as many shots on their subsequent counter-rush, essentially negating much of the benefit of the offensive zone start. Without studying zone exits and how frequently zone exits result in successful zone entries into the opposing team’s end we won’t know for sure, but the data shown above indicates that this might be the case.

The next question that might be worth exploring is, if there is no significant benefit to starting your offensive players in the offensive zone, is there a penalty? For example, might it be better for the Canucks to start the Sedins solely in the defensive and neutral zones on the theory that their talent with the puck will allow them to more frequently carry the puck into the offensive zone which, as Eric Tulsky showed, more frequently results in shots and goals? I am not certain of that but it might be worthy of further investigation. I suspect again any benefit/penalty of any zone start deployment will largely be overshadowed by the players’ individual ability and the quality of their line mates. The ability to win puck battles, control the puck and move it up the ice is the real driver of stats, not usage of the player.

All of this is to say that coaching strategy (at least player usage strategy) is probably not a significant factor in the statistical performance of the players or the outcomes of games and I suspect, as I previously found, the majority of the benefit of an offensive zone start comes from those situations where you win a face off and take a shot resulting in a goal or the goalie catching it or covering it for another face off. If the play goes beyond that, individual talent (puck retrieval for example) takes over and the opposition will get an opportunity to counter attack. This is why, as I previously determined, eliminating the first 10 seconds after a face off is sufficient for eliminating the majority of the effects of a zone start and even then, the effects are probably not as significant as we think they should be.

 

Mar 11, 2013
 

There has been a fair bit of talk recently about Tyler Bozak and what the Leafs should do with him as he is clearly not suited for his #1C role but is set to be a UFA this summer and if the Leafs intend to keep him he’ll need a new contract.  To get an idea of his worth, I decided to see if I could identify a few comparable players.

Let’s start off offensively. The first thing I looked at was primary points per 60 minutes of 5v5 ice time (primary points = goals + first assists). From last year through this past weekend’s games Bozak had a PrPts/60 of 1.085 so as an initial cut off I pared down the list of comparable players to forwards with a PrPts/60 of between 1.00 and 1.20 and who have had at least 1000 minutes of ice time. There are some pretty good players in this list such as Ryan Getzlaf, Stephen Weiss, Tomas Plekanec and Daniel Briere but there are some less talented players like Eric Nystrom and Marcel Goc.

The next thing I considered is Primary Points Percentage (PrPts%), or the percentage of goals scored while the player was on the ice on which the player earned a primary point. Tyler Bozak’s PrPts% is a relatively weak 41.24% (Getzlaf’s, for example, is 52.38% and Plekanec’s is 56.22%). I then pared down the list to just include centers and this is what I came up with as comparable offensive centers, sorted by PrPts%.

Player Team PrPts/60 PrPts%
NIELSEN, FRANS NY Islanders 1.091 47.98%
SMITH, ZACK Ottawa 1.008 46.67%
VERMETTE, ANTOINE Phoenix 1.173 46.55%
LETESTU, MARK Columbus 1.138 46.32%
NUGENT-HOPKINS, RYAN Edmonton 1.182 46.14%
ZUBRUS, DAINIUS New Jersey 1.12 45.31%
KRUGER, MARCUS Chicago 1.115 43.78%
HANZAL, MARTIN Phoenix 1.078 42.27%
STAJAN, MATT Calgary 1.064 41.87%
BOZAK, TYLER Toronto 1.085 41.24%
KOIVU, SAKU Anaheim 1.15 38.49%
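The two offensive measures in the table above are simple ratios. The sketch below shows one way to compute them, reading PrPts% as primary points divided by on-ice goals for; that reading, the function names and the sample numbers are my assumptions for illustration only.

```python
def primary_points_per_60(goals, first_assists, toi_minutes):
    """Primary points (goals + first assists) per 60 minutes of ice time."""
    return (goals + first_assists) / toi_minutes * 60.0

def primary_points_pct(goals, first_assists, on_ice_goals_for):
    """Share of on-ice goals for on which the player had a primary point, in percent."""
    return (goals + first_assists) / on_ice_goals_for * 100.0

# Hypothetical player: 12 goals, 8 first assists, 1100 minutes, 48 on-ice goals for.
rate = primary_points_per_60(12, 8, 1100.0)
share = primary_points_pct(12, 8, 48)
print(f"PrPts/60 {rate:.3f}  PrPts% {share:.2f}%")

# The initial cut off described in the text: 1.00 to 1.20 PrPts/60 and at least 1000 minutes.
is_comparable = 1.00 <= rate <= 1.20 and 1100.0 >= 1000.0
```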

That is a list of mostly 2nd and 3rd line centers along with the not yet fully developed Nugent-Hopkins. So, what about Bozak defensively? To evaluate defensive play I looked at the player’s 5v5 corsi events against per 20 minutes (CA20) and the ratio of the player’s CA20 vs his team mates’ CA20 when they are not playing with him (TMCA20). This gives us an indication of whether their team mates are improving their defensive stats while on the ice with the player.

Player Name Team CA20 CA20/TMCA20
ZUBRUS, DAINIUS New Jersey 14.309 0.77
LETESTU, MARK Columbus 17.034 0.90
STAJAN, MATT Calgary 17.312 0.91
HANZAL, MARTIN Phoenix 18.122 0.93
VERMETTE, ANTOINE Phoenix 17.762 0.97
NIELSEN, FRANS NY Islanders 18.307 1.01
KOIVU, SAKU Anaheim 17.114 1.02
SMITH, ZACK Ottawa 18.771 1.04
KRUGER, MARCUS Chicago 15.940 1.05
BOZAK, TYLER Toronto 21.155 1.08

For CA20/TMCA20, the lower the number the better as this indicates their line mates’ CA20 is better with the player than not with the player. Bozak ranks dead last in this category and also ranks dead last (by a significant margin) in CA20.

So, what does this tell us about Tyler Bozak? Well, it probably means he has 3rd line offensive ability but it is very questionable whether he is good enough defensively to be a useful 3rd liner. As for the best comparable to Tyler Bozak, I’d have to say either Marcus Kruger or Matt Stajan or maybe Frans Nielsen but Bozak is probably somewhat below all of them in terms of value due to his poor defensive play.
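As a quick illustration of the ratio, the sketch below computes CA20 for a player and for his team mates when apart from him, then divides the two. The function names and the sample counts are hypothetical, and how team mates’ time apart is pooled is not described here, so the inputs are simply assumed totals.

```python
def per20(events, toi_minutes):
    """Events per 20 minutes of ice time."""
    return events / toi_minutes * 20.0

def ca20_ratio(ca_with, toi_with, tm_ca_without, tm_toi_without):
    """CA20 with the player on the ice divided by team mates' CA20 without him.

    Below 1.0 means team mates allow fewer shot attempts when playing with him.
    """
    return per20(ca_with, toi_with) / per20(tm_ca_without, tm_toi_without)

# Hypothetical totals, not real data.
ratio = ca20_ratio(1050, 1000.0, 980, 1000.0)
print(f"CA20/TMCA20 = {ratio:.2f}")
```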

 

Mar 06, 2013
 

One of the surprise player performances so far this season is that of Jakub Voracek. Voracek currently sits tied for 7th in points with 10 goals and 27 points in 24 games.  That puts him on pace to score 54 points in this lock-out shortened 48 game season which is 4 points more than he has scored in any 82 game season (career best was  50 points in 2009-10 in 81 games).

Last season when Rick Nash was on the trade block I wrote an article about Nash and in it I had a few comments about Jakub Voracek as part of a WOWY analysis. Here is what I wrote:

Nash played best when he was paired up with Voracek and Brassard and only Voracek, Brassard and Huselius made Nash a better offensive player when playing with him.  Vermette, Umberger and Malhotra were drags on his offensive numbers.  When playing apart, Voracek’s numbers are better than Nash’s.  Same for Brassard’s (who is doing it again this year, 0.782 GF20 vs Nash’s 0.613 when apart).  As an aside, the numbers suggest that Voracek is a very good offensive player  and it was probably a big mistake to trade him.  It also suggest that the Flyers aren’t getting full value from him by playing him primarily with Maxime Talbot.  If someone acquired Voracek and put him in the right situations, he could be the next Joffrey Lupul.

Voracek wasn’t traded but the departure of James van Riemsdyk and Jaromir Jagr opened up some spots on the top two lines and Voracek got a promotion from playing mostly with Talbot to playing with Claude Giroux and getting lots of powerplay time. The results of that move are, as I predicted, very Joffrey Lupul like. Lupul put up solid but unspectacular numbers while mostly being given second line minutes and secondary power play minutes for the majority of his career. Lupul’s numbers looked unspectacular but were actually quite good considering his usage as a secondary offensive player and the quality of line mates he played with. When Lupul came to Toronto and was put on a line with another elite offensive player, given first line minutes, and first power play unit minutes, he started putting up high end offensive numbers. It wasn’t so much that Lupul had a break out season or that he had a career year, it’s more that he was finally given an opportunity to play with top end talent and given first line minutes. The exact same thing happened with Voracek. He put up solid numbers while given secondary minutes in secondary offensive roles and just needed to be given a chance to prove his worth as a first line player with quality line mates. Now he has been given that chance and the results are clear. He is a high end offensive talent.

 

Feb 27, 2013
 

The last several days I have been playing around a fair bit with team data and analyzing various metrics for their usefulness in predicting future outcomes and I have come across some interesting observations. Specifically, with more years of data, fenwick becomes significantly less important/valuable while goals and the percentages become more important/valuable. Let me explain.

Let’s first look at the year over year correlations in the various stats themselves.

Stat Y1 vs Y2 Y12 vs Y34 Y123 vs Y45
FF% 0.3334 0.2447 0.1937
FF60 0.2414 0.1635 0.0976
FA60 0.3714 0.2743 0.3224
GF% 0.1891 0.2494 0.3514
GF60 0.0409 0.1468 0.1854
GA60 0.1953 0.3669 0.4476
Sh% 0.0002 0.0117 0.0047
Sv% 0.1278 0.2954 0.3350
PDO 0.0551 0.0564 0.1127
RegPts 0.2664 0.3890 0.3744

The above table shows the r^2 between past events and future events. The Y1 vs Y2 column is the r^2 between subsequent years (i.e. 0708 vs 0809, 0809 vs 0910, 0910 vs 1011, 1011 vs 1112). The Y12 vs Y34 column is a 2 year vs 2 year r^2 (i.e. 07-09 vs 09-11 and 08-10 vs 10-12) and the Y123 vs Y45 column is the 3 year vs 2 year comparison (i.e. 07-10 vs 10-12). RegPts is points earned during regulation play (using win-loss-tie point system).

As you can see, with increased sample size, the fenwick stats’ ability to predict future fenwick stats diminishes, particularly for fenwick for and fenwick %. All the other stats generally get better with increased sample size, except for shooting percentage which has no predictive power of future shooting percentage.
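For anyone wanting to reproduce this kind of table, here is a minimal sketch of the Y1 vs Y2 calculation. I am assuming the season pairs are pooled into one set of (past, future) points per team before computing r^2; the pooling, the helper names and the sample values are my own assumptions and not the original code or data.

```python
def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

def consecutive_pairs(by_season, seasons):
    """Pool (past, future) values for every team across consecutive seasons."""
    xs, ys = [], []
    for prev, nxt in zip(seasons, seasons[1:]):
        for team, value in by_season[prev].items():
            if team in by_season[nxt]:
                xs.append(value)
                ys.append(by_season[nxt][team])
    return xs, ys

# Hypothetical FF% values for three teams over three seasons.
ff_pct = {
    "0708": {"A": 50.0, "B": 48.0, "C": 52.0},
    "0809": {"A": 51.0, "B": 47.0, "C": 53.0},
    "0910": {"A": 49.0, "B": 48.0, "C": 52.0},
}
xs, ys = consecutive_pairs(ff_pct, ["0708", "0809", "0910"])
print(f"Y1 vs Y2 r^2 = {r_squared(xs, ys):.4f} from {len(xs)} team-season pairs")
```

The multi-year columns would follow the same pattern after first combining each team’s seasons into 2 or 3 year totals.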

The increased predictive nature of the goal and percentage stats with increased sample size makes perfect sense as the increased sample size will decrease the random variability of these stats but I have no definitive explanation as to why the fenwick stats can’t maintain their predictive ability with increased sample sizes.

Let’s take a look at how well each statistic correlates with regulation points using various sample sizes.

Stat 1 year 2 year 3 year 4 year 5 year
FF% 0.3030 0.4360 0.5383 0.5541 0.5461
GF% 0.7022 0.7919 0.8354 0.8525 0.8685
Sh% 0.0672 0.0662 0.0477 0.0435 0.0529
Sv% 0.2179 0.2482 0.2515 0.2958 0.3221
PDO 0.2956 0.2913 0.2948 0.3393 0.3937
GF60 0.2505 0.3411 0.3404 0.3302 0.3226
GA60 0.4575 0.5831 0.6418 0.6721 0.6794
FF60 0.1954 0.3058 0.3655 0.4026 0.3951
FA60 0.1788 0.2638 0.3531 0.3480 0.3357

Again, the values are r^2 with regulation points.  Nothing too surprising there except maybe that team shooting percentage is so poorly correlated with winning because at the individual level it is clear that shooting percentages are highly correlated with goal scoring. It seems apparent from the table above that team save percentage is a significant factor in winning (or as my fellow Leaf fans can attest to, lack of save percentage is a significant factor in losing).

The final table I want to look at is how well a few of the stats are at predicting future regulation time point totals.

Stat Y1 vs Y2 Y12 vs Y34 Y123 vs Y45
FF% 0.2500 0.2257 0.1622
GF% 0.2214 0.3187 0.3429
PDO 0.0256 0.0534 0.1212
RegPts 0.2664 0.3890 0.3744

The values are r^2 with future regulation point totals. Regardless of the time frame used, past regulation point totals are the best predictor of future regulation point totals. Single season FF% is slightly better than GF% at predicting the following season’s regulation point total (0.2500 vs 0.2214), but with 2 or more years of data GF% becomes a significantly better predictor as the predictive ability of GF% improves and that of FF% declines. This makes sense, as we observed earlier that increasing the sample size improves GF%’s ability to predict future GF% while FF% gets worse, and that GF% is more highly correlated with regulation point totals than FF%.

One thing that is clear from the above tables is that defense has been far more important to winning than offense. Regardless of whether we look at GF60, FF60, or Sh% their level of importance trails their defensive counterpart (GA60, FA60 and Sv%), usually significantly. The defensive stats more highly correlate with winning and are more consistent from year to year. Defense and goaltending wins in the NHL.

What is interesting though is that this largely differs from what we see at the individual level. At the individual level there is much more variation in the offensive stats, indicating individual players have more control over the offensive side of the game. This might suggest that team philosophies drive the defensive side of the game (i.e. how defensive minded the team is, the playing style, etc.) but the offensive side of the game is dominated more by the offensive skill level of the individual players. At the very least it is something worthy of further investigation.

The last takeaway from this analysis is the declining predictive value of fenwick/corsi with increased sample size. I am not quite sure what to make of this. If anyone has any theories I’d be interested in hearing them. One theory I have is that fenwick rates are not a part of the average GM’s player personnel decisions and thus over time, as players come and go, a team’s fenwick rates will begin to vary. If this is the case, then this may represent an area of value that a GM could exploit.

 

Feb 182013
 

I have some new and exciting enhancements to stats.hockeyanalysis.com for you all today. Charts, Charts, and more Charts.

Before we get to the charts though, let me also mention that I have made some modifications to my HARO, HARD and HART ratings. Most of the change is to the scale and presentation and not so much to the actual formula (though there were some tweaks there too). Instead of 1.00 representing an average hockey player, 0 now does, and the scale has been multiplied by 100 to represent a percentage as opposed to a ratio. So now one should interpret [Shot,Fenwick,Corsi]HARO offensive ratings to mean that when the player was on the ice his team had x% (where x is his rating) more goals [shots, fenwick, corsi] for than expected (as determined by his quality of teammates and quality of competition). This means that a positive value means more goals were scored than expected and a negative value means fewer goals were scored than expected. A positive value indicates the player boosted his team’s offensive performance while a negative value means he was a drag on his team’s offense.

For defensive [Shot,Fenwick,Corsi]HARD ratings the effect is opposite. One should interpret the HARD ratings to mean that when the player is on the ice his team gave up x% (where x is his rating) fewer goals [shots, fenwick, corsi] than expected (as determined by quality of teammates and opposition). So, a 10 HARO rating indicates the player boosted his team’s expected goal scoring rate by 10% and a 10 HARD rating indicates the player reduced his team’s expected goals against rate by 10%. The [Shot,Fenwick,Corsi]HART ratings are simply the average of the HARO and HARD ratings.
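To make the new scale concrete, here is a small sketch of the interpretation described above. It only covers the final step of turning actual and expected totals into a rating. How the expected totals are derived from quality of teammates and competition is not shown here, so they are simply inputs, and the function names are hypothetical.

```python
def haro(actual_for, expected_for):
    """Percent more goals (or shots, fenwick, corsi) for than expected; 0 is average."""
    return (actual_for / expected_for - 1.0) * 100.0


def hard(actual_against, expected_against):
    """Percent fewer goals (or shots, fenwick, corsi) against than expected; 0 is average."""
    return (1.0 - actual_against / expected_against) * 100.0


def hart(haro_rating, hard_rating):
    """HART is the simple average of the HARO and HARD ratings."""
    return (haro_rating + hard_rating) / 2.0


# Made-up example: 22 goals for when 20 were expected,
# and 18 goals against when 20 were expected.
offense = haro(22, 20)
defense = hard(18, 20)
print(round(offense, 1), round(defense, 1), round(hart(offense, defense), 1))
```

Note the sign flip on the defensive side: giving up fewer goals than expected produces a positive HARD rating, so positive is good on both scales.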

Now on to the more exciting news, the charts. We all love charts so I have added a bunch for you all to enjoy. When you go to a player page now (e.g. Zdeno Chara) you will find a link named Visualize performance over time. Clicking this link will give you a visual representation of the player’s performance over the past several seasons, starting in 2007-08 if his career was active then. For example, here are Zdeno Chara’s performance charts. For forwards and defensemen there are 5 charts.

  1. Point production (G/60, A/60, First A/60 and Points/60)
  2. Individual shot, fenwick and corsi rates (shot/60, ifenwick/60, icorsi/60)
  3. HARO, HARD, FenHARO and FenHARD ratings
  4. GoalsFor%, ShotsFor%, FenwickFor% and CorsiFor%
  5. Zone Start %

This should give you a quick visualization of each player’s performance and how it has changed over time.

For goalies (e.g. Roberto Luongo) the only chart I have right now is 5v5 Zone Start Adjusted Save Percentage.

Maybe the charts that will generate the most interest though are the new WOWY charts (sure to make you scream “WOWY!!!”). To access the WOWY charts you simply need to go to a WOWY data page and click on the “Visualize This Table” link at the top of the WOWY table (only for ‘with you’ WOWY, not ‘against you’). This will give you two WOWY bubble charts. The first one plots teammate ‘with you’ GF% across the horizontal axis and teammate ‘without you’ GF% across the vertical axis. The second chart is the same but plots CF% instead. The size of each bubble is relative to the total TOI With.

In these plots good players will have the majority of their teammates’ bubbles show up below or to the right of the diagonal line from the bottom left corner to the top right corner and bad players will have the majority of their teammates above or to the left of that line. Players with a lot of teammates in the bottom right quadrant are really good because they are taking sub par players and making them look good. Players with a lot of teammates in the upper left quadrant are bad because they make good players look bad.
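The reading of the bubble charts described above can be sketched in a few lines. The field names, the made-up teammate numbers and the use of 50% as the dividing line between quadrants are all assumptions for illustration. Weighting by TOI With mirrors the bubble sizes.

```python
def wowy_side(with_pct, without_pct):
    """Which side of the bottom-left to top-right diagonal a bubble falls on."""
    if with_pct > without_pct:
        return "below"  # teammate does better with the player: good sign
    if with_pct < without_pct:
        return "above"  # teammate does better without the player: bad sign
    return "on"


def quadrant(with_pct, without_pct, mid=50.0):
    """Quadrant of the chart, assuming 50% splits each axis."""
    horizontal = "right" if with_pct >= mid else "left"
    vertical = "upper" if without_pct >= mid else "bottom"
    return vertical + " " + horizontal


def toi_share_below(teammates):
    """Share of shared ice time spent with teammates who sit below the diagonal."""
    total = sum(t["toi_with"] for t in teammates)
    below = sum(t["toi_with"] for t in teammates
                if wowy_side(t["with_pct"], t["without_pct"]) == "below")
    return below / total


# Hypothetical teammates: GF% with and without the player, and minutes together.
teammates = [
    {"name": "Teammate 1", "with_pct": 56.0, "without_pct": 47.0, "toi_with": 900},
    {"name": "Teammate 2", "with_pct": 52.0, "without_pct": 54.0, "toi_with": 300},
    {"name": "Teammate 3", "with_pct": 58.0, "without_pct": 51.0, "toi_with": 600},
]

for t in teammates:
    print(t["name"], wowy_side(t["with_pct"], t["without_pct"]),
          quadrant(t["with_pct"], t["without_pct"]))
print(round(toi_share_below(teammates), 3))
```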

For a look at two polar opposite players, take a look at Zdeno Chara’s WOWY charts compared to Jack Johnson’s WOWY charts (I have linked to the 3 year 5v5 ZS adjusted WOWY charts). Also, on Saturday I wrote a post about how bad Tyler Bozak is and if you want more evidence of that have a look at his 2 year WOWY charts. I am slowly becoming a big believer that WOWYs are where it is at in evaluating players (though I guess I have always been a believer as this is the core of my HARO, HARD, and HART ratings). The great players are the ones who consistently make their teammates better. The good players are the ones who can really capitalize on playing with great players and don’t hold them back. The bad players are those who act as drags on their teammates. These WOWY charts are a quick and easy way of visualizing the different types of players. For the Leafs, Grabovski fits into the ‘great’ category, Kessel into the ‘good’ category and Bozak into the ‘bad’ category.

I have a few more ideas for charts and tables to add (I’ve got some ideas for some more ‘usage’ type charts) but I think this will be the last major update for a while. That said, if you have any ideas of what you would like to see added definitely let me know and I’ll see what I can do. As for updating of the 2012-13 stats, it should be noted that they aren’t updated daily. I have been trying (fairly successfully so far) to update them every Monday, Wednesday and Friday morning and I hope to continue that but no guarantees.

Update: I know I said I wouldn’t do any more updates but I have made the WOWY charts better by adding WOWY charts for GF20, GA20, CF20 and CA20. Now we can easily see where a player’s strengths and weaknesses are (i.e. offense vs defense).

 

Feb 012013
 

Last week I introduced player TOI usage charts and one use I thought they had was to look at how a player’s usage changes during the downside of his career. Today I will do just that by looking at Nicklas Lidstrom’s TOI charts over the last 5 seasons. Consider this an extension of my earlier article where I took a look at the last few seasons of Lidstrom’s career. Let’s get right at it with his 5v5 chart.

LidstromTOIChart

 

Lidstrom’s last big season was clearly 2007-08 and every year since he has been below his 2007-08 levels in terms of 5v5 ice time. What is interesting to note is how little (relatively) ice time he had during the 2010-11 season, the year he won the Norris Trophy. I think it was a big mistake that he was awarded the trophy that season and this is just a little more evidence of that. In fact, Lidstrom was 4th among Red Wings defensemen in ESTOI/Game, which is why his TOI% in the chart above was so low that year. Rafalski retired in the summer of 2011, which meant Lidstrom would get a boost to his ice time in 2011-12.

So, what about his special teams play?

LidstromPPPKTOIChart

On the power play, Lidstrom maintained his level of playing ~60% of his team’s 5v4 power play minutes but his penalty kill ice time dropped significantly over the final 2 seasons of his career.

Based on the above charts, the last year I think you could consider Lidstrom a true heavy work load stud of a defenseman was in 2007-08. He was still awfully good for a couple more years and quite good until he retired but his slow decline in ice time had begun.

 

Jan 302013
 

For those familiar with my history, I have been a big proponent that there is more to the game of hockey than corsi and that players can certainly drive on-ice shooting percentage. I have not done much work at the team level, but now that I have team stats up at stats.hockeyanalysis.com I figured I’d take a look.

Since shooting percentages can vary significantly over small sample sizes, my goal was to use the largest sample size possible. As such, I used 5 years of team data (2007-08 through 2011-12) and looked at each team’s shooting and save percentages over that time. During those 5 years Vancouver led all teams in 5v5 ZS adjusted shooting percentage at 10.69% while Columbus trailed all teams with an 8.61% shooting percentage. What’s interesting to note is the top 6 teams are Vancouver, Washington, Chicago, Philadelphia, Boston and Pittsburgh, all what we would consider the teams with the best offensive talent in the league. Meanwhile, the bottom 5 teams are Columbus, Los Angeles, Phoenix, Carolina, and Minnesota, all teams (except maybe Carolina) more associated with defensive play and a defense-first system.

As far as save percentage goes, Phoenix led the league with a 91.83% save percentage while the NY Islanders trailed with an 89.04% save percentage. The top 5 teams were Phoenix, Boston, Anaheim, Nashville, and Montreal. The bottom 5 teams were NY Islanders, Tampa, Toronto, Chicago and Ottawa. No surprises there.

As far as sample size goes, teams on average had 7,627 shots for (or against) over the course of the 5 years, which gives us a reasonably large sample size to work with.

Now, in order to not use an extreme situation, I decided to compare the 5th best team to the 5th worst team in each category and then determine the probability that their deviations from each other are solely due to randomness.  This meant I was comparing Boston to Minnesota for shooting percentage and Montreal to Ottawa for save percentage.

TeamShootingPercentageComp

As you can see, there isn’t a lot of overlap, meaning there isn’t a large probability that luck is the reason for the difference between these two teams’ 5 year shooting percentages. In fact, the intersecting area under the two curves amounts to just a 6.2% chance that the differences are luck driven. That’s pretty small, and the differences between the teams above Boston and below Minnesota would be greater. I think we can be fairly certain that there are statistically significant differences between teams’ 5 year shooting percentages, and considering how much player movement and how many coaching changes there are over the span of 5 years, that makes it that much more impressive. Single season differences could in theory be (and probably are) more significant.

TeamSavePercentageComp

The save percentage chart provides even stronger evidence that there are non-luck factors at play. The intersecting area under the curves equates to a 2.15% chance that the differences are due to luck alone. There is easily a statistically significant difference between Ottawa’s and Montreal’s 5 year save percentages. Long-term team save percentages are not luck driven!
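For readers who want to try the overlapping-curves idea themselves, here is one way it could be sketched. It assumes each team’s percentage is approximately normally distributed with a binomial standard error over the 7,627 average shots mentioned above, then integrates the smaller of the two curves numerically. That may differ in detail from how the charts here were built. The Boston, Minnesota, Montreal and Ottawa percentages are not quoted in this post, so the inputs below are hypothetical values chosen only to match the stated 1.27 and 1.5 point gaps.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2.0 * pi))


def pct_sd(pct, shots):
    """Binomial standard error of a percentage measured over `shots` shots."""
    return sqrt(pct * (1.0 - pct) / shots)


def overlap_area(pct_a, pct_b, shots, steps=20000):
    """Area under both normal curves at once (midpoint-rule integration)."""
    sd_a, sd_b = pct_sd(pct_a, shots), pct_sd(pct_b, shots)
    widest = max(sd_a, sd_b)
    lo = min(pct_a, pct_b) - 6.0 * widest
    hi = max(pct_a, pct_b) + 6.0 * widest
    width = (hi - lo) / steps
    area = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        area += min(normal_pdf(x, pct_a, sd_a), normal_pdf(x, pct_b, sd_b)) * width
    return area


SHOTS = 7627  # average 5 year shot total per team, from the text

# Hypothetical team values (assumptions): only the gaps come from the post.
shooting_overlap = overlap_area(0.1010, 0.0883, SHOTS)  # 1.27 point gap
save_overlap = overlap_area(0.9150, 0.9000, SHOTS)      # 1.5 point gap
print(round(shooting_overlap, 4), round(save_overlap, 4))
```

The wider save percentage gap produces the smaller overlap, which is the same ordering as the two charts show.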

So, the next question is, how much does it matter? Well, the average team takes approximately 1500 5v5 ZS adjusted shots each season. The difference in shooting percentage between the 5th best team and the 5th worst team is 1.27%, so that would equate to a difference of 19 goals per year during 5v5 ZS adjusted situations. The difference between the 5th best and 5th worst team in save percentage is 1.5%, which equates to a 22.5 goal difference. These are not insignificant goal totals and they are likely driven solely by the percentages.
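The arithmetic behind those goal totals is simple enough to check in a couple of lines. The function name is just for illustration. The 1500 shots per season and the two percentage gaps come from the paragraph above.

```python
SHOTS_PER_SEASON = 1500  # approximate 5v5 ZS adjusted shots per team per season


def goal_gap(pct_gap_points, shots=SHOTS_PER_SEASON):
    """Goals per season implied by a gap in shooting or save percentage points."""
    return shots * pct_gap_points / 100.0


print(round(goal_gap(1.27), 2))  # shooting percentage gap, 5th best vs 5th worst
print(round(goal_gap(1.5), 2))   # save percentage gap, 5th best vs 5th worst
```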

Now, how does this equate to differences in shot rates? If we take the team with the 5th highest shot rate and apply a league average shooting percentage and then compare it to the team with the 5th lowest shot rate we would find a difference of 17.5 goals over the course of a single season. This is slightly lower than what we saw for shooting and save percentages.

What is interesting is that this (the percentages being more important than the shot rates) is not inconsistent with what we have seen at the individual level. In Tom Awad’s “What makes Good Players Good, Part I” post he identified 3 skills in which good players differed from bad players. He identified the variation in +/- as being 0.42 for finishing (shooting percentage), 0.08 for shot quality (shot location) and 0.30 for out shooting, which would equate to out shooting being just 37.5% of the overall difference. I also showed that fenwick shooting percentage is more important than fenwick rates by a fairly significant margin.
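The 37.5% figure is just the out shooting component divided by the sum of the three components quoted above. A quick check, with a made-up helper name:

```python
# Components of the +/- variation as quoted from Tom Awad's post.
awad_components = {
    "finishing": 0.42,
    "shot quality": 0.08,
    "out shooting": 0.30,
}


def share_of_total(name, components):
    """Percentage of the summed components accounted for by one of them."""
    return 100.0 * components[name] / sum(components.values())


for skill in awad_components:
    print(skill, round(share_of_total(skill, awad_components), 1))
```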

Any player or team evaluation that doesn’t take into account the percentages or assumes the percentages are all luck driven is an evaluation that is not telling you the complete story.

 

Jan 252013
 

The last few days I have been looking at the percentage of a team’s ice time for a given situation that a particular player is on the ice for. So for instance, what percentage of the Leafs’ 5v5 even strength ice time was Joffrey Lupul on the ice for in games in which he played? When I write a new program to calculate these numbers I need to do some testing to make sure the results are correct. The first test is always the standard sniff test. When the program runs I look at the output and ask myself “does the output make sense?”. When I first looked at the output the other day one of the numbers surprised me so much that I had to do some double checking to make sure it made sense. That number was the percentage of his team’s power play ice time that Ilya Kovalchuk was on the ice for. That number was 87.25%.

That’s insane, I thought, so off to NHL.com to check and see if it could be at all possible. I first checked and noticed that the Devils had 439:59 of PP ice time last year, including 420:36 of 5v4 ice time. Next I checked how much PP ice time Kovalchuk had last year and saw that he had 379:08 of PP time. I do not know his exact 5v4 PP ice time numbers but 379:08 is about 86% of 439:59 so my calculation of Kovalchuk being on the ice for 87.25% of his team’s PP ice time is perfectly within reason.
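That sanity check can be reproduced with a couple of helper functions. The names are made up for this sketch. The two clock values are the NHL.com totals quoted above. Keep in mind that the team total covers every Devils game, while the 87.25% figure only counts games Kovalchuk played in, so the two numbers are not expected to match exactly.

```python
def to_seconds(clock):
    """Convert an 'mmm:ss' ice time string into seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def toi_share(player_clock, team_clock):
    """Player ice time as a percentage of team ice time."""
    return 100.0 * to_seconds(player_clock) / to_seconds(team_clock)


kovalchuk_pp = "379:08"  # Kovalchuk's total PP ice time
devils_pp = "439:59"     # Devils' total PP ice time

print(round(toi_share(kovalchuk_pp, devils_pp), 2))
```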

To me this seems like a crazy high number. It means for every 2 minute penalty Kovalchuk is on the ice for 1:44 of it. That just makes me say “WOW!” but Kovalchuk is not alone in getting big PP minutes. Here are the players who have played >70% of their team’s 5v4 PP minutes (in games they played in) over the past 5 seasons.

Player 5v4 TOI%
Ilya Kovalchuk 87.25%
Alex Ovechkin 83.08%
Mike Green 76.86%
Mark Streit 75.35%
Sergei Gonchar 74.76%
Evgeni Malkin 73.83%
Sidney Crosby 73.01%
Dan Boyle 72.78%

I knew some players played a lot of PP ice time, but that still astonishes me. Oh, and for the record, in addition to being on the ice for 87.25% of his team’s 5v4 PP ice time, Kovalchuk was on the ice for 89.66% of his team’s 5v4 PP goals.

On the other end of things, over the last 5 years Willie Mitchell has played a whopping 59.2% of his team’s 4v5 PK ice time, which might actually be more impressive considering how much more demanding playing on the PK is.