Feb 09 2014
 

A recently posted article on BroadStreetHockey.com discusses overused and overrated statistics, and the first statistic on its list is plus/minus. Plus/minus has its flaws and gets wildly misused at times, but that doesn’t make it a useless statistic when used correctly, so I want to defend it a little and also put it in the same context as Corsi.

The rationale given in the BroadStreetHockey.com article for plus/minus being a bad statistic is that the top of the plus/minus list is dominated by a few teams. They list the top 10 players in +/- this season and conclude:

Now there are some good players on the list for sure, but look a little bit closer at the names on the list. The top-ten players come from a total of five teams. The top eight all come from three teams. Could it perhaps be more likely that plus/minus is more of a reflection of a team’s success than specific individuals?

Now that is a fair comment but let me present you the following table of CF% leaders as of a few days ago.

Player Name Team CF%
MUZZIN, JAKE Los_Angeles 0.614
WILLIAMS, JUSTIN Los_Angeles 0.611
KOPITAR, ANZE Los_Angeles 0.611
ERIKSSON, LOUI Boston 0.606
BERGERON, PATRICE Boston 0.605
TOFFOLI, TYLER Los_Angeles 0.595
TOEWS, JONATHAN Chicago 0.592
THORNTON, JOE San_Jose 0.591
MARCHAND, BRAD Boston 0.591
ROZSIVAL, MICHAL Chicago 0.590
TARASENKO, VLADIMIR St.Louis 0.589
KING, DWIGHT Los_Angeles 0.589
BROWN, DUSTIN Los_Angeles 0.586
DOUGHTY, DREW Los_Angeles 0.584
BURNS, BRENT San_Jose 0.583
BICKELL, BRYAN Chicago 0.582
HOSSA, MARIAN Chicago 0.581
KOIVU, MIKKO Minnesota 0.580
SAAD, BRANDON Chicago 0.579
SHARP, PATRICK Chicago 0.578
SHAW, ANDREW Chicago 0.578
SEABROOK, BRENT Chicago 0.576

Of the top 22 players, 8 are from Chicago and 7 are from Los Angeles. Do the Blackhawks and Kings have 68% of the top 22 players in the NHL? If we are tossing +/- aside because it is “more of a reflection of a team’s success than specific individuals” then we should be tossing aside Corsi as well, shouldn’t we?
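The team concentration in the table above is easy to check. Here is a minimal sketch that tallies the 22 listed players by team; the list is simply the Team column copied in rank order, and the variable names are my own.

```python
from collections import Counter

# Team of each of the 22 CF% leaders, copied in rank order from the table above.
teams = [
    "Los_Angeles", "Los_Angeles", "Los_Angeles", "Boston", "Boston",
    "Los_Angeles", "Chicago", "San_Jose", "Boston", "Chicago",
    "St.Louis", "Los_Angeles", "Los_Angeles", "Los_Angeles", "San_Jose",
    "Chicago", "Chicago", "Minnesota", "Chicago", "Chicago",
    "Chicago", "Chicago",
]

counts = Counter(teams)
top_two = counts["Chicago"] + counts["Los_Angeles"]
share = 100 * top_two / len(teams)

# Print teams from most to least represented, then the combined share.
for team, n in counts.most_common():
    print(f"{team:12s} {n}")
print(f"Chicago + Los Angeles: {top_two} of {len(teams)} ({share:.0f}%)")
```

The same tally could be run on any leaderboard, including the +/- list, to compare how team-driven each one looks.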

The problem is not that the top of the +/- list is dominated by a few teams; it is that people misinterpret what it means and don’t consider the context surrounding a player’s +/-. No matter what statistic we use, we must consider context such as quality of team, ice time, etc. Plus/minus is no different in that regard.

There are legitimate criticisms that are unique to +/-, but in general I think a lot of the criticisms, and the subsequent dismissals of +/- as having no value whatsoever, are largely unfounded. It isn’t that plus/minus is overrated or overused; it is that it is often misused and misinterpreted, and to be honest I see this happen just as much with Corsi and the majority of other “advanced” statistics. It isn’t the statistic that is the problem, it is the user of the statistic. That, unfortunately, will never change, but that shouldn’t stop those of us who know how to use these statistics properly from using them to advance our knowledge of hockey. So please, can we stop dismissing plus/minus (and other stats) as valueless just because a bunch of people frequently misuse them?

The truth is there are zero (yes, zero) statistics in hockey that can’t be, and aren’t regularly, misused and presented without context. That goes for everything from goals and point totals to Corsi to whatever zone start or quality of competition metric you like. They are all prone to being misused and misinterpreted, and more often than not they are. It is not because the statistics themselves are inherently flawed or useless; it’s because hockey analytics is hard and we are a long, long way from fully understanding all the dynamics at play. Some people are just more willing to dig deeper than others. That will never change.

 

(Note: This isn’t intended to be a critique of the Broad Street Hockey article because the gist of the article is true. The premise of the article is really about statistics needing context and I agree with this 100%. I just wish it wasn’t limited to stats like plus/minus, turnovers, blocked shots, etc. because advanced statistics are just as likely to be misused.)

 

May 01 2013
 

I brought this issue up on Twitter today because it got me thinking. Many people in hockey analytics dismiss face off winning % as a skill that has much value, but many of the same people also claim that zone starts can have a significant impact on a player’s statistics. I haven’t really delved into the statistics to investigate this, but here is what I am wondering. Consider the following two players:

Player 1: Team wins 50% of face offs when he is on the ice and he starts in the offensive zone 55% of the time.

Player 2: Team wins 55% of face offs when he is on the ice but he has neutral zone starts.

Given 1000 zone face offs the following will occur:

Outcome                  Player 1  Player 2
Win faceoff in OZone          275       275
Lose faceoff in OZone         275       225
Win faceoff in DZone          225       275
Lose faceoff in DZone         225       225

Both of these players will win the same number of offensive zone face offs and lose the same number of defensive zone face offs, which are the situations that intuitively should have the greatest impact on a player’s statistics. So, if Player 1 is going to be more significantly impacted by his zone starts than Player 2 is by his face off win %, then losing face offs in the offensive zone must still have a significant positive impact on a player’s statistics, and winning face offs in the defensive zone must still have a significant negative impact. If this is not the case, then being able to win face offs should be more or less equivalent in importance to zone starts (and this is without considering any benefit of winning neutral zone face offs).

Now, I realize that there is greater variance in zone start deployment than in face off winning percentage. But if a 55% face off win % is roughly equal to a 55% offensive zone start deployment, and a 55% face off win % has relatively little impact on a player’s statistics, then a 70% zone start deployment (four times as far from 50%) would have about four times that relatively little impact, which is still probably relatively little.

I hope to be able to investigate this further, but on the surface it seems that if face off win % is of relatively little importance, that supports my claim that zone starts have relatively little impact on a player’s statistics.
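For anyone who wants to check the arithmetic behind the two hypothetical players, here is a small sketch. The function name and its parameters are my own invention for illustration; it only splits the 1000 zone face offs by zone start share and win percentage, and it ignores neutral zone face offs.

```python
def faceoff_outcomes(total, oz_start_pct, win_pct):
    """Split zone face offs into win/loss counts by zone.

    Percentages are whole numbers (55 means 55%) so the arithmetic stays
    in integers. Neutral zone face offs are ignored.
    """
    oz = total * oz_start_pct // 100   # face offs taken in the offensive zone
    dz = total - oz                    # the rest are in the defensive zone
    win_oz = oz * win_pct // 100
    win_dz = dz * win_pct // 100
    return {
        "win_oz": win_oz,
        "lose_oz": oz - win_oz,
        "win_dz": win_dz,
        "lose_dz": dz - win_dz,
    }


# Player 1: 50% face off wins, 55% offensive zone starts.
player_1 = faceoff_outcomes(1000, oz_start_pct=55, win_pct=50)
# Player 2: 55% face off wins, neutral (50%) zone starts.
player_2 = faceoff_outcomes(1000, oz_start_pct=50, win_pct=55)

for label in ("win_oz", "lose_oz", "win_dz", "lose_dz"):
    print(f"{label:8s} {player_1[label]:4d} {player_2[label]:4d}")
```

The two players match on offensive zone wins and defensive zone losses and differ only in the other two rows, which is the whole point of the comparison.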

 

Jan 23 2013
 

One of the challenges in hockey analytics, or any type of data analysis, is how to best visualize data in a way that is exceptionally informative and yet really simple to understand. I have been working on a few things and came up with something that I think might be a useful tool to understand how a player gets utilized by his coach.

Let’s start with some background. We can get an idea of how a player is utilized by looking at when the player gets used and how frequently he gets used. Offensive players get more ice time on the power play and more ice time when their team is trailing and needs a goal. Defensive players get more ice time on the PK and when they are protecting a lead. This all makes sense, but the issue is that some teams spend more time on the PP or PK than others, while bad teams end up trailing more than good teams and leading less. This means a straight time on ice comparison between players on different teams doesn’t always accurately depict the usage of the player. If a player on the Red Wings plays the same number of minutes with the lead as a player on the Blue Jackets, it doesn’t mean the players are used in the same way. The Blue Jackets will lead games significantly less often than the Red Wings, so in the hypothetical example above the Blue Jackets are depending on their player for a higher percentage of their time with a lead than the Red Wings are on theirs.

To get around this I looked at percentages. If Player A played 500 minutes with a lead and his team played a total of 2000 minutes with a lead during games in which Player A played, then Player A’s ice time with a lead percentage would be 25%. In games in which Player A played, he was used in 25% of the team’s time leading. I can calculate these percentages for any situation, from 5v5 to 4v5 or 5v4 special teams to leading and trailing situations. The challenge is to visualize the data in a clear and understandable way. To do this I use radar charts. Let’s look at a couple of examples so you get an idea, and we’ll use players that have extreme and opposite usages: Daniel Sedin and Manny Malhotra.

For those not up to speed on my terminology, f10 is zone start adjusted ice time, which ignores the 10 seconds after a face off in either the offensive or defensive zone.
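As a rough sketch of the percentage calculation just described (the function name is hypothetical, and the only numbers used are the Player A example from the text):

```python
def usage_pct(player_minutes, team_minutes):
    """Share of the team's ice time in a situation that the player was on for.

    team_minutes should only count games in which the player dressed,
    as described in the text.
    """
    if team_minutes <= 0:
        raise ValueError("team_minutes must be positive")
    return 100.0 * player_minutes / team_minutes


# Player A: 500 minutes with a lead out of 2000 team minutes with a lead.
lead_usage = usage_pct(500, 2000)
print(f"Ice time with a lead: {lead_usage:.0f}%")
```

Repeating this for each situation (5v5, 5v4, 4v5, leading, trailing and so on) gives one value per axis of a radar chart.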

The charts above are largely driven by PP and PK ice time but players that are used more often in offensive roles will have their charts bulge to the top and top right while those in more defensive roles will have their charts bulge more to the bottom and bottom left. Also, the larger the ‘polygon’ the more ice time and more relied on the player is. In the examples above, Sedin is clearly used more often in offensive situations and clearly gets more ice time.

Let’s now look at a player who is used in a more balanced way, Zdeno Chara.

That is a chart that is representative of a big ice time player who plays in all situations. We can then take it a step further and compare players such as the following.

In normal 5v5 situations Gardiner was depended on about as much as Phaneuf, but Phaneuf was relied on a lot more on special teams and a bit more when protecting a lead. Of course, you can also compare across teams with these charts:

Phaneuf and Chara were depended on almost equally in all situations except on the PP where Phaneuf was used far more frequently.

I am not sure where I will go with these charts but I think I’ll look at them from time to time as I am sure they will be of use in certain situations and I have a few ideas as to how to expand on them to make them even more interesting/useful.

 

Jan 17 2013
 

Yesterday evening James Mirtle from the Globe and Mail posted an article on The Curious Case of Tim Connolly and the Leafs. It’s worth a read, so go read it, but the premise of the article is that the narrative around Tim Connolly in training camp is that he had a poor year last year and needs to perform better this year. That makes sense from most people’s viewpoints, but Connolly tries to present a different perspective.

Connolly can be prickly to deal with and wasn’t particularly interested in talking about last season, but when pressed, you could tell he felt he did more of value than the narrative – that he’s been an unmitigated bust in Toronto – would suggest.

Here was his answer when asked (maybe for the second or third time) about needing to “rebound” this season.

“Even strength, I think I had my second highest career points last year,” Connolly said. “I’d like to improve my play on the power play and maybe play a bigger role. Penalty killing, I think, my individual percentage was 89 per cent I read somewhere. I was able to lead the forwards in blocked shots.”

He makes two points in there. The first is that he had his second highest even strength point total last year, and the second is something about an individual percentage of 89 per cent. Let’s deal with the first one first by looking at his even strength points since the first lockout.

Season Goals Assists Points
2011-12 11 20 31
2010-11 7 16 23
2009-10 9 27 36
2008-09 12 16 28
2007-08 3 20 23
2005-06 9 20 29

(Note: Connolly only played 2 games in 2006-07 so I have omitted it from the table and discussion)

Tim Connolly is actually correct.  His best even strength point total came in 2009-10 when he had 36 points followed by his 31 even strength points last year.  But let’s take a look at those point totals relative to even strength ice time.

Season ESTOI Points TOI/Pt
2011-12 940:12 31 30:20
2010-11 840:31 23 36:33
2009-10 966:41 36 26:51
2008-09 631:26 28 22:33
2007-08 603:18 23 26:14
2005-06 708:47 29 24:26

The last column is time on ice per point, or time on ice between points.  Last year he was on the ice for an average of 30 minutes and 20 seconds between each of his even strength points. This was his second worst since the locked out season. So, while Connolly was technically correct in saying that he had his second highest even strength point total last season, it was a somewhat misleading representation of his performance.
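The TOI/Pt column can be reproduced from the other two columns. Here is a minimal sketch, with helper names of my own choosing, that parses the mm:ss ice times from the table, divides by points, and rounds to the nearest second.

```python
def parse_mmss(text):
    """Convert a 'minutes:seconds' string such as '940:12' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def toi_per_point(toi, points):
    """Average ice time between points, formatted as m:ss."""
    seconds = round(parse_mmss(toi) / points)
    return f"{seconds // 60}:{seconds % 60:02d}"


# (season, even strength TOI, even strength points) from the table above.
seasons = [
    ("2011-12", "940:12", 31),
    ("2010-11", "840:31", 23),
    ("2009-10", "966:41", 36),
    ("2008-09", "631:26", 28),
    ("2007-08", "603:18", 23),
    ("2005-06", "708:47", 29),
]

# Rank seasons from most to least ice time needed per point (worst first).
worst_first = sorted(seasons, key=lambda s: parse_mmss(s[1]) / s[2], reverse=True)

for season, toi, points in seasons:
    print(season, toi_per_point(toi, points))
print("Second worst season:", worst_first[1][0])
```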

Now for the individual PK percent. It generated a bit of twitter conversation last night questioning what it actually is.

One might think it is the penalty kill percentage when he was on the ice, but that seems like a strange thing to calculate. Is it goals per 2 minutes of PK time? Is it goals per PK he spent any amount of time killing? I really didn’t know, so I dug deeper into the numbers by looking at the Leafs’ PK percentages on my stats site and noticed that Connolly had the best on-ice save percentage (listed as lowest opposition shooting percentage) of any Leaf last season during 4v5 play, and that the save percentage while he was on the ice was just shy of 89% (88.68%). It seems that maybe what Connolly meant to say was that he had an on-ice PK save percentage of 89%.

How good is an 89% save percentage on the PK? Well, of the 100 forwards with at least 100 4v5 minutes of ice time last year, Connolly ranks 42nd in the league, so league-wide it isn’t that impressive, but considering the Leafs’ weak goaltending it might actually be fairly good.

Here is the thing though. Single season PK save percentage is so fraught with sample size issues that it is next to useless as a stat for goalies let alone forwards.

One could evaluate Connolly based on PK goals against rate, in which he came up 3rd on the Leafs (trailing Lombardi and Kulemin), but that is still fraught with sample size issues. More fairly, we probably should evaluate Connolly’s PK contribution based on shots against rate, or maybe even more fairly on Fenwick or Corsi against rates. In each of those categories he ranked 5th among Leafs with at least 50 minutes of 4v5 ice time, with only Joey Crabb being worse. Furthermore, among the 110 players with 100 minutes of 4v5 PK ice time last year, Connolly ranked 99th in Fenwick against rate.

I don’t mean for this article to be a Connolly bashing article. I actually do think Connolly was a little misused and would probably do better with a more well defined role, rather than being bounced around the lineup so much, so in that sense I agree with the premise of what Connolly is saying. With that said, it probably is fair to say that he didn’t have a great season, and if he wants a regular role in the top six with time on the PP and PK he needs to perform better. His use of stats to attempt to show he had a good season is really just evidence of how statistics can be misused to support almost any narrative you want. As they say, there are lies, damn lies, and then there are statistics.

 

Aug 21 2011
 

I have just updated my stats site (stats.hockeyanalysis.com) to include a number of new features.  The added features are:

1.  I have added a new situation – 5v5close.  5v5close is when the game is tied or within 1 goal in the first and second period or tied in the third period.  This is what I would call normal play where teams are more or less (depending on talent or game play/coaching style) equally interested in  playing offense or defense.  When teams get a larger lead or lead late in the game teams adjust their style of play to either protect that lead or go all out to score a goal to catch up.  It is probably better to use this than 5v5tied and maybe better than 5v5 (all 5v5 game score situations).

2.  I have included zone start data in the form of OZOF%, DZOF% and NZOF%.  OZOF% is the percentage of face offs taken in the offensive zone when the player is on the ice, and DZOF% and NZOF% are the same for defensive zone and neutral zone face offs.  When we look at these by situation we can get an idea of how a player’s use changes with the game score.  For example, last year Manny Malhotra had 38.8% of his 5v5 face offs in the defensive zone (29.1% offensive zone and 32.1% neutral zone), but when the Canucks were up by a goal his defensive zone face offs rose to 41.6%, and when the Canucks were up by 2 goals they rose to 48.4%.

3.  I have once again put up with/against statistics for each player.  I had this data up a few years ago but removed it when I redesigned my website; it is now back.  Each player page (e.g. the Malhotra one linked to above) has a set of links at the top of the page to with/against statistics for each season (and multi-seasons) for 5v5 and 5v5 close situations, for both goal and Corsi data.  Each page shows how the player played with each teammate and how each of them played when apart, as well as how the player performed against each opponent and how well the player and the opponent performed when not playing against each other.  These tables can give you an indication of which players are playing together and which players play well together, as well as who a player plays against the most.  As an example, take a look at Manny Malhotra’s 5v5 goal with/against data for this past season and you will see he played the most with Raffi Torres (even more than with Roberto Luongo!), but it seems both players had better on-ice results when apart.

4.  If you hadn’t noticed yet, a while back I added on-ice shooting percentage (Sh%) and on-ice opposition shooting percentage (OppSh%; subtract from 100% to get on-ice save %), which can be found with the goal data (but not with Corsi, Fenwick and shot data).

All told, there are well over 10 gigabytes of HTML, PHP and database files of statistics (90% of which are in the with/against tables), so be warned: if you really wanted to, you could spend days looking at it all.
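To make a few of the definitions in the list above concrete, here is a rough sketch. The function names are hypothetical and this is not the code behind the stats site. It covers the 5v5close rule, the three zone start percentages, and on-ice save % from OppSh%. Overtime is not covered by the definition above, so the sketch simply refuses any period other than 1 to 3, and the face off counts in the example are made up so that they reproduce Malhotra’s published 5v5 percentages.

```python
def is_5v5_close(period, goal_diff):
    """True when the score state counts as 'close'.

    Close means tied or within 1 goal in periods 1 and 2, or tied in
    period 3. Other periods are not defined in the text, so they raise.
    """
    if period in (1, 2):
        return abs(goal_diff) <= 1
    if period == 3:
        return goal_diff == 0
    raise ValueError("5v5close is only defined here for periods 1 to 3")


def zone_start_pcts(oz, dz, nz):
    """Return (OZOF%, DZOF%, NZOF%) from on-ice face off counts by zone."""
    total = oz + dz + nz
    return tuple(round(100 * n / total, 1) for n in (oz, dz, nz))


def on_ice_save_pct(opp_sh_pct):
    """On-ice save % is 100% minus the opposition shooting %."""
    return 100.0 - opp_sh_pct


# Hypothetical counts chosen only to match the published percentages.
ozof, dzof, nzof = zone_start_pcts(oz=291, dz=388, nz=321)
print(ozof, dzof, nzof)
print(is_5v5_close(period=2, goal_diff=-1), is_5v5_close(period=3, goal_diff=1))
```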

Apr 18 2011
 

By all accounts, Corey Perry had an exceptional season in 2010-11, and this is particularly true down the stretch, when he flew by Steven Stamkos for the lead in goals scored and pushed himself into serious contention for the Hart Trophy as the league’s most valuable player.  There is no doubt that Perry’s production level surpassed anything he had previously done in his career, but was he truly more valuable to the Ducks than in previous seasons?  Let’s look at the numbers.

Season GP Goals Assists Points +/- PPG PPA PP Points
2010-11 82 50 48 98 9 14 17 31
2009-10 82 27 49 76 0 6 17 23
2008-09 78 32 40 72 10 10 14 24
2007-08 70 29 25 54 12 11 6 17

Based on the raw stats, he has been better in 2010-11 in terms of goal scoring and fairly consistent in terms of collecting assists but despite his increase in goals and points, his +/- hasn’t increased significantly.

Let’s look a little deeper into Perry’s even strength 5v5 statistics.

Season GF20 GA20 GF% TMGF20 TMGA20 TMGF% OppGF20 OppGA20 OppGF%
2010-11 0.928 0.882 0.513 0.876 0.843 0.510 0.774 0.745 0.509
2009-10 1.047 0.828 0.558 0.694 0.807 0.463 0.776 0.759 0.505
2008-09 1.113 0.754 0.596 0.712 0.775 0.479 0.756 0.751 0.501
2007-08 1.003 0.683 0.595 0.674 0.551 0.550 0.724 0.725 0.500

(source:  http://stats.hockeyanalysis.com/showplayer.php?pid=2)

For those unfamiliar with my terminology, GF20 is goals for by Perry’s team while he is on the ice per 20 minutes of ice time, GA20 is the same for goals against, and GF% is GF20/(GF20+GA20), which represents what percentage of all goals scored while he was on the ice were scored by his team.  The TM stats are the same but for his teammates when they are not playing with Perry, and the Opp stats are the same but for Perry’s opponents when they are not playing against Perry.

Now, the first observation you may make is that Perry’s GF20 was lower in 2010-11 than in any of the previous seasons, so while Perry produced more offense (goals in particular) individually in 2010-11, the team produced somewhat less when Perry was on the ice.  In other words, Perry’s goal/point production may have come at the cost of his linemates’ goal/point production.  The same thing is true defensively.  More goals were scored against the Ducks per 20 minutes while Perry was on the ice than in any previous season.
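The GF% formula is simple enough to check against the table. Here is a short sketch (the function name is my own) that recomputes Perry’s on-ice GF% from his GF20 and GA20 values.

```python
def gf_pct(gf20, ga20):
    """Share of on-ice goals scored by the player's team: GF20 / (GF20 + GA20)."""
    return gf20 / (gf20 + ga20)


# (season, GF20, GA20) for Perry's own on-ice 5v5 numbers from the table above.
perry = [
    ("2010-11", 0.928, 0.882),
    ("2009-10", 1.047, 0.828),
    ("2008-09", 1.113, 0.754),
    ("2007-08", 1.003, 0.683),
]

for season, gf20, ga20 in perry:
    print(season, f"{gf_pct(gf20, ga20):.3f}")
```

The TM and Opp columns follow the same formula, though recomputing them from the rounded rates shown in the table can differ from the published value in the last decimal place.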

Now, looking at team mate production when his teammates are not on the ice with Perry we find that they produce slightly fewer goals per 20 minutes (0.876 without Perry vs 0.928 with) but also give up slightly fewer goals too (0.843 without Perry, 0.882 with).  What is interesting though is Perry’s line mates this season appear to be better offensive players than in prior seasons as their 2010-11 GF20 was 0.876 vs 0.694 in 2009-10 though they also had a slightly higher GA20 in 2010-11 as well.  So from these numbers it seems that overall Perry played with significantly better offensive players in 2010-11 than in prior years and slightly worse defensive players in 2010-11 than in prior years.

As for quality of opposition, the offensive production of Perry’s opponents in 2010-11 was about the same as in 2009-10 while defensively they were slightly better.

So, in summary, we can state that when Perry was on the ice in 5v5 even strength situations the Ducks produced less in 2010-11 than they did in 2009-10 and gave up more goals in 2010-11 than they did in 2009-10.  Furthermore, overall his linemates appear to have been significantly better offensive players in 2010-11 than in 2009-10 and only slightly worse defensive players, while his opposition appears to have been similarly skilled offensively and marginally better defensively.

So, what does this all mean?  Here are Perry’s offensive and defensive ratings:

Season HARO+ HARD+ HART+
2010-11 1.164 0.852 1.008
2009-10 1.300 0.917 1.109
2008-09 1.391 0.953 1.172
2007-08 1.325 0.979 1.152

All things considered, despite his scoring 50 goals this past season, one could make an argument that Perry’s 2010-11 performance was well below that of the three previous seasons.  It seems that his improved individual numbers may have come at the cost of his teammates, and that made him less valuable to the Ducks overall.

Oct 20 2010
 

Gus Katsaros posted some NHL overtime statistics on his Twitter account this morning, which got me digging into the stats a little more.

KatsHockey > Overtime on the other hand is at #NHL record setting pace of 208 games .. in past week (7 days) have been 8 OT games

KatsHockey > Only six shootouts in #NHL thus far & only two shootout games in past 42 games one week ago .. pace has dipped to post-lockout low of 96

If that wasn’t interesting in itself, there have been 20 overtimes this year.  In the first 11 overtimes there were 3 power plays awarded and no overtime game winning power play goals were recorded.  One of those three power plays was given with just 7 seconds left in the OT, so really there were only 2 full power plays in the first 11 overtimes.  Contrast that to the past 9 overtimes, in which 6 power plays were awarded and 5 overtime power play game winning goals were scored.  The power play that did not result in a goal was awarded with just 16 seconds left.

It is probably just a coincidence that 3 of the first 11 overtime games had an overtime power play and 6 of the next 9 did, but it isn’t beyond the realm of possibility that the NHL, in an attempt to reduce the number of shootouts, issued a notice to the referees not to let up in calling penalties in overtime.  Four of the first 11 overtime games went to a shootout while 2 of the following 9 did.  If I get an opportunity I’ll dig a little deeper and compare what we have seen so far this season with past years’ data.

The other thing I pondered was related to penalties taken late in the overtime.  As it is right now, there really isn’t much to dissuade players from taking penalties very late in the overtime, especially if they are facing any kind of pressure in the defensive zone from the opposition.  If there are only 10 seconds left in overtime, the risk/reward equation of taking a penalty to take away even a mediocre chance to score by the opposition probably leans towards taking the penalty.  There have been two such cases where penalties have been called with 16 or fewer seconds left in overtime this year.  I generally don’t like it when it is actually beneficial for a team to take a penalty (one of the reasons I don’t particularly like basketball), so a minor rule tweak that might be worth considering is that all overtime power plays must last at least one minute unless the power play gets cancelled out by an offsetting penalty against the team with the man advantage.  So, for example, if someone takes a hooking penalty late in the overtime, the overtime period would be extended until that player has served 1 minute of his penalty.  If a penalty was called at 4:45 of the overtime, the overtime would extend until 5:45 had been played, a goal was scored, or the team with the man advantage took an offsetting penalty, whichever occurred first.  This would also reduce the number of games that go to a shootout.  I don’t necessarily see this being implemented, but it is an interesting concept nonetheless (and I suppose something similar could be implemented for the end of regulation time in one goal games).
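To show how the proposed minimum one minute overtime power play would change the length of the period, here is a small sketch. The function and parameter names are hypothetical, the 5 minute overtime length is taken from the 4:45 to 5:45 example above, and the early endings (a goal or an offsetting penalty) are deliberately not modelled.

```python
def overtime_end(penalty_time_s, ot_length_s=300, min_power_play_s=60):
    """Latest time (in seconds) overtime could run under the proposed rule.

    Overtime lasts its normal length unless a penalty is called so late
    that less than the minimum power play time would remain, in which
    case the period is extended to cover it. A goal or an offsetting
    penalty would end things sooner; that is not modelled here.
    """
    return max(ot_length_s, penalty_time_s + min_power_play_s)


def fmt(seconds):
    """Format seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


# Penalty called at 4:45 (285 seconds) of a 5 minute overtime.
late_penalty = overtime_end(penalty_time_s=4 * 60 + 45)
# Penalty called at 2:00 leaves a full power play, so nothing changes.
early_penalty = overtime_end(penalty_time_s=2 * 60)

print(fmt(late_penalty), fmt(early_penalty))
```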