Feb 05, 2012
 

One of my beefs with the analysis and evaluation of hockey players is the notion that PDO (on-ice shooting percentage plus on-ice save percentage) can be used as a proxy for luck. A perfect example of PDO being used this way is this article by Neil Greenberg about the Washington Capitals.

For example, when Alex Ovechkin has been on the ice during even strength this season, the team has a shooting percentage of 8.2 percent and has saved shots at a rate of .917. So that makes his PDO value 999 (.082+.917=.999), which is almost exactly the league average. In other words, Ovechkin has seen neither very good nor very bad “puck luck” this season.

What’s useful about this metric is that it’s “unstable,” and over a large-enough sample will regress to 1000. Why 1000? Because every shot that is a goal is a shot not saved, and vice versa.
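To make the arithmetic in that quote concrete, here is a minimal sketch in Python. The `pdo` helper and the two-team totals are made up for illustration; only the Ovechkin figures come from the quote. It shows why the league-wide average lands on 1000 by construction, which is a separate question from whether any one player's PDO should sit there.

```python
def pdo(on_ice_sh_pct: float, on_ice_sv_pct: float) -> float:
    """PDO on the usual 1000 scale: on-ice shooting % plus on-ice save %."""
    return (on_ice_sh_pct + on_ice_sv_pct) * 1000

# Ovechkin's even-strength figures as quoted above.
ovechkin_pdo = pdo(0.082, 0.917)
print(f"Ovechkin: {ovechkin_pdo:.0f}")

# Hypothetical two-team league (made-up totals). Each team's shots
# against are simply the other team's shots for.
shots_a, goals_a = 900, 81
shots_b, goals_b = 850, 60
pdo_a = pdo(goals_a / shots_a, 1 - goals_b / shots_b)
pdo_b = pdo(goals_b / shots_b, 1 - goals_a / shots_a)
league_average = (pdo_a + pdo_b) / 2

print(f"Team A: {pdo_a:.1f}, Team B: {pdo_b:.1f}, average: {league_average:.1f}")
```

The two teams average out to 1000 because every goal for one side is a save missed by the other, yet neither team is individually at 1000.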

My beef with such an analysis is the notion that PDO regresses to 1000 for all players, so that any player with a PDO above 1000 is lucky and any player with a PDO below 1000 is unlucky. While I do believe luck can influence PDO over small sample sizes, not all players have a natural PDO level of 1000, and there are two reasons why.

1.  Not all players play in front of perfectly average goalies, which has a major impact on the save percentage portion of PDO.

2. Players can drive shooting percentages.
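Before getting to point 2, it helps to have a feel for how much luck alone can move PDO over a small sample. The sketch below is a rough back-of-the-envelope model, not something measured from the data: it assumes every shot is an independent coin flip, and the shot counts and the 8% / 91.2% rates are hypothetical round numbers chosen only for illustration.

```python
import math

def pdo_std_dev(sh_pct: float, sv_pct: float,
                shots_for: int, shots_against: int) -> float:
    """Standard deviation of PDO (1000 scale) from binomial noise alone.

    Assumes each shot for and against is an independent trial, which is
    a simplification of real hockey.
    """
    variance = (sh_pct * (1 - sh_pct) / shots_for
                + sv_pct * (1 - sv_pct) / shots_against)
    return 1000 * math.sqrt(variance)

# Hypothetical on-ice shot totals (same number for and against).
for shots in (100, 400, 1600):
    sd = pdo_std_dev(0.08, 0.912, shots, shots)
    print(f"{shots:>5} shots each way: PDO std dev of about {sd:.1f} points")
```

Under these assumptions the noise shrinks with the square root of the shot count, so it takes four times as many shots to halve it.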

To show you what I mean on point 2, I took 4 years (2007-08 to 2010-11) of 5v5 zone start adjusted data and grouped forwards based on their ice time over those 4 years and then calculated the on-ice shooting and save percentages and PDO for each group.  Here is what I found.

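| TOI (minutes) | SH%   | SV%   | PDO    |
|---------------|-------|-------|--------|
| <500          | 7.5%  | 90.9% | 983.5  |
| 500-999       | 7.9%  | 91.2% | 991.2  |
| 1000-1499     | 8.0%  | 91.2% | 992.2  |
| 1500-1999     | 8.2%  | 91.2% | 993.4  |
| 2000-2499     | 8.6%  | 91.1% | 997.0  |
| 2500-2999     | 9.0%  | 91.2% | 1001.9 |
| 3000-3499     | 9.3%  | 91.2% | 1004.4 |
| 3500-4000     | 9.8%  | 90.8% | 1006.1 |
| 4000+         | 10.4% | 90.8% | 1012.4 |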

PDO varies from 983.5 up to 1012.4 depending on the group’s ice time. This is largely driven by shooting percentage, which rises from 7.5% for the players with the least ice time to 10.4% for the players with the most. Order is the enemy of luck, so seeing shooting percentages ordered this nicely tells me something other than luck is happening. Driving on-ice shooting percentage is a skill. This means more talented players can have a natural PDO (the PDO that they should regress to) above 1000 and less talented players can have a natural PDO below 1000. Factor in the goaltending and a player could have a natural PDO well above or well below 1000.
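For anyone who wants to check the ordering claim, here is a small sketch using the numbers from the table above. The variable names are my own, and the "1 in 9!" framing is a crude assumption: it treats the nine group values as interchangeable random draws, which ignores that the groups differ in size.

```python
import math

# (lower TOI bound in minutes, on-ice SH%, on-ice SV%, PDO) from the table.
groups = [
    (0,    7.5,  90.9, 983.5),
    (500,  7.9,  91.2, 991.2),
    (1000, 8.0,  91.2, 992.2),
    (1500, 8.2,  91.2, 993.4),
    (2000, 8.6,  91.1, 997.0),
    (2500, 9.0,  91.2, 1001.9),
    (3000, 9.3,  91.2, 1004.4),
    (3500, 9.8,  90.8, 1006.1),
    (4000, 10.4, 90.8, 1012.4),
]

sh = [g[1] for g in groups]
sv = [g[2] for g in groups]

# Is shooting percentage strictly increasing with ice time?
sh_is_ordered = all(a < b for a, b in zip(sh, sh[1:]))

# Under the crude "pure luck" framing, only one of the 9! possible
# orderings of nine distinct values is strictly increasing.
possible_orderings = math.factorial(len(sh))

# Sanity check: PDO recomputed from the rounded percentages should agree
# with the published PDO to within rounding error (about 1 point).
max_gap = max(abs((s + v) * 10 - p) for _, s, v, p in groups)

print(f"SH% strictly increasing: {sh_is_ordered}")
print(f"SV% range: {min(sv)} to {max(sv)}")
print(f"Chance of this ordering if purely random: 1 in {possible_orderings}")
```

Save percentage barely moves across the groups, which is why the PDO trend is almost entirely a shooting percentage trend.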

Now, this is not to say that luck isn’t a factor in a player’s PDO, especially over small sample sizes; it’s just that we can’t estimate that luck by assuming every player’s natural “regress to” PDO is 1000. Daniel Sedin has a PDO of 1043 this season (through Thursday February 2nd). Is it fair to suggest he has been lucky and should see his PDO regress to 1000? When you consider that his 4-year PDO is 1035 (and his 3-year PDO is 1054), probably not. His natural “regress to” PDO is probably not that far off his current 1043 PDO. Now if you are talking about Todd Bertuzzi this season, it’s a different story. Through Thursday he had a PDO of 1056, while his 4-year PDO is 994 and he hasn’t had a PDO above 1000 in any of the previous 3 seasons. It is probably fair to presume that Bertuzzi’s natural “regress to” PDO is much closer to 1000, maybe even below 1000, in which case it is fair to conclude that Bertuzzi has probably been quite lucky so far this season and is unlikely to continue at this pace for the remainder of the season.
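The comparison can be sketched in a few lines. This is only an illustration of the idea, using the PDO figures quoted above; treating the 4-year PDO as the player's "regress to" level is my working assumption here, not an established estimate.

```python
# Current-season and 4-year PDO figures quoted in the paragraph above.
players = {
    "Daniel Sedin": {"current": 1043, "four_year": 1035},
    "Todd Bertuzzi": {"current": 1056, "four_year": 994},
}

def pdo_gaps(current: int, baseline: int) -> tuple[int, int]:
    """Return (gap vs. the league-wide 1000, gap vs. the player's own baseline)."""
    return current - 1000, current - baseline

gaps = {name: pdo_gaps(p["current"], p["four_year"])
        for name, p in players.items()}

for name, (vs_league, vs_own) in gaps.items():
    print(f"{name}: {vs_league:+d} vs. 1000, {vs_own:+d} vs. own 4-year PDO")
```

Measured against 1000 the two players look similarly "lucky", but measured against their own history only Bertuzzi stands out.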

When used properly, PDO can be an indication of luck, but to do so we need to consider the context of a player’s PDO, not just assume all players’ PDOs will necessarily regress to 1000.

 

  16 Responses to “Thoughts on PDO and Luck”

  1.  

    On any list of must-read stories about advanced stats in the NHL, this post should be high on the list.

    Sound and excellent analysis.

  2.  

    No matter how many times it’s been pointed out to you, you still insist on misusing the phrase “players can drive shooting percentages.” No one (certainly no one who has any numerical literacy) argues that all players shoot at exactly the same percentage. All you have to do is look up various players’ stats to see that. What people mean when they argue that players don’t “drive SH%” is that players don’t make *other* players shoot better.

    If you put three career 11% shooters on a line and they all continue to shoot at 11%, no one is driving the shooting percentage, they’re all just performing at their true talent level. Now, if you put three career 11% shooters on a line together and two of them became 15% shooters (over a large enough sample size), then *that* would be evidence of players “driving shooting percentages”.

    You continue to construct strawman arguments to knock down that few if any people actually believe.

    •  

      I have had a number of discussions with people who insist that variation in shooting percentage is largely due to randomness and luck. Prior to the season Gabe Desjardins vehemently denied that 10 chosen NHLers could sustain elevated on-ice shooting percentages and insisted they would regress to the mean. The whole idea that over time PDO regresses to 1000 is based on the wrong belief that shooting percentages are largely random and will regress to a mean (see Neil Greenberg’s post on PDO of the Capitals). The whole basis behind using Corsi over goal-based stats is that shooting percentages are largely luck driven and not persistent. Maybe I shouldn’t use the terminology “drive shooting percentage” and should use “able to sustain an elevated on-ice shooting percentage” instead, but to suggest people readily accept that elevated shooting percentages are sustainable and account for that in their player evaluation methodologies is simply not true. If they did, there would be a lot less Corsi analysis and “PDO is a proxy for luck” analysis going on.

  3.  

    David has cause and effect wrong (for the most part). Situations can dictate shooting percentages. Players who play in offensive situations get more high quality shots. Players in defensive situations do not attempt to get high quality shots and thus take low quality ones. And players tend to play similar roles over many years.

    The analogy I have tried to use to explain it to him is wind. Wind causes trees to move. David is arguing that trees moving cause wind. He wastes a lot of time measuring that it is windier near his apple tree than his pear tree thinking that proves his point. All it proves is that his apple tree is in a windier location. It didn’t cause the wind.

    The idea that PDO regresses to 1000 is clear. Obviously that number isn’t exactly 1000 for all players, but it is a very good rule of thumb. Just like the rule of thumb that all the trees in your yard get the same weather.

    The Ovechkin point made by Neil Greenberg is a valid one. He is showing that Ovechkin’s lack of scoring this year is not explainable by bad luck. He has a valid point and PDO shows it well.

    Players do not drive the shooting percentage of their teammates and opponents who are on the ice while they are. At least not to a very significant level.

    Stop wasting your time on this point.

    You found that players who get hardly any ice time are not used in offensive situations and hence do not get high percentage shots, while players who get lots of ice time often do play in a significant number of offensive situations. The situation drives the shooting percentage.

    •  

      Your problem is that you are trying to explain why something happens as a reason to ignore that it does happen.

      By that I mean you are suggesting that we should ignore that players have elevated on-ice shooting percentages because they only got elevated on-ice shooting percentages because they tried to get elevated on-ice shooting percentages and we shouldn’t credit players for trying.

      Oh, BTW, the Red Wings try and play a puck control game. That is the style of game the coach asks them to play. Should we ignore their strong corsi ratings because they tried to get a strong corsi rating?

      •  

        I have the same frustrations with the Corsi religion as you do. It’s a very good idea that someone at some point decided to take much further than the concept could bear.

        Of course PDOs shouldn’t regress to 1000. We should suspect this because if they put me out there, I would shoot 0% and probably give a pretty generous save percentage discount too. Being as there aren’t that many hockey players in the world, are we not right to expect that we should see a difference among the top 1000 of them?

        “Play Hockey. With A Little Luck You’ll Be Above Average.”

  4.  

    David, you miss the point as usual. It is the circumstance and not the player that drives the shooting percentage. So the only logical thing to do is to not credit the player and look at the circumstance.

    •  

      Again, Detroit coaches stress puck control. Should we credit Detroit players for playing on a team that stresses puck control?

      I understand perfectly what you are saying. I just think it is complete and utter nonsense. Would Travis Moen’s (or even Scott Gomez’s) on-ice shooting percentage jump from bottom of the league to top of the league if his coaches just asked him to play more offensively? When you can prove that to me I’ll accept what you are saying, until then you just look like some idiot arguing that Scott Gomez is as good offensively as Sidney Crosby if only he played under more favourable wind conditions.

  5.  

    Detroit is a non sequitur. The fact that you use them so often is why you are treated as such a pariah on the internet. You would be well advised not to argue unrelated facts with questionable, if any, relation to the topic in question.

    Travis Moen or Scott Gomez would likely have a closer to average team shooting percentage when they are playing on the ice if they were to play a more offensive role. Of course Moen especially would not likely be too successful a player in that role regardless of his team’s shooting percentage.

    Nobody is arguing that Scott Gomez and Sidney Crosby are equals offensively – except you so you can argue against it. That is another reason you are treated as such a pariah on the internet. You misrepresent everyone else’s arguments. You would be well advised to stop that.

    •  

      Actually, I think this is the first time I have brought up the Red Wings in about 2 years.

      “Travis Moen or Scott Gomez would likely have a closer to average team shooting percentage when they are playing on the ice if they were to play a more offensive role.”

      ‘Average’ is still a long way from where Sidney Crosby is. Oh, and Gomez generally speaking has played an offensive role on his team.

      “Nobody is arguing that Scott Gomez and Sidney Crosby are equals offensively – except you so you can argue against it.”

      You indirectly are but you won’t admit it. Crosby’s 4 year FenF% is 0.534 and Gomez’s is 0.534. Perfectly identical. Gomez has more OZone starts too so that would seem to indicate over the prior four years that Gomez was given a more offensive role. So how do I make Crosby look better than Gomez using corsi stats?

      You are making a ton of theoretical arguments that don’t apply in practice, without any numbers to back them up.

  6.  

    I don’t know if David is right, but I do know Corsi plus/minus — even factoring in ZoneStarts and Qual Comp — is not a particularly adept way to measure the two-way play of hockey players.

    Corsi is an OK proxy for scoring chances, I agree with that notion.

    But even if we know who is out on the ice for scoring chances for and against, if we’re to be fair and accurate to the players we evaluate, it’s crucial to know who was actually involved in creating these scoring chances for and in making mistakes on chances against.

    It’s not right to blame a dman for a scoring chance against if he made no mistake on the play. Nor is it right to give credit to a forward if he had no impact on creating a scoring chance for his team.

    But that’s typically what is done when people apply Corsi/scoring chance team metrics to individual players. For some reason it’s assumed that players really deserve all those plus marks and minus marks, and that this is solid data, when it’s not really close to that.

    Wingers, for example, have fewer defensive responsibilities and more offensive responsibilities. That means it’s far more likely they will be involved in chances for. They very often deserve those plus marks on scoring chances for.

    But they very often don’t deserve the minus marks on chances against.

    Yet when team Corsi and scoring chance numbers are applied to individual players, the individual gets a plus for all chances for and a minus for all chances against no matter what they did on the play.

    The stat is utterly riddled with false negatives and false positives.

    And, no, it doesn’t always “even out” over a full year of play. Why should it?

    There are cases of good players playing with crappy linemates pretty much all season long, and also cases of bad players playing with good linemates pretty much all season long.

    So while Corsi plus/minus might often get it right on players, it’s the exceptions that really cause problems if you’re relying on it to rate individuals (as many bloggers do and maybe even some NHL GMs do, for all I know).

    Essentially if a winger is out for a lot of shots at net against — with ZoneStarts and QualComp being equal — what his Corsi-minus tells us is mainly whether he’s out with a weak center and defencemen.

    Try making NHL trades based on Corsi, factoring in all the other stats you want. You will make a small number of terrible trades based on Corsi plus/minus, because it’s not a reliable indicator of two-way play.

    So if bloggers want to keep using Corsi plus/minus to rate the two-way play of players, they’re free to do so, but don’t expect the majority of the hockey world to embrace these numbers.

    At best, it’s somewhat unfair and inaccurate to apply a team plus/minus number to an individual player. It makes me uneasy.

    •  

      I understand where you are coming from, David. It is always questionable drawing conclusions about individual players from team stats, which is what makes hockey so much more difficult to analyze statistically than baseball. This is why I like to look at whether a particular player makes his teammates better (and opponents worse) when they are on the ice together compared to when they are apart. If all of Luke Schenn’s defensive partners have poorer “on-ice” defensive stats when they are playing with Schenn than when they are not, then I do think we can draw some conclusions about Schenn’s abilities. If Schenn were a good player, he should be able to make his teammates’ on-ice stats better (and opponents’ worse) when they are on the ice together.

      “Essentially if a winger is out for a lot of shots at net against — with ZoneStarts and QualComp being equal — what his Corsi-minus tells us is mainly whether he’s out with a weak center and defencemen.”

      Maybe. But it might also mean that he is woefully bad at winning the puck battles in the corners in the offensive zone resulting in the opposition being able to recover more pucks and transition to offense easier and more frequently. I don’t think we can reasonably and fairly assign blame for poor corsi or poor goal stats because there is so much happening on the ice which may not be apparent if we just look at the direct cause of the shot or goal.

  7.  

    Honestly, I have zero idea why people argue that PDO should regress to 1000 for a given team, let alone for an individual player. Definitionally, league wide it must average to 1000 but that doesn’t indicate a reason that for a given player, line combination or team it should. It’s called a distribution because your observations can and will be distributed about your measure of central tendency (mean, median). The fact that you can find a variable that could probably regress and have good fit with PDO likely indicates your error term is rather lower than your explanatory variable. I can run the numbers if you give me the raw data. I tend to doubt that TOI is your explanatory var, it’s confounded with something else but if you have an independent that can account for that much variance your observations aren’t noise there’s just something that’s not been covered.

    •  

      TOI certainly isn’t the explanatory variable but the fact that shooting percentage correlated nicely with TOI tells me that there is something more than luck at play. Luck and randomness won’t result in order like that. TOI correlates with shooting percentage because coaches dole out ice time based on the players doing good things and players who can produce a high shooting percentage are doing good things.

      Send me an e-mail to david (at) hockeyanalysis.com and we can discuss. I can provide you with a bunch of data if you want to dig in and get your hands dirty.

  8.  

    Great points David.

    Crosby is the best example. It is generally agreed that he is the best player in the game (his salary reflects that). He also regularly has a SH% well above the norm. Instinctively, someone who watches a lot of hockey knows Crosby does things better than the average player. Unfortunately, we have yet to identify and quantify this skill that he possesses. We should be focusing our efforts on identifying this skill. I believe the Penguins are one of a few teams working in this area, specifically with expected scoring from certain areas on the ice.

    The ‘puck stops here’ ignorantly proclaims that this is because he and other top players are in favorable ‘offensive situations’. This is ridiculous – to a star player any situation can be an offensive situation. And the converse is true: three Travis Moens on the power play won’t result in average scoring. That’s what top players do – they do ‘create’ better scoring opportunities or, in your words, push other players’ %’s.

    I have tried to make the same arguments regarding Vancouver’s Cody Hodgson. Many ‘advanced metric’ writers have said it was a good trade because Vancouver is selling Hodgson high, based on a 63-game sample and because of his high PDO, low Corsi, low Q of C, high ozone starts, etc. I like your idea of using at least two seasons for comparisons.

    It is way too soon to say that Hodgson could not be a player that has a higher PDO. The truth is we don’t know yet!

    However, the main point philosophically is that if there are players who can have higher PDOs, then when a young player comes up who starts to potentially show this skill, we should make damn sure not to let him go.

    The irony here is that when you compare Hodgson’s numbers this year to H. Sedin’s in 2007 (the first year available), they are quite similar (other than Corsi). How do we know that Hodgson won’t turn out to be as good as H. Sedin?

    One final point: possession analysis is in its infancy, and a little bit of knowledge is a dangerous thing.

    The only correct method to draw the conclusions on Hodgson would be if we had at least 15 years of data where we could compare first year players and their development over time.

  9.  

    I came across this site tonight because I decided I should finally look into these advanced stats I always gloss over on the hockey blogs I read. I’ve always heard that “PDO always regresses to 1000” and just accepted that as a fact proven by hockey-stat-geeks and didn’t question it. But it seemed counter-intuitive, so I took my first ever look at Behind the Net’s stats and manually checked a few players’ year-by-year PDO.

    Started with Datsyuk. Dude is renowned by stats-guys for his two-way ability so I was curious what his would look like. Low this year, but otherwise it’s always over 1000, and the trend certainly isn’t “regressing to 1000”. It’s always above that. Ditto for Lidstrom. Made sense that it wouldn’t when I thought about it. Maybe no player is capable of affecting save percentage over the course of a large enough sample size, but some guys really do have a better shot than others, and it would seem common sense that a guy who can shoot harder, faster, and more accurately would have more of his shots enter the net than someone who can’t shoot as hard, fast, or accurately. It may not make a significant difference, but over a large enough sample size, that advantage should show up in the stats, even if just slightly.

    I’m a Flames fan so naturally I looked at that team too, and sure enough, Iggy (not a dominant player anymore but still having one heck of a shot) is always above 1000. Surprisingly, David Moss, also espoused by stat guys to be a solid possession driver, was consistently well below 1000, save 1 year when he made an unusual jump up. As David mentions here, Sedin doesn’t regress to 1000 either.

    It seems pretty self evident to me then that it’s inaccurate to say PDO regresses to 1000 for every player. That’s the context I usually hear it used in, and I think that’s the context David is arguing isn’t accurate. By definition, it has to regress to be equal to 1000, and it’s an interesting observation that for all players it tends to be close to that over a large enough sample. But a more accurate statement is that for any individual, it tends to regress to A number, which may be different than 1000, and that other number indicates something other than just luck (could be a skill, could be circumstance, etc). And, if a player’s current PDO is significantly out of line with his normal regression point it indicates something as well. It could be luck. But doesn’t have to be. Could be the coach is deploying him differently. Could be that his left arm fell off and none of his shots are going in the net anymore because hey, the dude only has 1 arm now!

    Anyway, I’ve learned from this that when someone says “PDO always regresses to 1000” what they really mean is that by definition its average has to be 1000, and if a player’s current number is significantly different than that, it’s indicative of something unusual happening that is not going to keep happening because historically, all players have leveled off at a number not very far from 1000 (although only because they don’t let me play, since I can guarantee you my PDO would be tremendously lower than 1000 for reasons that have nothing to do with luck).

    Incidentally, I can’t believe the tone of some of these comments. Having never been here before, my objective observation is that several of these commenters sound like fundamentalists who will militantly defend the religious doctrine of “PDO always regresses to 1000” regardless of what contrary evidence may be presented. You seem to suffer from the same character flaw of requiring absolute certainty even when it doesn’t exist that produces the worst extremists in every group, and I’m not sure where the pariah name calling was coming from, because quite frankly I was impressed with the civility and tact that David responded to you with.
