This past weekend I attended and spoke at the Rochester Institute of Technology Hockey Analytics Conference, and because it was such a great conference I wanted to write up some of my thoughts on the event.
First up was a panel discussion on the State of Hockey Analytics with Timo Seppa, Sam Ventura, Andrew Thomas and Matt Pfeffer. Three of these guys now work with NHL teams (Ventura with the Penguins, Thomas with the Wild and Pfeffer with the Canadiens) but they didn’t divulge very much information about the inner workings of their respective organizations. The topics discussed were informative and served as a good introduction to the day, but the more interesting talks came later.
The first talk of the day was from Brad Stenger, discussing how stress and injuries impact athlete performance. This wasn’t specifically about hockey but more generally about how stress (workload, travel, etc.) can impact athlete performance. This is a topic that we haven’t looked into all that much within the hockey analytics community and more could certainly be done. There has been some investigation into playing goalies in back-to-back games and how that affects performance, but I don’t think much more has been done with respect to how things like travel schedule, ice time, etc. impact performance. There also hasn’t been a lot of work done on whether some teams are better at preventing injuries than others. One interesting stat that Stenger brought up is that hockey players are far more likely to suffer injuries in practice than players in other sports are.
Next up was Kyle Stitch on player return on investment. What Stitch attempted to do was apply a financial market model to hockey statistics. In the financial markets Beta is a measure of a stock’s volatility, and Stitch was looking to apply a Beta model to players, using each player’s statistical volatility as a measure of consistency. Although I am always skeptical of throwing hockey data into some model and hoping the results come out nicely, I also find it incredibly interesting and valuable when people of different backgrounds bring different ideas and concepts to the table.
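For readers unfamiliar with Beta, here is a minimal sketch of the textbook calculation: the covariance of one series with a benchmark, divided by the variance of the benchmark. How Stitch actually mapped this onto players was not something I captured, so treating a player's per-period production as the "asset" and a league-wide figure as the "market" is purely an assumption for illustration, and the numbers are made up.

```python
from statistics import mean


def beta(asset, market):
    """Textbook Beta: Cov(asset, market) / Var(market).

    `asset` and `market` are equal-length sequences of numbers.
    (Hypothetical hockey mapping: player output vs. a league benchmark.)
    """
    if len(asset) != len(market) or len(asset) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    a_bar, m_bar = mean(asset), mean(market)
    # The 1/(n-1) factors in covariance and variance cancel in the ratio.
    cov = sum((a - a_bar) * (m - m_bar) for a, m in zip(asset, market))
    var = sum((m - m_bar) ** 2 for m in market)
    if var == 0:
        raise ValueError("benchmark series has zero variance")
    return cov / var


# Made-up numbers: the "player" series swings twice as much as the benchmark.
league = [1.0, 2.0, 3.0, 4.0]
player = [2.0, 4.0, 6.0, 8.0]
print(beta(player, league))  # 2.0
```

A Beta above 1 means the series moves more than the benchmark and a Beta below 1 means it moves less, which is the sense in which it could serve as a consistency measure.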
Matthew Coller was up next, talking about how he began using hockey analytics as a member of the media, starting out as just the “numbers guy” and developing into a “player evaluation” guy. He talked about how he used statistics to show that Brad Boyes was misused in his time with the Sabres and gave some tips to everyone on how to write and express their opinions better. Most importantly: keep your message simple, clearly state the main takeaways of what you are writing about, and write in “AP Style” because everyone is familiar with it.
After the morning break Alexandra Mandrycky talked about how the salary cap rules can impact roster decisions, using the 2014-15 Los Angeles Kings as a case study. In the first half of the season the Kings were up against the cap and also had to deal with injuries and suspensions that had differing impacts on their cap and roster situation. Mandrycky clearly outlined what the Kings could do, and what they did do, each time they came up against a new injury or suspension. It was an interesting look at some of the nuances of the CBA and the salary cap rules.
Next up was Carolyn Wilke, who looked at player valuation rather than evaluation: the optimal amount each team should pay a player based on their position and role. While we all look at contract signings, then look at the player’s performance, and in a mostly ad hoc way form an opinion on whether the contract is good or bad, Wilke formalized this process. In turn she showed, as expected, that teams with more average-or-better contracts had better results than teams with more poor contracts.
I was up next and discussed how player evaluation metrics should remain persistent even when players change teams and through coaching changes, because player talent is fairly persistent from year to year. I plan to write up a more detailed post on this at some point, but basically I showed that raw Corsi statistics (CF60, CA60, CF%) are not very persistent when players change teams but that their respective Rel or RelTM stats maintain more persistence. This would indicate that the raw metrics are not good player evaluation metrics but that Rel and RelTM statistics are potentially significantly better and should be used over the raw statistics. With that said, I did show that coaching changes can have a significant impact on some players’ Rel statistics, so there are other factors at play that Rel is still not accounting for. Another takeaway from my talk is that defensive statistics (CA60, CA60Rel and CA60RelTM) are significantly less persistent than their offensive counterparts, indicating we are still very poor at evaluating defensive play.
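To make the terminology concrete, here is a rough sketch of how these numbers are commonly computed and how persistence can be measured as a correlation between a player's value before and after a team change. The function names and sample inputs are illustrative assumptions, not the code behind my talk, and RelTM (which compares against each individual teammate) is left out because it needs line-mate level data.

```python
from math import sqrt


def per60(events, toi_minutes):
    """Rate per 60 minutes of ice time (e.g. CF60 or CA60)."""
    return 60.0 * events / toi_minutes


def cf_pct(cf, ca):
    """Corsi-for percentage: share of all shot attempts taken by the player's team."""
    return 100.0 * cf / (cf + ca)


def cf_pct_rel(on_cf, on_ca, off_cf, off_ca):
    """One common Rel definition: on-ice CF% minus the team's CF% with the player off the ice."""
    return cf_pct(on_cf, on_ca) - cf_pct(off_cf, off_ca)


def pearson(xs, ys):
    """Pearson correlation, used here as a simple persistence measure."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


print(per60(50, 600))               # 5.0 attempts per 60 minutes
print(cf_pct_rel(55, 45, 48, 52))   # 7.0 percentage points better on-ice than off-ice
```

With one value per player from the season before a move and one from the season after, `pearson(before, after)` gives a single persistence number that can be compared across metrics.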
The entertaining, and clearly excited about hockey math, Micah Blake McCurdy was up next talking about the impact zone starts have on player statistics. Despite the fact that he accused me of essentially “kicking puppies” with how I do my zone start adjustment (removing the 10 seconds after a zone faceoff), I was happy to see that his far more detailed, robust, and likely very time-consuming methodology to account for shift start location essentially backed up what I discovered with about a half day of work. The takeaway is that where you start your shift doesn’t matter that much: only a handful of players see their CF% adjusted by more than 2 percentage points (e.g. from 47% to 49%), and only about 5% of all players who played 100 minutes or more saw their CF% adjusted by more than 1 percentage point. In other words, when you account for where a player starts their shift, 95% of players have an adjusted CF% within 1 percentage point of their raw CF%.
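The simple version of the adjustment, dropping everything that happens in the 10 seconds after a zone faceoff, can be sketched in a few lines. The data layout (timestamps in seconds, a boolean for "attempt for") and the choice to treat the window as starting at the faceoff and excluding the 10-second mark itself are assumptions made for this example.

```python
def cf_pct_zone_adjusted(shot_attempts, zone_faceoff_times, window=10.0):
    """CF% after discarding attempts within `window` seconds of a zone faceoff.

    shot_attempts: list of (time_seconds, is_for) tuples for a player's on-ice time.
    zone_faceoff_times: times of offensive- or defensive-zone faceoffs
        (neutral-zone faceoffs are not included).
    Returns None if no attempts survive the filter.
    """
    kept = [
        is_for
        for t, is_for in shot_attempts
        if not any(0 <= t - f < window for f in zone_faceoff_times)
    ]
    if not kept:
        return None
    return 100.0 * sum(kept) / len(kept)


# Made-up shift: zone faceoffs at 0s and 60s, five shot attempts.
attempts = [(5, True), (30, True), (40, False), (65, False), (90, True)]
raw = cf_pct_zone_adjusted(attempts, [])            # no faceoffs -> raw CF%
adjusted = cf_pct_zone_adjusted(attempts, [0, 60])  # drops the attempts at 5s and 65s
print(raw, round(adjusted, 1))  # 60.0 66.7
```

The tiny sample exaggerates the shift between raw and adjusted values; over a full season the two numbers end up very close for most players, which is the point of the paragraph above.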
After the lunch break Jen Lute Costella had a lengthy talk on the tracking project she has been working on for the past number of months. Costella and a team of trackers looked at every goal that (if I recall correctly) about 100 NHL players have scored over the past 3 years and tracked the events that led to each goal, from passes to the length of time the puck was possessed by the team and by the goal scorer. From that data Costella attempted to categorize the goal scorers as all-round goal scorers, do-it-yourselfers, and finishers. The all-round players can score goals in a number of different ways, the do-it-yourselfers score goals largely on their own skill, and the finishers are more the guys who finish the play with one-timers and by chipping in rebounds. This project is still mostly a work in progress, but Costella has collected a lot of data that has the potential to yield some interesting results as she digs into it more.
Next up was Nick Mercadente, presenting a technique for using adjusted save percentage in goalie analysis. Evaluating goalies is probably the most difficult task in hockey analytics, and Mercadente attempted to improve on it by looking at an adjusted save percentage where shots are valued more based on whether they were a scoring chance or not (based on war-on-ice scoring chance data, which I have shown to provide some additional value but which still falls short of accounting for all aspects of shot quality). This was an entertaining talk showing an incremental improvement in how we evaluate goalies.
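I don't have Mercadente's exact formula, so the following is only a sketch of one common way to build an adjusted save percentage: compute the goalie's save percentage separately for scoring chances and for all other shots, then weight the two by a league-wide shot mix so that every goalie is judged against the same workload. The bucket names, the league shares and the sample counts are all assumptions.

```python
def adjusted_save_pct(goalie, league_share):
    """Save percentage re-weighted to a common shot mix.

    goalie: {bucket: (saves, shots_faced)} for this goalie.
    league_share: {bucket: fraction of league shots in that bucket}, summing to 1.
    """
    if abs(sum(league_share.values()) - 1.0) > 1e-9:
        raise ValueError("league shares must sum to 1")
    total = 0.0
    for bucket, share in league_share.items():
        saves, shots = goalie[bucket]
        if shots == 0:
            raise ValueError(f"no shots faced in bucket {bucket!r}")
        total += share * saves / shots
    return total


# Made-up goalie who faced relatively few scoring chances.
goalie = {"chance": (80, 100), "other": (380, 400)}
league_mix = {"chance": 0.3, "other": 0.7}  # hypothetical league-wide shot mix

raw = (80 + 380) / (100 + 400)
print(round(raw, 3), round(adjusted_save_pct(goalie, league_mix), 3))  # 0.92 0.905
```

In this example the goalie's raw number is flattered by an easy workload, and the adjusted figure comes out lower once scoring chances are given their league-typical weight.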
Jessica Schmidt was up next talking about her work tracking defensive zone events. Schmidt has spent a lot of time tracking events for Philadelphia Flyers games and she discussed what she tracked and some of her observations. Apparently Luke Schenn is actually not bad at getting the puck out of the defensive zone, even if he isn’t very good at very much else. I am going to take the time here to also mention that game tracking is a time-consuming process that requires dedication, focus and attention to detail, so everyone who tracks events deserves all our thanks and support.
An arch nemesis to many but otherwise a great guy, Stephen Burtch was up next with an interesting application of network analysis theory using Ryan Stimson’s pass tracking data. Burtch took Stimson’s passing data and, using network theory, attempted to determine which players were most integral to their team’s success. Burtch looked at two teams, the Maple Leafs and the Islanders, and the analysis showed that the Leafs were far more dependent on one or two players while the Islanders were a much better and much more balanced team. I think some refinements to the model are required (in particular integrating an ice time component, which Burtch agreed was important) but it was an interesting application of network theory.
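As a toy illustration of the general idea (not Burtch's actual model, whose details I did not record), passes can be treated as weighted edges between players, and the simplest measure of how central a player is would be the share of all passes he is involved in as passer or receiver. The player labels and counts below are made up, and there is no ice time adjustment.

```python
from collections import Counter


def pass_involvement(passes):
    """Share of all tracked passes each player takes part in (weighted degree).

    passes: {(passer, receiver): count}.
    Each pass counts once for the passer and once for the receiver,
    so the shares sum to 2.0 across the team.
    """
    total = sum(passes.values())
    if total == 0:
        raise ValueError("no passes recorded")
    degree = Counter()
    for (passer, receiver), count in passes.items():
        degree[passer] += count
        degree[receiver] += count
    return {player: n / total for player, n in degree.items()}


# Made-up team where every pass runs through player "A".
team = {("A", "B"): 6, ("B", "A"): 2, ("A", "C"): 2}
print(pass_involvement(team))  # {'A': 1.0, 'B': 0.8, 'C': 0.2}
```

A team whose shares are bunched around one or two players looks like the Leafs in the description above, while a flatter distribution looks more like the Islanders.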
The final segment of the day started off with a talk by Jack Han on IT and Knowledge Management. Han talked about some of the tools he is using in his work with the McGill Martlets hockey team. While Han discussed the specific tools he uses, I think the most important takeaway from this talk is that hockey analytics is at its core dependent on information management. Finding ways to make that information accessible to those who need it, when they need it (which is not necessarily when you have created it), is of utmost importance. It is great that you have come up with some important information, but if the coach or general manager is too busy with other urgent issues at the time you came up with it, it won’t get looked at. Furthermore, if the coach or general manager isn’t able to go back and easily find that information on a Sunday evening 3 weeks from now when they have some free time, your work won’t get considered. In short, the easier and more convenient we make it to get at information, the more likely it will get used.
Stefan Wolejszo was up next talking about a fully integrated hockey analytics department. Wolejszo discussed how one should not focus solely on quantitative analysis or solely on qualitative analysis but should fully integrate both into the process. There is a lot of information outside of the traditional quantitative analysis that we do that has value and should be considered in the decision-making process. The challenge is how best to integrate all the information we have into that process.
The final talk of the day was from Michael Boutros, who attempted to answer the question of whether chemistry exists in the NHL between players from the same country. Boutros approached this by comparing team diversity to team success using a “Fractionality” model, which is an old model used in economics. Long story short, Boutros found that teams that have less diversity of birthplace do in fact perform better. The impact found was statistically significant, and while it was not a huge impact, each year it was large enough to bump at least one team into the playoffs that would otherwise have missed them. Once again, I always find it interesting when models from other disciplines are applied to hockey, though I always remain skeptical of how much we should trust the results because hockey is a highly complex sport with a large number of interdependent factors at play. Regardless, it was a very interesting idea and the model could be used to investigate other questions, such as using languages spoken instead of place of birth or, what I might find interesting, age/experience of the players. Is having mostly players of the same age/experience better or worse than having a diverse age group with a mix of youthful enthusiasm and veteran experience and knowledge?
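I did not note the exact form of the model Boutros used, but assuming it resembles the standard fractionalization index from economics, the diversity measure is one minus the sum of squared group shares: the probability that two players picked at random (with replacement) come from different groups. The roster below is made up, and grouping by country code is an assumption.

```python
from collections import Counter


def fractionalization(groups):
    """Fractionalization index: 1 - sum(share_i ** 2).

    groups: one label per player (e.g. birth country).
    0.0 means everyone shares one label; values near 1.0 mean high diversity.
    """
    n = len(groups)
    if n == 0:
        raise ValueError("empty roster")
    counts = Counter(groups)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


print(fractionalization(["CAN", "CAN", "CAN", "CAN"]))  # 0.0
print(fractionalization(["CAN", "CAN", "USA", "SWE"]))  # 0.625
```

The same function works unchanged for the follow-up questions mentioned above, such as swapping birth country for language spoken or for an age bracket.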
And that was the RIT Hockey Analytics Conference in a nutshell, or at least how I viewed it. Overall I have to say it was a very good conference with a large diversity of topics and speakers. If you didn’t find something of interest throughout the day, hockey analytics is probably not your thing. Thank you to all the speakers for making it an especially interesting day, and thank you to Ryan Stimson and Matthew Hoffman for organizing the event. It was a great day and it inspired me to investigate a few of the ideas that came up.
Note: I wrote this up just from memory, so if I got any details wrong feel free to post a comment or send me an e-mail and I will make the correction. Also, if you have written a summary of the conference or have written articles about what you spoke about at the conference, send them along and I’ll include a link here, or just post them in the comments.