What We Can Predict, and What We Can’t

A couple of pieces in the past few weeks have me thinking. First, there was Matt Rudnitsky’s MONEYPUCK piece over at Sportsgrid. Though I was pleased to see my work quoted warmly in the article, it mostly directed withering criticism at stats-centric hockey writers who picked the Kings to beat Anaheim in the second round. The thrust of Matt’s argument is that the outcomes of single games and short series are far too random to be successfully predicted by a simplistic comparison of Fenwick-For percentages. Fenwick Close can tell you that the Kings are a better puck possession team than Anaheim, but in making those numbers the basis of a series prediction, you’re effectively declaring that a lot of other information doesn’t matter. This was followed by a post from David Johnson at Hockey Analysis, which points out that team forecasts based on Corsi/Fenwick differential have had their share of misses (e.g., the 2012-13 and 2013-14 Devils, the 2013-14 Senators) alongside successes like this season’s Toronto Maple Leafs. Much like Matt, David doesn’t dispute the value of possession-based analytics, but argues that useful information is being set aside by ignoring the goal-scoring component.

Photo credit: Flickr user Reg Natarajan. Use of this image does not imply endorsement

Over the past year or so, I’ve obviously done a lot of thinking about predicting hockey outcomes, and I do think there’s merit to what Matt and Dave are saying. Hockey analytics appear to have taken a leap into the mainstream in 2013-14, but with the added attention, greater scrutiny is almost certain to follow. And anyone with a healthy skepticism towards shot-based analytics can find reasons to mistrust them unless they’re presented carefully. As vindicating as it might feel to be right about the Maple Leafs, many of us (myself included) predicted big things from teams like New Jersey, Ottawa, and the Islanders. While three of the four Conference Finalists in this year’s playoffs were top-10 possession teams, the fourth team, Montreal, was one of the league’s worst, and two of the top 10 (New Jersey and Vancouver) didn’t even make the postseason. In the 2012-13 season, many analysts were fond of depicting the Anaheim Ducks as a PDO-driven paper tiger that rode high percentages to success over the short schedule, with the implication that, like Toronto, they would crash to earth in 2013-14. Yet not only did Anaheim not regress, their even-strength shooting percentage and PDO were actually higher over the longer schedule.

The point of this isn’t to call out anyone over predictions they’ve made (I’m more than happy to admit making loads of bad predictions this season), but to point out some features of hockey forecasting that make it unwise to oversell the value of Fenwick and Corsi. For one thing, puck possession is a correlate of winning, not a determining cause: it is neither a necessary nor a sufficient condition for it. Regularly controlling a greater share of shots helps to push the math of Goals-For and Goals-Against in a team’s favor, but it doesn’t necessarily translate into a favorable goal differential or standings points. As a feature of teams’ games that’s at least somewhat within their control, possession is more consistently associated with future win probabilities than PDO, but shooting percentage (Sh%) and save percentage (Sv%), the two components of PDO, are much stronger explanatory factors in past results than Fenwick numbers. What this suggests is that the random chance, player talent, and other factors underlying goal-scoring are a critical piece of the causality behind hockey outcomes. Matt was ultimately proven correct in noting that the Anaheim-Los Angeles winner would be determined by more than just possession differential. In staking an early 2-0 lead against the Ducks, LA controlled less than 48% of even-strength Corsi events but had a 1.046 PDO. Over the next three games, the Kings’ share of possession jumped to 64%, but their PDO fell to 0.914, and they lost all three. In the two elimination games they faced against Anaheim, possession was basically even (50.3% in favor of LA), but the Kings’ PDO was 1.088. Unsurprisingly, they won both games and advanced.
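
The Anaheim-Los Angeles splits quoted above can be laid out in a short sketch. The figures are the ones cited in this post; the class and field names are hypothetical, and the "less than 48%" Corsi share for the first two games is entered as 0.48, which is an upper bound rather than an exact value.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    games: str          # which games of the series the split covers
    corsi_share: float  # LA's even-strength Corsi share, as a fraction
    pdo: float          # LA's PDO over those games
    la_won_all: bool    # True if LA won every game in the segment


# Values as quoted in the post; 0.48 stands in for "less than 48%".
SEGMENTS = [
    Segment("1-2", 0.48, 1.046, True),
    Segment("3-5", 0.64, 0.914, False),
    Segment("6-7", 0.503, 1.088, True),
]


def agreement(segments, indicator):
    """Count the segments where a yes/no indicator matches the result."""
    return sum(indicator(s) == s.la_won_all for s in segments)


corsi_hits = agreement(SEGMENTS, lambda s: s.corsi_share > 0.5)
pdo_hits = agreement(SEGMENTS, lambda s: s.pdo > 1.0)

print(f"Corsi share above 50% matched the result in {corsi_hits} of {len(SEGMENTS)} segments")
print(f"PDO above 1.000 matched the result in {pdo_hits} of {len(SEGMENTS)} segments")
```

Three segments from one series is far too small a sample to prove anything. The sketch only restates the observation that, in this series, results tracked PDO rather than possession.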

This brings us to an important feature of statistical regression: regression is an empirical regularity, but it doesn’t necessarily occur within a specific time frame. When you predict the collapse of a team like the 2013-14 Leafs or the 2011-12 Minnesota Wild, you’re taking a risk when it comes to the timing. For every team like those two or the 2010-11 Avalanche, there’s a squad like the 2013-14 Avs or the 2011-12 Predators (who started the playoffs on home ice despite the second-worst Fenwick Close in the NHL), for whom the offsetting bad luck never materializes over 82 games. The question of timing is especially pertinent when it comes to understanding playoff series. As an empirical regularity, teams with strong possession numbers tend to do well in the NHL postseason, but it doesn’t follow that strong puck control causes the playoff success. There’s no evidence that possession differential is a consistent predictor of victories in single games, and for every strong possession team that’s gone on to Stanley Cup glory (e.g., the 2007-08 Red Wings, the 2009-10 Blackhawks), there are teams like the 2011-12 Red Wings, the 2011-12 Penguins, and the 2008-09 Sharks, who failed to advance past the first round despite superlative possession numbers. More to the point, all three of these teams had dominating possession numbers in the first-round series they lost. And, of course, there are teams like the 2007-08 Penguins and the 2009-10 Canadiens, who won multiple playoff rounds despite awful possession play in the postseason.
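
The timing problem can be illustrated with a rough Monte Carlo sketch. Everything in it is an assumption made for illustration, not a measurement: a team with a "true" PDO of exactly 1.000 (8% shooting, 92% save percentage), 30 shots for and 30 against per game, and every shot treated as an independent trial. The function names are hypothetical.

```python
import random
import statistics


def simulate_pdo(games, rng, shots_per_game=30, sh_talent=0.08, sv_talent=0.92):
    """Observed PDO over a stretch of games for a team with fixed true talent."""
    shots = games * shots_per_game
    # Each shot for is a goal with probability sh_talent.
    goals_for = sum(rng.random() < sh_talent for _ in range(shots))
    # Each shot against is saved with probability sv_talent.
    saves = sum(rng.random() < sv_talent for _ in range(shots))
    return goals_for / shots + saves / shots


def pdo_spread(games, trials=500, seed=0):
    """Mean and standard deviation of observed PDO across simulated stretches."""
    rng = random.Random(seed)
    samples = [simulate_pdo(games, rng) for _ in range(trials)]
    return statistics.mean(samples), statistics.stdev(samples)


series_mean, series_sd = pdo_spread(games=7)    # one playoff series
season_mean, season_sd = pdo_spread(games=82)   # one regular season

print(f"7 games:  mean PDO {series_mean:.3f}, spread {series_sd:.3f}")
print(f"82 games: mean PDO {season_mean:.3f}, spread {season_sd:.3f}")
```

Under these assumptions the average observed PDO sits near 1.000 at both lengths, but the spread over seven games is several times wider than over 82. Knowing that a team's percentages should regress says little about whether they will do so within any particular series.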

The reason playoff series are so difficult to predict is that fluctuations in luck can happen at any moment, and the team that experiences the fewest dry spells in April, May and June is the one that ends up hoisting the big trophy. Sometimes, luck comes in the form of match-ups. The Red Wings won the Cup in 2008 with a PDO under 1.000; while that’s incredibly impressive, it didn’t hurt that they avoided teams like San Jose and the Rangers that year. Similarly, while the Blackhawks were very deserving winners in 2010, they had to be thanking the Montreal Canadiens for preventing a Finals match-up with either Pittsburgh or Washington. More often, though, luck comes in unpredictable shifts in PDO. Some analysts discuss the 2011-12 Kings as a success case for Corsi-based analytics, but let’s be clear: Fenwick Close could have told you that the Kings were better than their record suggested, but anyone who says that they predicted LA’s Cup victory using Fenwick Close is kidding themselves. LA was a deserving champion that beat some great teams, but their postseason PDO in 2012 was 1.022, mostly due to Jonathan Quick’s 0.944 save percentage. Unless you’re utterly abysmal at controlling play, you’ll probably win a lot with a 1.022 PDO. Los Angeles was incredibly fortunate that a goalie most observers put into the “good but not great” category went two months without a bad stretch of games. Obviously, 0.944 likely isn’t indicative of Quick’s “true” talent as a goalie, but, again, knowing the variability in his performances doesn’t tell you anything about the timing of the variations. In the current postseason, much has been made of LA’s comebacks from the brink of elimination against San Jose and Anaheim; worth noting, however, is that they’ve shot 12.1% and gotten a 0.951 save percentage in their six elimination games. For a point of reference, this 1.072 PDO is higher than Pittsburgh’s PDO during their 15-game winning streak last season.

As a counterpoint, consider the famous upset of the Washington Capitals by the Canadiens in 2010. In the course of going up 3-1 in that series, the Caps had a PDO of 1.045; in the final three games, Jaroslav Halak’s resurgence gave Washington a PDO of just 0.936. Whether because of match-ups or low PDO, many, many teams playing strong possession hockey don’t end up winning in the playoffs. For better or worse, what separates the winners from the also-rans is simple puck luck, not possession.
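
For readers who want to check the PDO arithmetic, here is a minimal sketch. It relies on the standard definition of PDO as shooting percentage plus save percentage, both expressed as fractions. The helper name is hypothetical, and the implied 2012 shooting percentage is derived here from the two quoted figures rather than taken from any source.

```python
def pdo(shooting_pct, save_pct):
    """PDO: on-ice shooting percentage plus save percentage, as fractions."""
    return shooting_pct + save_pct


# LA in its six elimination games this postseason: 12.1% shooting, 0.951 saves.
elimination_pdo = pdo(0.121, 0.951)

# LA in the 2012 postseason: a 1.022 PDO with a 0.944 save percentage
# implies this shooting percentage (derived, not quoted in the post).
implied_sh_2012 = 1.022 - 0.944

print(f"Elimination-game PDO: {elimination_pdo:.3f}")          # 1.072
print(f"Implied 2012 shooting percentage: {implied_sh_2012:.3f}")  # 0.078
```

The second figure suggests that the 2012 PDO was carried by goaltending, since the implied shooting percentage of roughly 7.8% is unremarkable next to a 0.944 save percentage.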

About Nick Emptage

Nicholas Emptage is the blogger behind puckprediction.com. He is an economist by trade and a Sharks fan by choice.

