To the growing hockey analytics movement, the notion that puck possession is a critical predictor of winning is largely taken as an article of faith. Earlier this year, Chris Boyle published a fascinating infographic illustrating the linkage between strong possession play and playoff success, and some work I’ve done has shown that Fenwick Close differential (i.e., net shots on goal and missed shots in close-game even-strength situations) is a significant predictor of regular-season performance.
Nevertheless, there are inconsistencies in the association between possession and winning that bother me. Across the eight seasons of single-game data I compiled to create the Puck Prediction model, shot differential (which correlates strongly with Fenwick Close) correctly predicted only 54.7% of game results. To put this in perspective, you’re more likely to pick the winner of a game correctly by simply choosing the home team than by choosing the team with the better shot differential entering the game. (A t-test comparing shot-differential predictions to random predictions is statistically significant, but the correct interpretation of this result is less “this relationship is really meaningful” than “a sample of 10,000+ games allows us to estimate a weak association with a high degree of precision”.) What’s more, until the launch of the excellent Extra Skater website this year, game-level Fenwick Close data weren’t available, so the utility of advanced possession metrics for explaining game outcomes has been unknown. This matters because the adoption of hockey “fancy stats” by teams may well require a demonstrable linkage between specific systems of play and the probability of winning, which in turn requires us to be able to point to factors within games that explain how teams win.
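To see why “statistically significant” says so little here, consider a rough back-of-the-envelope version of that comparison. This is a normal-approximation z-test rather than the t-test I actually ran, and the sample size is rounded to an assumed 10,000 games, but it illustrates the point: a huge sample makes even a tiny edge look overwhelmingly significant.

```python
# Illustration only (a normal-approximation z-test, not the original t-test):
# with 10,000+ games, a weak 54.7% hit rate is easily distinguished from
# coin-flipping, even though the edge itself is small.
import math

n_games = 10000      # assumed round sample size, roughly matching the article
p_observed = 0.547   # share of games the shot-differential pick got right
p_null = 0.5         # chance accuracy

# Standard error of a proportion under the null, and the resulting z-score.
se = math.sqrt(p_null * (1 - p_null) / n_games)
z = (p_observed - p_null) / se
print(round(z, 1))  # 9.4 -- wildly "significant", yet only a 4.7-point edge
```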
Fortunately, the proprietor of Extra Skater was kind enough to assemble a dataset for me that included single-game advanced statistics for all 720 games of the 2013 regular season. (Thanks very much, Darryl!) After some manipulation, I calculated single-game Fenwick Close, even-strength shooting and save percentages, and PDO for each team in every game. In order to explore the consistency and predictive utility of each of these measures, I also calculated 1-game, 3-game, 5-game, 10-game, and 20-game “lagged” values prior to each game in the dataset. A one-game lagged Fenwick Close, for example, would be the single-game Fenwick Close from the team’s last game. A three-game lagged Fenwick Close for game i would be the total Fenwick Close for games i – 1, i – 2, and i – 3.
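The lagging step can be sketched as follows. This is a minimal illustration with made-up numbers and hypothetical column names (Extra Skater’s actual fields may differ); the key idea is that the k-game lagged value for game i pools events over games i − 1 through i − k, so it uses only information available before the game.

```python
# Minimal sketch of building lagged Fenwick Close from per-game event counts.
# Column names and numbers are invented for illustration.
import pandas as pd

games = pd.DataFrame({
    "game": [1, 2, 3, 4, 5, 6],
    "fenwick_for":     [30, 25, 28, 32, 27, 29],  # close-game Fenwick events for
    "fenwick_against": [22, 30, 26, 24, 31, 28],  # close-game Fenwick events against
})

def lagged_fenwick_close(df, k):
    # k-game lagged Fenwick Close for game i: pooled events over games i-1 .. i-k.
    ff = df["fenwick_for"].rolling(k).sum().shift(1)
    fa = df["fenwick_against"].rolling(k).sum().shift(1)
    return ff / (ff + fa)

games["fc_lag3"] = lagged_fenwick_close(games, 3)
```

The `shift(1)` is what makes the measure a true lag: game 4’s value is built entirely from games 1–3.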
My analysis of the consistency of Fenwick Close looked at the autocorrelation of 1-game, 3-game, 5-game, 10-game, and 20-game Fenwick Close in game i against the same measure in game i – 1. In order to put the correlation coefficients in context, I repeated the analyses for shooting percentages, save percentages, and PDO. The results are below. As you can see, the autocorrelation of Fenwick Close is much stronger than that of shooting and save percentages. Not only is single-game Fenwick Close more consistent than single-game Sh% or Sv%, it has a stronger autocorrelation than 3-game and 5-game moving averages of these variables, in which some of the noise has presumably filtered out. A 20-game moving average of puck possession has an autocorrelation of 0.77, while a similar average for Sh% has an autocorrelation of just 0.41, and a 20-game Sv% has a 0.52 autocorrelation. So, if you need more evidence that puck possession is a much more consistent and repeatable measure of team performance than shooting or goaltending, there you go.
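The autocorrelation computation itself is simple: correlate the measure in game i with the same measure in game i − 1, i.e. a series against a one-step-shifted copy of itself. A few lines of stand-alone Python (with a fabricated series, purely to show the mechanics):

```python
# Lag-1 autocorrelation: Pearson correlation of a per-game series with the
# same series shifted by one game. The series below is made up.
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def lag1_autocorr(series):
    # value in game i paired with the value in game i-1
    return pearson(series[:-1], series[1:])

fenwick_close = [0.52, 0.55, 0.51, 0.56, 0.54, 0.53, 0.57, 0.55]
print(lag1_autocorr(fenwick_close))
```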
But what about the relationship between these variables and winning? To look at this, I loaded the data into R and performed a series of logistic regressions using generalized linear models and standard errors adjusted to reflect the serial correlation in the data. One set of models estimated single-game win probability as a function of 1-game, 3-game, 5-game, 10-game, or 20-game lagged Fenwick Close. A second set included lagged PDO (in a similar fashion) as a second independent variable. Finally, a third set of models replaced PDO with lagged versions of its components. The results are depicted below.
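For readers who want to see the shape of one of these models, here is a toy re-creation of the simplest case: win probability as a logistic function of lagged Fenwick Close. It is a plain maximum-likelihood fit by gradient ascent, the data are fabricated, and the serial-correlation-adjusted standard errors from the real analysis are omitted, so treat it as a sketch of the model form only.

```python
# Toy logistic regression: P(win) as a function of lagged Fenwick Close.
# Fabricated data; plain MLE by gradient ascent (no robust standard errors).
import math

# (lagged Fenwick Close, won?) pairs -- invented for illustration
games = [(0.45, 0), (0.50, 0), (0.52, 0), (0.55, 0), (0.47, 0),
         (0.48, 1), (0.55, 1), (0.57, 1), (0.60, 1), (0.53, 1)]

def fit_logistic(data, lr=0.5, iters=5000):
    b0, b1 = 0.0, 0.0  # intercept and Fenwick Close coefficient
    for _ in range(iters):
        g0 = g1 = 0.0
        for x, y in data:
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))  # predicted win probability
            g0 += y - p        # log-likelihood gradient, intercept
            g1 += (y - p) * x  # log-likelihood gradient, slope
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

b0, b1 = fit_logistic(games)
odds_ratio = math.exp(b1)  # odds ratio for a 0 -> 1 (0% -> 100%) change
```

In R this is a one-liner with `glm(win ~ fc_lag, family = binomial)`; the explicit gradient loop here just makes the fitted quantity visible.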
A few points to note here:
- When the number of observations in your data gets into the hundreds, it’s wise to be somewhat skeptical of statistical significance. Any effect can appear significant if you have enough power to throw at the model. For example, while it’s interesting that the prior game’s PDO is a significant predictor of win probability, the tiny effect size makes this a tough result to interpret.
- When looking at the large odds ratios that are frequently associated with Fenwick Close, Sh%, and Sv%, keep in mind that these variables only take values between 0 and 1. As such, the effect might not be as dramatic as it looks. For example, the odds ratio for the 5-game lagged Fenwick Close should be interpreted as follows: a team with a 5-game Fenwick Close of 100% has 17.47 times the odds of winning of a team with a 5-game Fenwick Close of 0%.
- PDO may be useful as shorthand when discussing luck in team performance, but these results suggest that its use can obscure the predictive value of its component parts.
- The consistency of Fenwick Close may actually work against its utility as a predictor of winning. That is, the autocorrelation results suggest that a team’s Fenwick Close converges to a steady-state value fairly quickly, which implies that it doesn’t vary much from game to game. Unfortunately, in a regression-based analysis, a measure that doesn’t vary much isn’t going to be able to explain variation in the outcome of interest.
- On the other hand, Sh% and Sv% are much less consistent variables, yet their greater variability doesn’t translate to a stronger correlation with win probability. This suggests a weaker underlying connection to wins.
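To make odds ratios like the 17.47 figure above easier to read, you can rescale them to a realistic change in the predictor. Since the log of the odds ratio is the logit coefficient, the odds ratio for a one-percentage-point change is just the full-range odds ratio raised to the 0.01 power:

```python
# Rescale a 0 -> 1 (full-range) odds ratio to a per-percentage-point one.
import math

or_full_range = 17.47                  # odds ratio for a 0 -> 1 change in 5-game Fenwick Close
beta = math.log(or_full_range)         # the underlying logit coefficient
or_per_point = math.exp(beta * 0.01)   # odds ratio per +1 percentage point
print(round(or_per_point, 4))  # about 1.029: ~2.9% higher odds per point of Fenwick Close
```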
Obviously, the above shouldn’t be viewed as definitive. Between the lack of inter-conference play and other factors, it’s entirely possible that idiosyncratic features of the 2013 season contributed to these results. I plan to update this analysis as game-level data from other NHL seasons become available. Another important limitation: my sample sizes dropped off as the length of the lag in my analysis increased. For example, analyses featuring variables with 10-game lags necessarily excluded the first 10 games of each team’s season.
Still, as anyone who has tried it (Josh Weissbock and I included) can tell you, using statistics to predict single-game outcomes is very, very challenging. It’s likely that stronger possession play leads to a marginal increase in win probability, and over the course of a long season, this increase translates into additional wins and points. The same, of course, can be said of goaltending and team shooting, but these are less reliable from game to game than controlling the puck. But when it comes to single games, or small numbers of games (i.e., playoff series), no variable, even Fenwick Close, is as predictive as you might expect.