This is nothing new. In the last few years, many sports have leaned increasingly on research in statistics and operations research to help with decision making. One of the leaders in this field is Wayne Winston, who maintains the Mathletics blog and is the author of the book Mathletics: How Gamblers, Managers, and Sports Enthusiasts Use Mathematics in Baseball, Basketball, and Football.
The first chapter of the book describes baseball’s “Pythagorean Theorem”. The Baseball Pythagorean Theorem was originally created by the father of sabermetrics, Bill James. In a baseball game you need to score runs to win and, conversely, prevent the opposing team from scoring runs. Bill James observed that the proportion of wins by a team is well approximated by the formula,
$$\text{proportion of wins} \approx \frac{(\text{runs scored})^2}{(\text{runs scored})^2 + (\text{runs allowed})^2}$$
The formula has some good properties:
- Predicted proportion is always between 0 and 1.
- Increasing the number of runs scored increases the predicted proportion of wins.
- Decreasing the number of runs allowed also increases the predicted proportion of wins.
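These properties are easy to check numerically. The snippet below is a minimal sketch, assuming the usual form of the formula (runs scored squared, divided by the sum of runs scored squared and runs allowed squared). The function name `pythagorean_win_pct` and the run totals are made up for illustration and do not come from the book.

```python
def pythagorean_win_pct(scored: float, allowed: float, exponent: float = 2.0) -> float:
    """Predicted proportion of wins from runs scored and runs allowed."""
    if scored < 0 or allowed < 0:
        raise ValueError("run totals must be non-negative")
    if scored == 0 and allowed == 0:
        raise ValueError("at least one run total must be positive")
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


# Hypothetical team: 800 runs scored, 700 runs allowed (illustrative numbers only).
base = pythagorean_win_pct(800, 700)
print(round(base, 3))  # 0.566

# Scoring more runs raises the prediction; so does allowing fewer.
print(pythagorean_win_pct(850, 700) > base)  # True
print(pythagorean_win_pct(800, 650) > base)  # True
```

A team that scores exactly as many runs as it allows is predicted to win half its games, whatever the exponent.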
The exponent, 2, is found by fitting the data to the formula (more on this in later posts). Winston calculated that the optimum exponent for his data set was 1.9, very close to the 2 found by Bill James.
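To give a rough idea of what "fitting" means here, the sketch below tries a grid of candidate exponents and keeps the one with the smallest mean absolute error between predicted and actual win proportions. This is only one reasonable criterion; the text above does not say which error measure Winston used. The data is synthetic: the win proportions are generated from the formula itself with an exponent of 1.9, so the search should simply recover that value. All function names and run totals are illustrative assumptions.

```python
def predicted(scored, allowed, exponent):
    """Pythagorean prediction for a single team."""
    s = scored ** exponent
    return s / (s + allowed ** exponent)


def mean_abs_error(teams, exponent):
    """Average gap between predicted and actual win proportion."""
    return sum(
        abs(predicted(rs, ra, exponent) - win_pct) for rs, ra, win_pct in teams
    ) / len(teams)


def fit_exponent(teams, low=1.0, high=3.0, step=0.01):
    """Grid search for the exponent with the smallest mean absolute error."""
    n_steps = int(round((high - low) / step))
    candidates = [low + i * step for i in range(n_steps + 1)]
    return min(candidates, key=lambda k: mean_abs_error(teams, k))


# SYNTHETIC data, not real results: win proportions are generated from
# the formula with a known exponent so the fit can be sanity-checked.
TRUE_EXPONENT = 1.9
totals = [(800, 700), (650, 720), (710, 705), (600, 800), (900, 650)]
teams = [(rs, ra, predicted(rs, ra, TRUE_EXPONENT)) for rs, ra in totals]

best = fit_exponent(teams)
print(round(best, 2))  # 1.9
```

With real season data the error would not drop to zero, and a proper optimizer could replace the grid search, but the idea is the same.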
As it stands, the formula does not work well with NFL or NBA results. Winston refitted the formula and found an optimum exponent of 2.7 for the 2005–2007 NFL seasons and 15.4 for the 2004–2007 NBA seasons.
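One way to read these larger exponents is to rewrite the formula in terms of the scoring ratio. The symbols here are my own shorthand: $S$ for points scored, $A$ for points allowed, $k$ for the exponent, and $R = S/A$. Dividing the numerator and denominator by $A^k$ gives

$$\frac{S^k}{S^k + A^k} = \frac{R^k}{R^k + 1}, \qquad R = \frac{S}{A}$$

For a fixed ratio $R > 1$, a larger $k$ pushes the prediction closer to 1. So a bigger fitted exponent means that the same scoring ratio translates into a larger predicted share of wins.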
As a Canadian, though, I am obligated to see how well this formula applies to hockey and what the optimal exponent would be. So, over a major series of posts, expect me to:
- Write a “spider” to extract hockey statistics from online hockey stat databases.
- Use R to find the optimal exponent.
- Analyze how the exponent changes across different periods and different hockey leagues.