03-19-2020, 06:13 PM
(This post was last modified: 03-19-2020, 06:18 PM by iStegosauruz.)
[div align=\\\"center\\\"]Background and Methodology –
It’s a question as old as time: “How good are/were we anyway?”[/div]
In any form of competition, the primary goal is winning. It’s the outcome we train for, plan for, and hope for. How we judge seasons is entirely about winning. In 2018, the Alabama Crimson Tide were on a historic pace in the college football regular season. In the middle of November, they could claim the second-highest scoring margin since 1998 – 37.2 points per game – which was only 4.6 points per game behind Florida State’s historic 2013 season. That season, however, Florida State played the 62nd strongest schedule in the country. In 2018, Alabama played the 44th.
We don’t judge the 2018 Crimson Tide as the best college football team in history though. We don’t even include them in the conversation after they suffered a historic 44 to 16 upset against the Clemson Tigers.
We judge teams by wins and losses, but for those teams that aren’t championship contenders every year we’re always left asking just how good a team really was. There are a lot of factors we can look at – from yards per game to scoring margin – but none of that captures how good a team truly is. With this in mind, statisticians have turned to advanced metrics to try to judge how good a team was. Bill James famously created a metric known as "Pythagorean Wins" and applied it to baseball. This formula was later revised by now Houston Rockets General Manager Daryl Morey to be applicable to other sports, namely football.
Pythagorean Wins is a fairly simple metric that uses points for (PF) and points against (PA) to estimate how many games a team was expected to win in a given season. The equation looks like this:
[div align=\\\"center\\\"]PYTH = ((PF^X) / ((PF^X + PA^X))) *13
Where X is calculated by:
X = 1.5 * LOG((PF + PA) / 13)[/div]
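For anyone who wants to reproduce the calculation, here is a minimal Python sketch of the formula above. It assumes LOG means the base-10 logarithm (as in a spreadsheet's LOG function) and that 13 is the number of regular-season games. The function name and the sample point totals are made up for illustration.

```python
import math

GAMES = 13  # assumed: the 13 in the formula is the regular-season length


def pythagorean_wins(pf: float, pa: float, games: int = GAMES) -> float:
    """Expected wins from points for (pf) and points against (pa)."""
    # The exponent scales with total scoring per game.
    # Assumption: LOG in the post is log base 10.
    x = 1.5 * math.log10((pf + pa) / games)
    return pf**x / (pf**x + pa**x) * games


# Hypothetical team: 350 points scored, 300 allowed over 13 games.
print(round(pythagorean_wins(350, 300), 3))
```

A team that scores exactly as many points as it allows comes out at 6.5 expected wins, half of the 13-game season, which is a quick sanity check on the formula.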
Baseball has also attempted to improve on Pythagorean Wins by formulating a new "Linear Formula" to predict a team’s winning percentage. This formula uses a linear regression to assess the expected winning percentage of a team based off of their margins of victory. It is very similar to Pythagorean Wins. When they were tested side by side, they were both found to be almost equivalently accurate methods of predicting a team’s winning percentage or expected number of wins.
There is some pretty complex math behind the “Linear Formula” as it pertains to baseball and its transition to applying to football. You can check that out if you’re interested, but for the sake of brevity and to be as clear as possible the equation that I used looks like this:
[div align=\\\"center\\\"]EXP(W%) = 0.001538 * (PF – PA) + 0.50[/div]
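The same equation as a short Python sketch, in case it helps to see it applied. The coefficient and intercept come straight from the equation above; the function name and the sample point totals are hypothetical.

```python
SLOPE = 0.001538   # leading coefficient from the equation above
INTERCEPT = 0.50   # a team with a zero point margin is expected to win half its games


def linear_win_pct(pf: float, pa: float) -> float:
    """Expected winning percentage (as a fraction) from the point margin."""
    return SLOPE * (pf - pa) + INTERCEPT


# Hypothetical team: 350 points scored, 300 allowed.
print(round(linear_win_pct(350, 300), 4))
```

Because this is a straight line, an extreme enough point margin would push the result below 0 or above 1. The sketch does not clamp the value, since the equation in the post doesn't either.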
Two notes:
1. This formula was derived from NFL statistics from 2002 until 2012. For this reason, there is probably some deviation between how well it fits the NFL and how well it fits the sim engine.
2. Although it is derived from NFL statistics and not sim engine statistics, you can still fairly accurately determine how much a team needs to improve its point margin to gain an extra 1% of expected winning percentage. Take 0.01 – equivalent to 1% – and divide it by the leading coefficient in the equation – 0.001538 – and you’ll find that a team needs an increase of 6.5 points in its point margin to gain an extra 1% of expected winning percentage.
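The arithmetic in the second note can be checked in a couple of lines of Python. The helper name is made up; the only number taken from the post is the 0.001538 coefficient.

```python
SLOPE = 0.001538  # leading coefficient of the linear formula


def margin_for_pct_gain(pct_gain: float, slope: float = SLOPE) -> float:
    """Points of extra season point margin needed for a given gain in expected win %."""
    return pct_gain / slope


print(round(margin_for_pct_gain(0.01), 1))  # 6.5
```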
[div align=\\\"center\\\"]I’m lost and why do I care? [/div]
I took these two formulas and applied them to the NSFL to track how many wins a team was expected to get in a season and then compared that number to the wins they actually achieved. Then I applied the same logic to the current season to try to project how the league will look at the end of the regular season. These numbers will hopefully help you determine how good a team really is and give you a picture of who underachieved and who overachieved during a given season.
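The comparison itself is just actual wins minus expected wins. A rough sketch, using two pairs of figures quoted further down in this post (the Season 16 Butchers and the Season 20 Sabercats); the function names are hypothetical.

```python
def wins_over_expected(actual_wins: float, expected_wins: float) -> float:
    """Positive means the team overachieved, negative means it underachieved."""
    return actual_wins - expected_wins


def label(actual_wins: float, expected_wins: float) -> str:
    diff = wins_over_expected(actual_wins, expected_wins)
    if diff > 0:
        return "overachieved"
    if diff < 0:
        return "underachieved"
    return "matched expectation"


# (team, actual wins, Pythagorean expected wins) as quoted in the post
examples = [
    ("S16 Butchers", 10, 7.897),
    ("S20 Sabercats", 5, 3.205),
]

for team, actual, expected in examples:
    diff = wins_over_expected(actual, expected)
    print(f"{team}: {label(actual, expected)} by {abs(diff):.3f} wins")
```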
[div align=\\\"center\\\"]Number Crunching - [/div]
[div align=\\\"center\\\"]Season 20
[/div]
In Season 20 the Yellowknife Wraiths were alone at the top of the table with the most wins. Based off Pythagorean Wins, however, they weren’t the best team or even the second-best team in the league. Pythagorean Wins predicted that, on average, the eventual Ultimus Champion Austin Copperheads were the best team in the league with an estimated 8.816 expected wins.
The “Linear Formula” projects very similar results. The Wraiths had a 69% winning percentage on the year; however, the regression predicts that, on average, they would have won around 59% of their games. The Copperheads and Otters are pegged as underachievers under this model as well, with the regression predicting that, on average, they would have won around 64% of their games.
Both models show that the Sabercats were the worst team in the league last year based on the number of games they were expected to win, on average, during the season. They ended the year with 5 wins while the Pythagorean model pegged their expected win total at 3.205. Their actual winning percentage on the year was 38% while the “Linear Formula” placed their expected winning percentage at around 27%.
[div align=\\\"center\\\"]Season 19
[/div]
Season 19 saw the Orange County Otters, the eventual Ultimus Champions, lead the league with 9 wins. It is not surprising that the Pythagorean model placed them as the best team in the league, expecting them to win 9.4 games on average. The Austin Copperheads were also big risers under this model. They won 5 games during the season but the model expects that on average they would win 6.802 games. The San Jose Sabercats had the biggest discrepancy between their actual win total and expected win total under the model. They won 4 games on the year but were only expected to win 1.649 games.
The “Linear Formula” once again parallels the Pythagorean model very closely. The Otters led the league in expected winning percentage under the formula with 68%. The Wraiths, who were tied with them on the year in real wins at 9, are pegged as overachievers. The formula only projects that they would win 58% of their games during the season on average. The Sabercats once again drop very heavily – from the 31% of games they won in the real season to the 16% of games they were projected to win on average.
[div align=\\\"center\\\"]Season 18
[/div]
Season 18 is another season in which the Otters and Wraiths had strong showings. The Wraiths won 11 games on the year and the Pythagorean model projects that as a minor overachievement with their expected wins at around 10.124. The Otters won 10 games on the year, which is a similar minor overachievement with their expected wins at 9.765.
The model projects that the Liberty were the biggest underachievers on the year. They won 4 games during the season but were projected to win around 5.669 games on average. The Sabercats were once again the losers under the model. They won 4 games on the year but the model projects that as a major overachievement. Their projected wins based on the model was 2.444.
The “Linear Formula” pegs the Wraiths and Outlaws as big overachievers. The Wraiths won 85% of their games during the season but were only projected to win 74% – an 11% drop. The Outlaws won 69% of their games on the season but were only projected to win 60% – a 9% drop. The formula also thinks the Liberty were major underachievers. They only won 31% of their games on the season but the formula thinks they would win around 45% of their games on average.
[div align=\\\"center\\\"]Season 17
[/div]
Season 17 was a fairly balanced year with only one team – the Outlaws – winning 9 games. The Pythagorean model actually projects that their 9 wins were an underachievement, thinking that on average they would win 9.470 games. The model also thinks the Otters – who won 7 games – underachieved heavily on the year, projecting that they would win 8.656 games normally. The Yeti are the biggest overachievers on the year. They won 5 games while the model only projected that they would win 3.143.
The “Linear Formula” thinks that the Liberty were major underachievers on the year. They only won 31% of their games while the formula predicted a 42% winning percentage. The Otters are also considered underachievers – winning 54% of their games while being projected for a 63% winning percentage under the formula. The Yeti are once again the biggest losers. They won 38% of their games on the year while the formula only projected them to win 29%.
[div align=\\\"center\\\"]Season 16
[/div]
Season 16 is the first I modeled that saw a team go winless. The Austin Copperheads won no games during the season. The Pythagorean model, however, doesn’t think they were that bad a team, projecting that they should have won 1.462 games on the year. The Butchers and Otters both won 10 games during the season, but the model has vastly different views on them. The Otters project as underachievers with their expected wins being 10.369. The Butchers, however, are major overachievers – one of the biggest overachievers we’ve seen in any season. They were expected to win 7.897 games – 2.10 games fewer than they actually won.
The “Linear Formula” once again parallels the Pythagorean model pretty well. It also thinks that the Copperheads are not a zero-win team, estimating that they should have won 13% of their games on the season. The Butchers are once again pegged as major overachievers. They won 77% of their games on the year while the formula only thinks they would win 58% of their games on average. The formula also similarly doesn’t like the Wraiths who won 77% of their games on the season but are only estimated to win 66% of their games on average.
[div align=\\\"center\\\"]So, what about this year?[/div]
Using these stats to predict a season is a bit different from using them to look back on one. They still function the same way – taking points for and points against and turning them into an expected winning percentage. The difference is that, to make predictions, they assume the scoring balance each team has shown thus far will remain static for the rest of the season.
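One way to sketch that projection for the Pythagorean model is below. This is my reading of "remain static", not necessarily the exact spreadsheet setup: the exponent is computed from points per game played so far, and the resulting win fraction is applied either to the games already played or to the full 13-game season. It assumes LOG is base 10; the function name and the sample totals are hypothetical.

```python
import math

SEASON_GAMES = 13  # assumed regular-season length


def pythagorean_projection(pf: float, pa: float, games_played: int,
                           season_games: int = SEASON_GAMES) -> tuple[float, float]:
    """Return (expected wins so far, projected full-season wins)."""
    # Exponent based on scoring per game played so far (assumption: log base 10).
    x = 1.5 * math.log10((pf + pa) / games_played)
    win_fraction = pf**x / (pf**x + pa**x)
    # "Static" assumption: the same win fraction holds for the rest of the season.
    return win_fraction * games_played, win_fraction * season_games


# Hypothetical team after 5 games: 120 points scored, 100 allowed.
so_far, full_season = pythagorean_projection(120, 100, games_played=5)
print(round(so_far, 3), round(full_season, 3))
```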
With that in mind here is what the model and formula expect respectively:
[div align=\\\"center\\\"]
[/div]
Both the model and the formula think that the Second Line are one of the strongest teams in the league. The model projects that they win 10.062 games on the year while the formula predicts a 62% winning percentage which is about 8 wins. The Otters are far and away the second-best team under the model – projected to win 9.842 games. The formula thinks they’re about equivalent to the Second Line and thinks they will also win 62% of their games.
The Outlaws are expected to be the worst team under both the model and the formula. The model thinks they’ll win 3.802 games this year while the formula thinks they’ll win 41% of their games, which is about 5.3 wins.
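Turning the formula's winning percentage into a win total is a single multiplication by the season length. A small sketch, assuming a 13-game regular season; the function name is made up, and the 62% is the Second Line figure quoted above.

```python
SEASON_GAMES = 13  # assumed regular-season length


def pct_to_wins(win_pct: float, games: int = SEASON_GAMES) -> float:
    """Convert an expected winning percentage (as a fraction) into expected wins."""
    return win_pct * games


print(round(pct_to_wins(0.62), 2))  # 8.06
```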
Both the model and the formula can also be used to weigh who has overachieved thus far on the year and who has underachieved. This can be especially useful for separating the teams that are tied with 2-3 and 1-4 records.
[div align=\\\"center\\\"]
[/div]
There are currently three teams who are 2-3 on the year – the Butchers, the Yeti, and the Wraiths. Of those teams the model thinks that the Butchers have underachieved the most, predicting that they should have won 2.838 games thus far this season. The Wraiths and Yeti are both seen as overachievers in the model, with it predicting that they each should have won 1.726 games and 1.972 games on the season respectively.
The formula also thinks that the Butchers are the best 2-3 team. Their linear win percentage sits at 52% – 12% higher than the 40% of games they have won this season. The Wraiths are projected as the weakest 2-3 team under the formula. It predicts they should have a 45% winning percentage thus far – a 5% increase, compared with the Yeti’s 7% and the Butchers’ 12%.
There are currently three teams who are 1-4 on the year – the Outlaws, the Liberty, and the Sabercats. Of those, the Outlaws are predicted to be the worst team thus far, as they are only estimated to be underachieving their potential by 0.20 wins, compared with the Liberty’s 0.47 and the Sabercats’ 0.67. The formula looks at the situation similarly, giving the Outlaws the lowest difference between linear winning percentage and real winning percentage – 21% – among the 1-4 teams. The Liberty and Sabercats are estimated to have differences of 23% and 26% respectively.
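The tiebreaking idea can be sketched as a simple sort on expected wins. The expected-win figures below are the Pythagorean numbers quoted above for the three 2-3 teams; the variable names are hypothetical.

```python
ACTUAL_WINS = 2  # all three teams are 2-3

# Pythagorean expected wins so far, as quoted in the post
expected_wins = {
    "Butchers": 2.838,
    "Wraiths": 1.726,
    "Yeti": 1.972,
}

# Rank the tied teams from strongest to weakest by expected wins.
ranked = sorted(expected_wins.items(), key=lambda item: item[1], reverse=True)

for team, expected in ranked:
    gap = expected - ACTUAL_WINS  # positive: has underachieved so far
    print(f"{team}: expected {expected:.3f}, gap {gap:+.3f}")
```

A positive gap means the team has played better than its record, so under this model the Butchers sort first among the 2-3 teams, followed by the Yeti and then the Wraiths.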
[div align=\\\"center\\\"]Conclusions – [/div]
1. Every season some teams overachieve and some teams underachieve. This is due to a variety of factors.
2. Pythagorean Wins and the “Linear Formula” are good ways to predict which teams underachieve and which teams overachieve.
3. Pythagorean Wins and the “Linear Formula” can be used to predict how many wins a team will achieve in a given season.
4. Both metrics think the New Orleans Second Line are the best team in the league. The Orange County Otters are a close second.
[div align=\\\"center\\\"]Notes – [/div]
1. As always you can check my work here.
2. I’ll try to provide updates on how the projections change throughout the season. I think that will be an interesting project.