International Simulation Football League
*The Math Behind Containing Opposing Quarterbacks - Printable Version



*The Math Behind Containing Opposing Quarterbacks - iStegosauruz - 03-18-2020

[div align="center"]Background[/div]
Recently I’ve been hammering out pieces that take a statistical look at a variety of subjects. I started by looking at "Average Value" so that I could find a metric by which to judge success in the league. At some point I’ll revisit that and clean it up a bit, but it got me interested in the stats behind the simulation. I then simulated 21,599 games to find the value of a human offensive tackle compared to a bot offensive tackle. There are still areas I want to explore in that comparison, and areas other people have asked me to examine, but hopefully even the first, cursory look I took will change the way teams approach building their offensive lines. My third piece examined a recent trade between Chicago and Philadelphia. At some point I’m going to rerun the experiment I used to determine the value of the players Chicago acquired, but one of the secondary aspects of that piece was my first attempt at creating a value chart for draft picks in the NSFL draft. As with all of my articles thus far, that is something else on the list of things I plan to explore further.

I put out all of those articles in a two-week span. They took a good deal of effort and my eyes are beginning to hurt from staring at spreadsheets every day. My intention was to slow down for a few days and work on something a bit more long term – an ELO system for the NSFL.

If you’re unfamiliar with the concept of ELO that’s okay - you can read the Wikipedia article if you’re super interested. It's more of a means to an end in this post - I still haven't completed my ELO system - but it's helpful to have some background anyway. Simply explained, it is a rating system whose goal is to calculate the relative skill of the players or teams participating in zero-sum games. A zero-sum game is one in which whatever one side gains comes directly at the other side's expense. The best way to explain this is with the game ELO was originally built around: chess. In chess every move you make either helps your chances of winning or hurts them, and anything that helps you hurts your opponent by the same amount.

ELO was applied to chess to rank players. Ideally, players with an equal rating will win an equal number of games against each other across a large enough sample. This rating takes the form of a number, and that number changes depending on the outcome of each game. The player that wins takes a portion of their opponent’s rating, so the loser’s numerical rating decreases while the winner’s increases by a corresponding amount. How large that portion is depends on how surprising the result was: beating a higher-rated opponent moves the ratings more than beating a lower-rated one.
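For anyone who wants to see the mechanics, here is a minimal Python sketch of the standard chess-style rating update. The K-factor of 32 and the 400-point scale are conventional chess values that I am assuming for illustration; the article does not specify them, and FiveThirtyEight's NFL model uses its own settings. The function names are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Expected score (roughly, win probability) for A against B.

    Uses the conventional logistic curve with a 400-point scale (assumed).
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(winner, loser, k=32.0):
    """Return (new_winner, new_loser) after a decisive game.

    k is an assumed K-factor; it caps how many points one game can move.
    The winner gains exactly what the loser gives up.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


# Two equally rated players: the winner takes half of k from the loser.
new_winner, new_loser = update_ratings(1500.0, 1500.0)
print(new_winner, new_loser)  # 1516.0 1484.0
```

Because the gain and the loss are the same number, the total rating in the pool stays constant, which is the zero-sum idea in practice.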

This rating system has now been applied to a variety of sports and competitions, from soccer to esports and even to football. As with many of my previous statistical dives, I try to start from a concept that already exists in football. For example, when I examined the trade between Chicago and Philadelphia, I chose to piggyback off Chase Stuart’s draft value chart to craft my own, more suitable to the NSFL. When I decided to start looking at ELO, I decided to borrow some methodology from FiveThirtyEight.

One of the factors FiveThirtyEight looks at in their ELO rating for NFL teams is a “quarterback adjustment.” They attempt to assign each NFL QB a value based on his individual performance, which then adjusts his team’s ELO up or down depending on who is starting. The example FiveThirtyEight uses is how the Green Bay Packers are a much different team with Aaron Rodgers starting than with his backup. Aaron Rodgers’s “quarterback adjustment” will be higher than his backup’s, meaning the Packers’ ELO will be higher when he’s starting.

The formula for quarterback adjustment that they utilize is described as “a regression between ESPN’s Total QBR yards above replacement and basic box score numbers.” It looks a little something like this:

VALUE = -2.2*Pass Attempts + 3.7*Completions + (Passing Yards/5) + 11.3*Passing TDs - 14.1*Interceptions - 8*Times Sacked - 1.1*Rush Attempts + 0.6*Rushing Yards + 15.9*Rushing TDs
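As a rough illustration, the formula translates directly into a small Python function. The `QBLine` class, its field names, and the example stat line are all made up for this sketch; only the coefficients come from the formula above.

```python
from dataclasses import dataclass


@dataclass
class QBLine:
    """One quarterback's box score for a single game (hypothetical container)."""
    pass_attempts: int
    completions: int
    passing_yards: int
    passing_tds: int
    interceptions: int
    times_sacked: int
    rush_attempts: int
    rushing_yards: int
    rushing_tds: int


def qb_value(line):
    """Apply the VALUE formula quoted above to one box score."""
    return (
        -2.2 * line.pass_attempts
        + 3.7 * line.completions
        + line.passing_yards / 5
        + 11.3 * line.passing_tds
        - 14.1 * line.interceptions
        - 8 * line.times_sacked
        - 1.1 * line.rush_attempts
        + 0.6 * line.rushing_yards
        + 15.9 * line.rushing_tds
    )


# Made-up stat line: 20/30 for 250 yards, 2 TD, 1 INT, 2 sacks, 3 rushes for 10 yards.
example = QBLine(30, 20, 250, 2, 1, 2, 3, 10, 0)
print(round(qb_value(example), 1))  # 53.2
```

Note how much a single interception (-14.1) or sack (-8) moves the number compared to a completion (+3.7).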

They have some methodology behind how they derived this formula available, but what I want to focus on is the concept of VALUE itself. They adjust the metric for the quality of the defense a quarterback is facing by calculating the league average VALUE given up and then subtracting that defense’s own average VALUE given up from it.

Essentially, they have formulated a way to calculate how strong a defense is at containing opposing quarterbacks. They don’t use it as a standalone stat - they feed it back into their ELO ranking - but I got very interested in this metric and decided to calculate it for the NSFL. At some point I’ll continue the process of deriving an ELO system for the NSFL, but for now I want to focus on VALUE.

[div align="center"]What I Found[/div]

The first season for which I calculated league-wide and team average VALUE was Season 20. Each team has a “Total Value” statistic that feeds the data from every game of their season into the formula. I then divided that total by 13, the number of games in a season, to get a per game average VALUE. Next I averaged all the teams’ per game figures to find the league’s average per game VALUE. From there I found the difference between a team’s average and the league average - a metric I labeled in my charts as GAV Difference (game adjusted VALUE difference).

A team with a negative GAV Diff. does a better job of containing opposing quarterbacks than the average team. A team with a positive GAV Diff. does a worse job. The further a team sits toward either end of the scale, the better or worse a job they’re doing.
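The steps above can be sketched in a few lines of Python. This assumes that "Total Value" is the season-long VALUE a team's defense allowed to opposing quarterbacks and that the difference is taken as team average minus league average (which matches negative meaning better). The team names and totals are invented, not real league data.

```python
GAMES_PER_SEASON = 13  # regular season length used in the article


def gav_differences(total_value_allowed, games=GAMES_PER_SEASON):
    """Map each team to its GAV Difference.

    total_value_allowed: dict of team name -> season total VALUE allowed
    (assumed meaning of the "Total Value" column).
    """
    # Step 1: per game average VALUE for each team.
    per_game = {team: total / games for team, total in total_value_allowed.items()}
    # Step 2: league average of those per game figures.
    league_avg = sum(per_game.values()) / len(per_game)
    # Step 3: team minus league, so negative means better than average.
    return {team: avg - league_avg for team, avg in per_game.items()}


# Invented totals for three hypothetical teams.
totals = {"Team A": 520.0, "Team B": 650.0, "Team C": 780.0}
for team, diff in gav_differences(totals).items():
    print(f"{team}: {diff:+.1f}")
```

Here Team A comes out at -10.0 (better than average at containing quarterbacks), Team B at 0, and Team C at +10.0. By construction the differences always sum to zero across the league.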

[div align="center"]This is the chart for Season 20:[/div]

[div align="center"][Image: Bn0iPzV.png][/div]

Once I had everything calculated and started to examine the data, I found that the two teams with the most positive GAV Diff. – the Butchers and the Sabercats – were also the two teams who didn’t make the playoffs. The team with the second most negative GAV Diff. – the Copperheads – was the team that won the Ultimus.

From this point I grew interested in whether this was just an outlier situation or whether GAV Diff. was fairly predictive of who would make the playoffs and who would succeed in them. Logically, if your team is stronger at containing quarterbacks than other teams, you’ll be a better team on the whole, but the real test of this metric would be to see if it held up as a marker of success.

[div align="center"]The next season I calculated was Season 19:[/div]

[div align="center"][Image: gRXpuol.png][/div]

Once again, the Butchers and the Sabercats had the most positive GAV Diff. and were the two teams who didn’t make the playoffs. The team with the lowest GAV Diff. – the Orange County Otters – won the Ultimus.

[div align="center"]Season 18:[/div]

[div align="center"][Image: gY5pSKj.png][/div]

This is where the trend breaks. The two teams who didn’t make the playoffs were the Sabercats and Liberty. The Otters – the eventual Ultimus Champions – had the second most negative GAV Diff., while their opponent in the Ultimus – the Wraiths – had the highest. This season breaks the trend but what it highlighted for me after the two previous seasons was that defense and containing the quarterback were never going to be the only thing that factored into team success. For teams like the Liberty in Season 18 who had a negative GAV Diff. and did a good job in containing opposing quarterbacks it must’ve been their offense who let them down.

[div align="center"]Season 17:[/div]

[div align="center"][Image: 03Y5bK4.png][/div]

In Season 17 the trend began to reestablish itself. The Second Line and Liberty did not make the playoffs this season and they had two of the three most positive GAV Differences. The other team with a highly positive GAV Diff. was the Yeti, who barely scraped into the playoffs with five wins. The trend was not as strong for those with negative GAV Differences – the Otters and Hawks matched up in the Ultimus with neither having the most negative metric.

[div align="center"]Season 16:[/div]

[div align="center"][Image: 03Y5bK4.png][/div]

In Season 16 the Yeti and Copperheads missed the playoffs – both of whom had positive GAV Differences. The Outlaws and the Liberty – both of whom had negative GAV Differences - matched up in the Ultimus.

[div align="center"]Conclusions[/div]

1. The trend of teams with the most positive GAV Differences missing the playoffs and teams with the most negative GAV Differences contending for the championship is stronger in recent seasons. I think this has something to do with Seasons 16 and 17 having a high number of rookies due to recent expansions, and also with changes in the way offenses functioned in the league.
2. No trend is ever going to work one hundred percent of the time. What is apparent is that there is some correlation between the metric and team success, meaning it does a fair job of describing how a team’s defense performs. Along that line of thinking, teams that have highly negative GAV Differences but don’t experience success are probably let down by their offense, while teams that have highly positive GAV Differences but experience some success are boosted by their offense. There is a give and take between the two.
3. Some of this also depends on the matchups a team has. Not every team plays the same schedule.

[div align="center"]So how about this season…?[/div]

All of this left me wondering which teams are being let down or boosted by their offenses, so I charted Season 21.

[div align="center"][Image: hC36HdZ.png][/div]

It is not surprising to me that the three teams that are 4-0 have the most negative GAV Differences. What is interesting are teams like the Sabercats and Yeti who are 1-3 but both have negative GAV Differences. It would appear that they are being let down by their offenses to some extent. This looks like the case for the Sabercats who have the lowest total yards in the league with 1334. The Yeti have been performing in the middle of the pack in terms of total yards so they look like a team who has played a fairly tough schedule and is primed to bounce back as the year progresses.

The same inevitably goes for teams like the Copperheads, Outlaws, and Wraiths. The Copperheads and Wraiths are being boosted by their offenses while the Outlaws - with 0 wins but a less positive GAV Difference - are being let down by theirs.

[div align="center"]Conclusions 2.0[/div]

1. You can use GAV Difference to examine how strong your team is at containing opposing quarterbacks.
2. It is not a foolproof metric, and it is weighted heavily toward how many sacks and interceptions your team accrues.
3. It is not the end-all-be-all of determining playoff success, but it is a good metric to include in figuring out if your team has what it takes.
4. It is a metric that can be used to diagnose where your team is struggling. It can help identify if you have a weak defense, offense, or if you’ve just had bad sim luck or a tough schedule.

[div align="center"]So why did I release this (and notes)[/div]

I want to take a minute to circle back around to the discussion that I started this article with. In the last few weeks I’ve been methodically releasing stat heavy articles that I hoped would have an impact on how teams in the league choose to operate or view situations that they find themselves in. I’m passionate about doing it – I wouldn’t be writing a 2,500-word piece with heavy statistical background every other day if I wasn’t – and I think it showcases my dedication to the league.

Today I had several people message me on Discord after the announcement about league expansion and ask if I had considered throwing my hat in the ring to potentially be a candidate for a GM/Co job with one of the new teams. To be honest it hadn’t really crossed my mind. I think there is plenty that I have to learn about how the sim engine operates, how the league operates, and what makes players and teams successful. I’m relatively new and never wanted to be presumptuous or step on toes – I’m happy to just be giving valuable content to the community.

Then I did some thinking and decided that I have nothing to lose by putting it out there that I’d potentially be interested in an opportunity if it presents itself. It is something I’m now fully exploring and something I’m going to put the same drive behind as I have my articles over the last few weeks. I think a team that is essentially a blank slate is a perfect place for me to try my hand at proving my theories and seeing if you can win with a statistical, value driven approach.

What I lack in pursuing this is experience, so if there is anyone who is experienced in how the league operates and is interested in having a data oriented and motivated partner I’m here to listen. If an opportunity presents itself, I’m here to work.

If it doesn’t or something doesn’t work out then the same message applies to teams who may be interested in drafting me in the upcoming NSFL draft – I’m looking to contribute. I’m motivated and opinionated. I’m going to continue to look for the optimal ways to do things in this league and I couldn’t be more excited for the future.
[div align="center"]
So really – why did I release this?
[/div]

Hopefully this can give teams a good snapshot of where they are in the current season. It adds to my portfolio of previous stats work and hopefully shows my breadth of looking at a lot of the concepts that go into running a successful organization – from how to build a team, to how to build a draft, to how to approach transactions between teams, to how to value players – I’m trying to come up with innovative ways to approach all of those things.

And honestly, I just thought the math was cool and it would be interesting to write about.

As always – go check my work if you like – you can find it here. It took a LOT of time to collect nine stats for ten teams for thirteen games each year. That's 1,170 data points each season. Across five full seasons that's 5,850 data points. Including the four games this season it's 6,210.
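For anyone checking the totals, the count works out as follows. This is just the arithmetic from the paragraph above written as Python; the constant names are mine.

```python
STATS_PER_GAME = 9     # inputs to the VALUE formula
TEAMS = 10
GAMES_PER_SEASON = 13
FULL_SEASONS = 5       # Seasons 16 through 20
GAMES_THIS_SEASON = 4  # Season 21 games played at time of writing

per_season = STATS_PER_GAME * TEAMS * GAMES_PER_SEASON   # 1170
full_seasons = per_season * FULL_SEASONS                 # 5850
this_season = STATS_PER_GAME * TEAMS * GAMES_THIS_SEASON # 360
total = full_seasons + this_season                       # 6210

print(per_season, full_seasons, total)  # 1170 5850 6210
```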

I'm also going to try to keep the local spreadsheet I have of this updated throughout the season so if you're someone who is interested in this and wants to know how your team is doing don't hesitate to DM me on Discord. I'll probably have a follow-up to this post after the season is over to look at how everything shook out.


*The Math Behind Containing Opposing Quarterbacks - ScorpXCracker - 03-18-2020

i am so confused


*The Math Behind Containing Opposing Quarterbacks - iStegosauruz - 03-18-2020

(03-18-2020, 12:08 AM)ScorpXCracker Wrote:i am so confused

How so? If I didn't explain something properly I need to fix it.


*The Math Behind Containing Opposing Quarterbacks - Warner - 03-18-2020

wha


*The Math Behind Containing Opposing Quarterbacks - W.Sconnie - 03-18-2020

Another great article, I'm continually amazed at the sheer volume of data you've been collecting. Finding fact driven ways to measure where teams need to build is invaluable. This has become my favorite statistic driven series on the site.

Scorp sucks or whatever


*The Math Behind Containing Opposing Quarterbacks - iStegosauruz - 03-18-2020

(03-18-2020, 02:38 AM)W.Sconnie Wrote:Another great article, I'm continually amazed at the sheer volume of data you've been collecting. Finding fact driven ways to measure where teams need to build is invaluable. This has become my favorite statistic driven series on the site.

Scorp sucks or whatever

Thanks for the feedback man - always happy when people enjoy the work too. Gonna be taking a breather for a day or two and force myself not to touch the sim engine or the indexes but I've got some pretty cool projects in the work too.


*The Math Behind Containing Opposing Quarterbacks - Mooty99 - 03-18-2020

This is another fascinating article, well done a great read, love it!


*The Math Behind Containing Opposing Quarterbacks - Opera_Phantom - 03-18-2020

This is fantastic, again.

And again, i feel like if you continue with this, I might ask for a diploma.


*The Math Behind Containing Opposing Quarterbacks - Raven - 03-18-2020

Absolutely killing it with these articles.


*The Math Behind Containing Opposing Quarterbacks - Troen - 03-18-2020

(03-17-2020, 10:05 PM)iStegosauruz Wrote:This season breaks the trend but what it highlighted for me after the two previous seasons was that defense and containing the quarterback were never going to be the only thing that factored into team success. For teams like the Liberty in Season 18 who had a negative GAV Diff. and did a good job in containing opposing quarterbacks it must’ve been their offense who let them down.

The natural other suggestion (to me at least) is that the teams could have had a poor run defense - containing the QB doesn't help as much if you're giving up 200 rushing yards/game. Certainly possible it's more on the offense, but...
(I'm assuming the rushing attempts/yards/TDs in the formula are by the QB only)

And actually, looking at Liberty in S18, they were 7th overall in points scored/game, 4th overall in points allowed/game, 7th overall in rushing yards allowed/game, and (for completeness) 6th overall in passing yards allowed/game. They also had 4 1-score losses and finished 5th with 2 fewer wins than the 2nd place team in the conference, which is interesting.
(Edit: So I'd say the numbers don't really support my hypothetical, shrug)