03-27-2024, 01:57 PM
(This post was last modified: 03-28-2024, 09:46 PM by wetwilleh. Edited 1 time in total.)
Why should I care?
- Numbers going up is fun. Are some positions having more fun than others? More or less than they should?
- Team composition. Do some positions perform better owing to some systematic bias? Could your team get better defensive results stacking certain positions?
- Gambling! Need an IDP in fantasy? Props involving defensive stats? Take a gander below.
Most surprising results
- Safeties (resulting in a defensive player stat) occur much more often in the ISFL than in the NFL (5.95:1 per game).
- Defensive backs make 31% of Tackles for Loss in the NFL, 2% in the ISFL.
- Defensive linemen make 14% of Passes Defended in the NFL, 0.00% in the ISFL.
- Linebackers in the ISFL consistently outperform their NFL benchmarks. Performance is distributed more centrally around the mean than in the NFL.
- Defensive Ends and to a lesser extent Strong Safeties in the ISFL underperform their NFL benchmarks.
The data
I collected the last 3 years of player-level defensive stat data for the NFL from https://www.pro-football-reference.com/ (2023, 2022, 2021) and for the ISFL from https://index.sim-football.com/ (S46, S45, S44). Positions needed to be adjusted to align, and sometimes this took judgment. Below is a summary of some of the choices made:
- There was a high degree of overlap between DE and LB for EDGE type players in the NFL accounting. I coded any position with a mention of LB as a LB.
- The 'DL' position as categorized in the NFL accounting is a mixed bag including Aidan Hutchinson and Nick Bosa, but also Quinnen Williams and Dexter Lawrence. I coded these as DE.
- Many safety roles in the NFL accounting are given hybrid hyphenations. I only counted pure ‘SS’ as a SS, other roles were coded as ‘FS.’
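The coding choices above could be sketched roughly like this. This is only an illustration, not the actual script used: the function name, the set of safety labels, and the separators used to split hybrid labels are all assumptions on my part.

```python
import re

# Assumed set of labels that mark a safety role inside a hybrid label.
SAFETY_TOKENS = {"S", "FS", "SS"}

def code_position(nfl_pos: str) -> str:
    """Map an NFL position label onto the shared position codes."""
    pos = nfl_pos.strip().upper()
    # Split hybrid labels such as "FS/SS" or "DE-OLB" into their parts.
    tokens = [t for t in re.split(r"[/\-\s]+", pos) if t]
    if "LB" in pos:        # any mention of LB is coded as LB
        return "LB"
    if pos == "DL":        # generic defensive line is coded as DE
        return "DE"
    if pos == "SS":        # only a pure SS stays SS
        return "SS"
    if any(t in SAFETY_TOKENS for t in tokens):
        return "FS"        # every other safety role is coded as FS
    return pos             # everything else passes through unchanged

print(code_position("DE/OLB"), code_position("DL"), code_position("FS/SS"))
```

The order of the checks matters: the LB rule runs first so that EDGE type hybrids land at LB before any other rule can claim them.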
For the most part, stat definitions aligned as far as I can tell. Pro Football Reference includes full sacks in its TFL number, and it appears the ISFL does not. I recalculated TFL as TFL minus sacks for the NFL data, but that will still leave some inaccuracy between definitions (things like partial sacks, or forcing the QB out of bounds). I think the results are still directionally informative. 'PD' is another stat which can have slightly different definitions. Pro Football Reference reports 'passes defended', which includes incompletions caused by hits concurrent with catch attempts. It is possible the definition used in the ISFL is different, which may affect results. For example, some sites only count batted balls as a pass defended if a receiver was in the area. I do not believe it is possible to conclusively find the definitions used in the sim, but I would be interested to hear if anyone else has explored them.
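For anyone wanting to reproduce the TFL adjustment, here is a minimal sketch. The function name is made up, and flooring at zero is my own assumption for cases where partial sacks push the difference negative; the numbers in the example are hypothetical.

```python
def non_sack_tfl(tfl: float, sacks: float) -> float:
    """NFL tackles for loss with sacks removed.

    Floored at zero (an assumption) so that fractional sack credit
    cannot produce a negative count.
    """
    return max(tfl - sacks, 0.0)

# Hypothetical player season: 15 TFL reported, 10.5 of them sacks.
print(non_sack_tfl(15, 10.5))
```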
Frequency of events
The nature of NFL football involves more substitutions and more total players per game than the ISFL. You can see this effect by comparing the number of games started to the number of games by position. Whether this is situational player personnel packages or more frequent rotation at certain positions, this could affect the results reported.
Because of that, I believe reporting NFL stats on a per game started basis is preferred, while ISFL stats are reported per game, since games is the only such stat available. Each of those metrics is the closest to the theoretical value of games played*2*11 (two teams per game, 11 defenders per team). Here is a table of the results of these defensive events on either a per player game basis or a per team game basis (the number you may more naturally anchor to), with a chart showing the relative proportion, dividing the ISFL rate by the preferred per game started metric in the NFL. A value of 100% would show up on the 100% line (no column height) and mean an equal ratio between NFL and ISFL. The axis is plotted on a log scale, so equal linear distances above and below the line reflect 1/10x and 10x frequency.
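To make the metric concrete, here is a rough sketch of the calculation as I understand it. The function names are hypothetical and the inputs are placeholders, not the real league totals.

```python
import math

def theoretical_player_games(games_played: int) -> int:
    """Defender-games if every game had exactly 2 teams fielding 11 defenders."""
    return games_played * 2 * 11

def per_game_rate(total_events: float, player_games: float) -> float:
    """Events per player game (games started for the NFL, games for the ISFL)."""
    return total_events / player_games

def relative_frequency(isfl_rate: float, nfl_rate: float) -> float:
    """ISFL rate as a multiple of the NFL rate; 1.0 means parity (the 100% line)."""
    return isfl_rate / nfl_rate

def log_distance(ratio: float) -> float:
    """Column height on a log10 axis: 0 at parity, +1 at 10x, -1 at 1/10x."""
    return math.log10(ratio)

print(theoretical_player_games(100))           # placeholder game count
print(log_distance(10.0), log_distance(0.1))   # symmetric around zero
```

The log axis is what makes a stat that is 10x as common and one that is 1/10x as common sit the same distance from the parity line.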
As you can see, tackles and TFL occur relatively less frequently in the ISFL than in the NFL, while sacks, passes defended and forced fumbles occur more frequently. Interceptions and fumble recoveries (by defensive players) are roughly in line between the two leagues. Far and away the largest aberration is the much higher frequency of safeties in the ISFL (5.95 safeties in the ISFL for every safety in the NFL). I would guess that a large part of that is the failure of many NFL safeties to result in a safety stat awarded to a defensive player, while the ISFL sim engine, when it determines a safety occurs, is more likely to tie that to an action from a defending player. Team level differences are not quite as large. All 45 ISFL safeties in the last 3 seasons generated a stat for a defensive player, while only 17 of 38 NFL safeties did. In other words, at the team level the gap is roughly half as large: 2.876 ISFL safeties for every 1 NFL safety.
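The attribution gap behind the safety numbers is easy to check from the counts quoted above. The function name is just for illustration.

```python
def attribution_rate(credited: int, total: int) -> float:
    """Share of team safeties that produced a stat for a defensive player."""
    return credited / total

isfl = attribution_rate(45, 45)   # all 45 ISFL safeties were credited
nfl = attribution_rate(17, 38)    # 17 of 38 NFL safeties were credited

print(f"ISFL: {isfl:.1%}, NFL: {nfl:.1%}")   # ISFL: 100.0%, NFL: 44.7%
```

So fewer than half of NFL safeties ever show up in the player-level data, which accounts for a good part of the gap between the player-level and team-level ratios.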
Distribution of events by position
First let's take a look to see how closely the total games played by position match between leagues. It wouldn't be a huge issue if these were out of line, but it would affect our ability to look at other events if one league showed a much higher prevalence of one position and that position were accumulating stats simply from having more bodies on the field.
We see that generally the two leagues field a similar distribution of players. When we look at our highest volume counting stat, tackles, we again see pretty close alignment, though a slight increase in tackles made by LB and a slight decrease in tackles made by DE in the ISFL relative to the NFL can be observed. This is partially inflated by the player count at these positions, but I think the discrepancy is enough to be worth noting. We can also compare production within leagues, and that's perhaps where the discrepancy in tackling distribution would be felt more: for every tackle a DE makes in the NFL, a LB makes 2.58, while for every tackle a DE makes in the ISFL, a LB makes 5.08.
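For reference, the two views used in this section (each position's share of a league's total, and the LB to DE ratio within a league) can be sketched as below. The totals are made-up round numbers for illustration only, not the real league data, and the function names are hypothetical.

```python
def position_shares(totals: dict[str, float]) -> dict[str, float]:
    """Each position's share of the league-wide total for one stat."""
    grand_total = sum(totals.values())
    return {pos: n / grand_total for pos, n in totals.items()}

def lb_per_de(totals: dict[str, float]) -> float:
    """How many of the stat a LB records for every one a DE records."""
    return totals["LB"] / totals["DE"]

# Hypothetical tackle totals by position for a single league.
tackles = {"DE": 100, "DT": 150, "LB": 500, "CB": 150, "FS": 50, "SS": 50}

shares = position_shares(tackles)
print(f"LB share: {shares['LB']:.0%}, LB per DE: {lb_per_de(tackles):.2f}")
```

Note that these are raw totals, so both numbers still carry the effect of how many players each position fields.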
Now let's look at taking people to ground behind the line of scrimmage. LBs reign supreme, while secondary positions see mixed results, performing relatively poorly on TFL but better in racking up sacks when compared to their NFL benchmark. In the NFL, DE and LB sack the QB in roughly equal proportion, while in the ISFL DE get fewer sacks than even DT, and LBs rack up the lion's share of this sexy stat.
These are the first two glaring discrepancies that seem like they shouldn't be so extreme: nearly 0 TFL from the secondary, and a healthy portion of DE sacks being ripped away. Anecdotally, from the games I've watched, part of the TFL discrepancy could be the lower frequency of passes thrown behind the line of scrimmage (screens or quasi-run/shovels). Similarly, I've seen less press coverage in the ISFL than is typical in the NFL. It is interesting that the sack numbers are still strong for the secondary. Maybe this is mostly an artifact of the sim: once it decides "RUN", or maybe specifically "Run for loss", it decides defensive backs will not contribute to the play.
The defensive end sack shortfall is harder for me to explain. Part of this phenomenon could be a vicious cycle, where a slight advantage for LB relative to DE leads to people investing more heavily in LB, or being more likely to attrite as a mediocre performing defensive end. More seasoned members of the league may have other thoughts and opinions on whether this discrepancy is even worth worrying about. On one hand, I think it is nice that DE and DT perform relatively close to one another. But on the other, it's another metric where LB are far and away racking up the most stats.
How about defensive actions with the ball in the air? For the most part things are in line between leagues, with the notable exception of defensive linemen in the ISFL: defensive ends and tackles there have 0 interceptions and 0 passes defended.
For interceptions this is basically a rounding error (though I would argue few events in football are more fun than watching a defensive lineman return an interception for a TD or otherwise), but passes defended (or defensed, or deflected, or broken up, or whatever the outfit chooses to report) is a non-trivial way in which linemen contribute toward pass defense. Maybe this is a purposeful choice and defensive linemen blocking passes was toxic for some reason, or maybe this is simply a matter of the way stats are recorded in the ISFL, but I thought this was noteworthy as an event which is generating actual 0 stats in a place where we would expect at least some activity. If there is some hard coded sim behavior disallowing the D line from generating pass stats, why wouldn't you always play a LB instead of a lineman when possible, especially given the way they rack up other stats?
The lower interceptions for SS maybe don't jump off the page as strongly, but I think they also deserve discussion. INTs are one of the highest impact defensive events, so a rough halving of the share being brought back by a position is material to its contribution. In the NFL, for every INT a SS has, a FS has 2.2, while in the ISFL, for every INT a SS brings back, a FS nabs 5.4. Considering their performance on other stat generating metrics, it's hard to say why you'd ever bring a SS instead of a FS.
Forced fumbles, fumble recoveries, and safeties scored are shown below mostly for the sake of completeness. We are dealing with rarer events (though high impact ones) here, so it is hard to say anything definitively. The general trend of DE taking a haircut to the benefit of LB persists.
Distribution across players
We've looked at how events accumulate across positions, but we might also be interested in how they are distributed across players. The histograms below use a qualifying definition of 5 games started. They show the frequency on a player level of different tackles per game rates achieved over the course of a single qualifying season.
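A minimal sketch of how such a histogram could be built is below. It assumes each player season is a (games, tackles) pair, where games means games started for the NFL and games for the ISFL. The function name, the bin width and the sample data are all hypothetical.

```python
import math
from collections import Counter

def tackle_rate_histogram(seasons, min_games=5, bin_width=1.0):
    """Count qualifying player seasons by tackles-per-game bin.

    seasons: iterable of (games, tackles) pairs, one per player season.
    Bins are left-closed, so a rate of exactly 1.0 falls in the 1.0 bin.
    """
    counts = Counter()
    for games, tackles in seasons:
        if games < min_games:      # apply the qualifying filter
            continue
        rate = tackles / games
        bin_start = math.floor(rate / bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))

# Made-up player seasons; the second one fails the 5 game filter.
sample = [(16, 80), (4, 40), (10, 5), (17, 102), (5, 27)]
print(tackle_rate_histogram(sample))   # {0.0: 1, 5.0: 2, 6.0: 1}
```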
I chose to break out the results by LB and non-LB defenders, as I think this helps illuminate one interesting trend. In the ISFL compared to the NFL, under this qualifying definition, accumulating <=1 tackle per game is much more common, for non-LB at least. The shape of the histogram for LB in the ISFL is pretty surprising, showing a heavy weight in the middle compared to the more uniform shape of the NFL. This could be due to some Pareto/power law type behavior that can happen in the NFL (whether it be from larger talent gaps, role in the defense, or simple ability to consistently stay on the field), where performances would be more likely to fall in one of the tails. A purer, more equitable probabilistic approach like the sim might use would be more likely to result in this sort of distribution, with more results around the mean.
But honestly, I think this is the analysis that will be most highly impacted by the differences in substitution behavior. Different qualifying filters could be used, such as one based directly on a minimum stat/game, but that's part of what I wanted to examine, so I tried to use another metric. Investigating the very low tackle per game positions further, we see that 26 out of the 38 ISFL players notching seasons with 1 or fewer tackles/game were defensive ends. Comparing the distribution across players at the DT and DE positions, we see DT behavior is more similar to LB (in shape, but at roughly half the volume), while defensive ends have a large number grouped <=1. Looking at these on a case by case basis, we see these players generating other stats at a reasonable clip, so it's not that they are not on the field, just that tackles specifically are not flowing to that position group. My best guess (validated by some anecdotal research) is that these are mainly in 3 man fronts, and you can see in this situation a large number of the stats (tackles and sacks in particular) flowing mainly to the linebackers.
This phenomenon of having a distribution thicker around the middle for the ISFL has some nice things about it. First, it's a pretty natural result in a situation where substitutes and injuries are less common. I don't think anyone is clamoring to simulate injuries or needing to field larger rosters with subs that will play only a fraction of the snaps. But this lower variance of results does mean that anytime we see some statistical advantage, it is more likely to hold. One way to think about this: the fail case on LB tackles/game is higher. In the ISFL, 70% of LBs will achieve 5 or more tackles per game in a season, while only 15% of non-LB will match that. In the NFL, 48% of LB would expect a 5+ tackle/game season, while 19% of non-LB would. At least at the season level, we are less likely to see an outlier defying the odds in the ISFL relative to the NFL, and those sorts of outlier events can be fun.
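The 5+ tackle/game percentages are just the share of qualifying player seasons at or above a threshold. A rough sketch, with a hypothetical function name and made-up rates rather than the real player data:

```python
def share_at_or_above(rates, threshold=5.0):
    """Fraction of qualifying seasons with a per-game rate >= threshold."""
    rates = list(rates)
    if not rates:
        return 0.0
    return sum(r >= threshold for r in rates) / len(rates)

# Ten made-up LB seasons, expressed as tackles per game.
lb_rates = [5.0, 6.2, 4.9, 7.1, 3.0, 5.5, 8.0, 2.0, 5.1, 6.0]
print(f"{share_at_or_above(lb_rates):.0%}")   # 70%
```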
Conclusions, limitations and further areas for investigation
So we return to the question of how these stats should be distributed. Simulation is in the name, but at the end of the day it is a game, and it is fun to see entertaining outcomes. I think that certain events happening more often than in the NFL is a good thing, within reason. With regard to how these stats should be distributed across positions, the dominance of LB may be the outcome of meta choices more than limitations of the system. I think some of the numbers I have seen are extreme enough to discuss, but they also are not inherently bad for the game. If people do want to throw some juice toward a position, it looks like DE or SS are the positions that deserve some love, but neither is in an entirely dire situation.
As far as I have been able to piece together in my limited time here, it feels like some of these findings have already been internalized within the community, with preference shown to 3 man fronts, DT over DE, and FS over SS. As far as that is true, it could exacerbate these results. While we didn't see any significant deviation in the % of games by position, it could be that people seeing higher stats are more motivated to continue pushing their TPE up. I'd like to add that component to future analysis to see how TPE is spread across these positions and its influence on stats earned. Likewise, it is important to consider the schemes employed and the actual plays run. The TFL discrepancy for the secondary is the largest aberration from the NFL benchmark, but if that is because screen passes do not work well in the sim, or because press coverage gets toasted by speedy receivers, then it seems like a reasonable response. If you play something like Madden, it is not like people make play calls in line with NFL expectations, so realistically, in many ways these stats are an affirmation of the strong quality of the simulation.