In this exercise, I attempted to visualize and put a number on the actual impact of the offensive line on a quarterback's ability to produce through the air. I collected the data for every DSFL quarterback in S23 up to this point and regressed it on the relevant statistics of each quarterback's offensive line. For quarterbacks, the statistics I looked at were total completions, total attempts, total yards, completion percentage, longest throw, total touchdowns, total interceptions, and QB rating. For the offensive line, they were overall, strength, agility, arm, intelligence, speed, pass blocking, run blocking, and endurance.
Because different teams carry different numbers of offensive linemen, I had to find a way to standardize the offensive lines. I did so by collecting only the attributes of each team's top two offensive guards, top two offensive tackles, and top center, largely because this is the most common setup for an offensive line. However, this choice introduces some selection bias, so the data may not be fully representative.
Another stipulation I enforced was that a quarterback's data only counted if he had thrown more than 150 attempts so far this season. Only one team, Norfolk, did not have a clear starter over that mark. Instead, they had two quarterbacks splitting the work roughly 50/50, one with 130 attempts and the other with 120. For my calculations, their statistics were combined to create one “starting quarterback”.
One major part of this experiment that deserves an asterisk is the fact that I had to work with limited data. None of the test statistics came out greater than 1.96, so none of the coefficients are statistically significant at the usual 5% level. Strictly speaking, this is not a statistically significant study, and it is more like some scribbles on the back of a napkin. However, I don’t think it should be regarded as completely useless, as it still gives some insight into how the offensive line's attributes translate into offensive statistics.
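The two data-preparation rules described above (keep the top two guards, top two tackles, and top center; pool the quarterbacks when nobody clears 150 attempts) can be sketched roughly as follows. This is a minimal Python sketch rather than the Stata workflow actually used, and the record layouts, position codes, and function names are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical slot counts matching the setup described in the text.
LINE_SLOTS = {"G": 2, "T": 2, "C": 1}


def standardize_line(linemen):
    """Return {team: [overall, ...]} for the top 2 G, top 2 T, and top C.

    `linemen` is an iterable of (team, position, overall) tuples; this
    layout is an assumption, not the author's actual dataset format.
    """
    by_team_pos = defaultdict(list)
    for team, pos, overall in linemen:
        if pos in LINE_SLOTS:
            by_team_pos[(team, pos)].append(overall)

    lines = defaultdict(list)
    for (team, pos), overalls in by_team_pos.items():
        # Keep only the best players at each position, up to the slot count.
        lines[team].extend(sorted(overalls, reverse=True)[: LINE_SLOTS[pos]])
    return dict(lines)


def _with_completion_pct(stats):
    """Copy the stats and recompute completion percentage from the totals."""
    out = dict(stats)
    out["completion_pct"] = 100.0 * out["completions"] / out["attempts"]
    return out


def starting_qb(qbs, min_attempts=150):
    """Pick the starter, or pool everyone if no QB has > min_attempts.

    `qbs` is a list of dicts with 'attempts', 'completions', and 'yards'.
    """
    qualified = [q for q in qbs if q["attempts"] > min_attempts]
    if qualified:
        return _with_completion_pct(max(qualified, key=lambda q: q["attempts"]))
    # No clear starter: sum the counting stats into one combined passer.
    pooled = {k: sum(q[k] for q in qbs) for k in ("attempts", "completions", "yards")}
    return _with_completion_pct(pooled)
```

One design point worth noting: when two passers are combined, rate statistics such as completion percentage should be recomputed from the pooled totals rather than averaged, which is why the sketch derives the percentage after summing.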
The first statistic I looked at was offensive line overall. My main hypothesis was that the vast majority of the relevant quarterback statistics would be positively correlated with it. That is to say, a line with a higher overall would result in the QB posting higher numbers. I then ran regressions in Stata to test this hypothesis. My first step was to regress total completions on offensive line overall. The coefficient of interest came out to be 0.021, meaning that each one-point increase in the offensive line's overall projects to about 0.021 additional completions. This is fairly interesting to me, but because of the small sample size it is something I would want to research further with more samples. I would not consider it especially significant or even surprising, because it makes logical sense that a line with a higher overall would result in the QB throwing more completions. Additionally, taken at face value, a coefficient of 0.021 means the five linemen would have to gain roughly 48 overall points combined, or about 10 apiece, for the quarterback to gain one completion, which means that the magnitude of the impact is relatively small.
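For readers who want to see what a one-variable regression like this computes, here is a minimal Python sketch of ordinary least squares with a single predictor, returning the slope (the "coefficient of interest"), the intercept, and the slope's t-statistic. It is not the Stata code that was run, and the function name and inputs are assumptions for illustration only; no real league data is included.

```python
import math


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, t_stat), where t_stat is the slope divided
    by its standard error. Needs at least three points and some variation
    in x.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need matching x and y with at least 3 observations")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else math.inf
    return slope, intercept, t_stat
```

Here `x` would hold each team's offensive line overall and `y` the matching quarterback's total completions. The 1.96 cutoff is the large-sample value; with only a handful of teams, the two-sided 5% critical value from the t distribution is larger than 1.96, so a small-sample result needs an even bigger t-statistic to count as significant.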
The second statistic I regressed on offensive line overall was total yards. This was the main focus of my regression, because I was honestly most curious about the impact of the offensive line on QB yards. The regression once again pointed in the direction of my hypothesis: there was a positive relationship between the overall of an offensive line and the total yards of a quarterback. The coefficient of interest came out to be 2.67, meaning that each additional overall point on the offensive line was associated with 2.67 additional passing yards for the quarterback. Personally, this stood out to me, even though it is not statistically significant. Given that offensive linemen are able to improve their overalls pretty quickly at the beginning, the idea of diminishing marginal returns becomes somewhat relevant: each additional point buys the same modest bump in yards, but presumably costs more to acquire at the top end. Specifically, for teams looking to invest in a lineman, it may make more sense to pay less for a mediocre lineman than to break the bank for a great one.
The final stat that appears to be positively correlated with overall is touchdowns. Surprisingly, an increase in the overall of the offensive line seems to have no appreciable impact on the number of interceptions thrown, the longest pass thrown, or the passer rating. This is surprising at first but somewhat logical, as these statistics are likely the least influenced by a good or bad offensive line and are more influenced by the pure skill of the quarterback. For touchdowns, the coefficient of interest came out to be around 0.018, meaning that the overall of the offensive line would have to increase by roughly 55 points in order to increase the number of TDs thrown by 1. The correlation is positive, but the estimated effect honestly seems a lot smaller than I would expect the true impact to be. If I were able to redo this entire experiment with a lot more information and statistics from past seasons, this would be one of the major points I would focus on.
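The "how many overall points for one more touchdown" figure is just the reciprocal of the regression coefficient. A minimal Python sketch of that arithmetic, using the touchdown and yardage coefficients quoted in the text (the function name is made up for illustration):

```python
def points_per_unit(coefficient):
    """Overall points needed for a one-unit gain in the QB statistic.

    Assumes a linear fit, so the answer is simply 1 / coefficient.
    """
    if coefficient <= 0:
        raise ValueError("coefficient must be positive for this reading")
    return 1.0 / coefficient


# Coefficients as quoted in the text; the touchdown one is approximate.
touchdown_coef = 0.018
yards_coef = 2.67

points_for_one_td = points_per_unit(touchdown_coef)   # about 55.6 points
points_for_one_yard = points_per_unit(yards_coef)     # about 0.37 points
```

Because the touchdown coefficient is only quoted as "around 0.018", the result should be read as a ballpark of 50 to 60 points rather than an exact number.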
This is just the first of a larger number of regressions that I plan on doing, but I feel it still gives a decent amount of insight into how impactful a decent offensive line actually is on the performance of the quarterback and, by extension, the offense. In future iterations of this experiment, I hope to break overall down into more specific attributes and see, for example, how strength specifically impacts the quarterback's numbers. This could also be expanded to include things such as running back numbers.