Motivation – My player is Mario Messi (S26). He played for the Pythons for two seasons and the Hawks for 1.5 seasons, and he just moved to the Sailfish. Messi was a wide receiver (WR) for both the Hawks and the Pythons, so I have been curious about WR performance for a while. I am writing this article because it feels great to share some knowledge about WRs with everyone and invite some critiques. Well, truth be told, I am also broke, and I need several million dollars to change my player’s position from WR to TE...
Main Questions - A WR’s (or any player’s) performance is driven, roughly speaking, by three groups of factors: (1) training efforts, (2) wise decisions, and (3) sim randomness. Training efforts are measured by TPEs – a player with more TPEs is much more likely to deliver great performance. The obstacle on this front is our laziness – we need to motivate ourselves to work harder to earn TPEs. At the same level of TPEs, wise decisions matter, including the allocation of TPEs, the choice of team, etc. The obstacle to wise decisions is not a lack of hard work but information asymmetry – the sim engine knows the whole system, while individual players do not have enough information to make wise decisions. The last factor is sim randomness, which everyone rightfully criticizes when they lose a game (although people tend to show less gratitude to the sim randomness when they win a game…). Okay then, this structure leads to the main questions of this article:
• To what extent does hard work (TPEs) by itself determine a WR’s performance?
• What is the optimum allocation of TPEs to attributes to maximize a WR’s performance?
• To what extent is a single WR’s performance determined by SIM randomness?
Aha, sometimes I like to babble on and on about statistical principles, modeling caveats, and many other technical things. I will mark them as TECH notes, so feel free to skip them.
Data sources – I downloaded the user attribute and receiving data sets at four time points (S27 Week 2, S27 Week 5, S27 Week 16, and S28 Week 8). Put together, the sample is relatively large (about 120 observations). All of the data come from S27 onward, that is, after the sim league changed its simulation engine, so the insights apply to the new sim environment. Players’ attributes and WRs’ performances live on two different web pages, so I did some coding to match the player names across the two pages. I was quite lazy about matching the names precisely, so the matching process loses some data points. (Well, it is not my fault, because the two data sources use different formats for players’ names. Matching English words is a boring task.)
[TECH Notes] - The ratio between model complexity and data size determines how reliable the model results are. The classical result, from the central limit theorem, is that estimation error shrinks at a root-N rate. The rule of thumb in practice is roughly a 1/10 ratio, meaning that one variable requires about ten observations. The league has around 14*3=42 WRs, so a single snapshot gives only about 40 observations, which limits the model to about four parameters. That is extremely limited, because the list of potential determinants is huge: about ten build attributes, three archetypes, fourteen teams, teams’ formations and strategies, among many other factors. A small data set cannot afford an in-depth analysis of so many factors. The solution is to collect multiple cross-sections from the sim league and stack them into a panel. With about 120 observations, the same rule allows roughly a dozen variables. It is not a perfect solution, but it is slightly better.
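[TECH notes] For anyone who wants to repeat the name-matching step with fewer lost data points, here is a minimal Python sketch. It is not the code I actually ran. The two name formats ("First Last (S26)" on one page, "Last, First" on the other), the function names, and the sample rows are all assumptions made up for illustration.

```python
import re
import unicodedata


def normalize_name(raw):
    """Reduce a player name to a lowercase 'first last' key (hypothetical formats)."""
    # Strip accents so that "José" and "Jose" produce the same key.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Assumed format on one page: "Last, First" -> reorder to "First Last".
    if "," in text:
        last, first = text.split(",", 1)
        text = f"{first} {last}"
    # Drop bracketed tags such as the draft class "(S26)".
    text = re.sub(r"\(.*?\)", " ", text)
    # Keep letters and spaces only, then collapse repeated whitespace.
    text = re.sub(r"[^a-z ]", " ", text.lower())
    return " ".join(text.split())


def match_players(attributes, receiving):
    """Join two {name: row} dicts on the normalized name; report what is lost."""
    attr_by_key = {normalize_name(name): row for name, row in attributes.items()}
    matched, unmatched = {}, []
    for name, stats in receiving.items():
        key = normalize_name(name)
        if key in attr_by_key:
            matched[key] = {**attr_by_key[key], **stats}
        else:
            unmatched.append(name)
    return matched, unmatched


# Made-up rows, only to show the call.
attributes = {"Mario Messi (S26)": {"speed": 95}}
receiving = {"Messi, Mario": {"yards_g": 65.0}, "Nobody, Some": {"yards_g": 10.0}}
matched, unmatched = match_players(attributes, receiving)
```

Returning the unmatched names explicitly makes it easy to see how many observations the join drops, and to fix the stubborn names by hand.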
Descriptive data
Here is the data summary in a nutshell:
Total number of WRs: 42
Total number of observations: 117
Distribution of three Archetypes
The data set covers 42 unique WRs and 117 observations in total. Out of the 117 observations, there are 58 speed receivers, 39 slot receivers, and 20 possession receivers. Speed receiver is the most popular archetype in the WR category. I guess it is because every WR looks forward to the happiness of running past the whole defense and scoring TDs.
Distribution of key variables: The variables can be divided into two categories: inputs and outputs. Four output variables measure the performance of WRs: yards per game (Yards_G), receptions per game (No_G, printed as “receivings” in the outputs below), touchdowns per game (TD_G), and yards per reception (Yards_No). Their averages are around 65 yards per game, 5 receptions per game, 0.35 TDs per game, and 12 yards per catch. The means are close to the medians (50th percentiles), so the distributions are roughly symmetric (bell-shaped, Gaussian/binomial-like).
Description of four evaluation metrics
The input variables are a long list of things: current TPEs, strength, agility, intelligence, speed, hands, endurance, competitiveness, deep threat (trait – 0/1 variable), athlete (trait – 0/1 variable), and slot receiving (trait – 0/1 variable). Just reading through the table below – the average values of the variables are: TPE – 803, strength – 64, agility – 81, intelligence – 65, speed – 95, hands – 86, endurance – 77, competitiveness – 63, deep threat (27% purchased), athlete (14% purchased), and slot receiving (29% purchased). These average values do not suggest anything a good WR should or should not do. They just imply that if you have not earned 803 TPEs, you are not an average WR yet (like me) : (
Description of eleven input variables
The distributions of the variables reveal richer details about the sim and its cost structure. For example, the speed attribute has a clear trimodal distribution, peaking at 90, 95, and 100. The three peaks are the maximum speed limits for the three archetypes (speed receiver at 100, slot receiver at 95, and possession receiver at 90). Hence every WR has either already reached the max speed limit or is on the way to it. Hands shows the same pattern: clear peaks at 80, 85, and 90, because the speed receiver’s hands limit is 85 and the slot receiver’s limit is 90. Sometimes the attribute distribution also reveals the cost structure. Many attributes peak at 70 or 80, since the TPE cost of raising an attribute is piecewise linear and jumps at those levels. In fact, the analysis below will show that all these practices are quite reasonable. Crowd-sourced folk wisdom has already converged toward the optimum strategy. A cool thing to note.
Distribution of eleven input variables
I ignored many factors. On the performance evaluation side, many other metrics can be valuable: completion rates, # of drops, fouls, etc. For example, Slate told me that fouls might heavily depend on players’ intelligence, and fouls have a negative impact on the whole team. Many input variables are also ignored, particularly those related to teams. Since teams strike very different balances between passing and rushing plays, an individual WR’s performance must heavily depend on the team strategy. However, I need to ignore the 14 discrete variables (teams) because of modeling concerns. Even if I could draw some conclusions about the impact of team strategies on WR performance, the conclusions would typically not generalize: teams can change strategies from season to season, so it does not make sense to suggest that all WRs go to one team (e.g. Yeti, with the most pass attempts this season).
Okay. Let’s start to answer the three questions.
Q1 - TPE Impacts
The first thing is the effectiveness of TPEs on WR performance. The conclusion is, well, hard work pays off. The figure below shows that TPEs are highly and positively correlated with performance. The upward trends are very clear when TPEs are plotted against all four evaluation metrics (yards per game, receptions per game, TDs per game, and yards per reception). If a player grows from 400 TPEs to 1,400, the player’s performance is expected to increase from 30 yards/G to 100 yards/G, from 4 receptions/G to 7 receptions/G, etc. It is not a surprise that hard work pays off. However, there are two notes.
First, the orange curves fit better than the straight lines. There is a clear saturation effect once a player reaches around 1,000 TPEs. It seems quite reasonable, since about 1,000 TPEs are enough to max out the most useful attributes. The additional TPEs beyond 1,000 seem to have a limited payoff in performance.
Four evaluation metrics with TPEs
Second, to what extent does hard work by itself determine performance? The answer lies in the R square of the fitted functions (printed below). The R squares range from roughly 10% to 35%, depending on the metric. Overall, we can conclude that hard training (TPEs) by itself determines around 10% to 35% of a WR’s performance. Is it high or low? Well…hard to judge, but it suggests that the other 65%~90% of the performance depends on players’ wise decisions (allocating TPEs, choosing the right team, etc.) as well as on SIM randomness.
------
Current TPEs fitting yards per game
Linear reg (linear terms) - R square is: 0.340
Linear reg (quadratic terms) - R square is: 0.360
------
Current TPEs fitting receivings per game
Linear reg (linear terms) - R square is: 0.241
Linear reg (quadratic terms) - R square is: 0.278
------
Current TPEs fitting touchdowns per game
Linear reg (linear terms) - R square is: 0.133
Linear reg (quadratic terms) - R square is: 0.132
------
Current TPEs fitting yards per receiving
Linear reg (linear terms) - R square is: 0.117
Linear reg (quadratic terms) - R square is: 0.121
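[TECH notes] For readers who want to see how a "linear terms" vs. "quadratic terms" comparison like the one above can be computed, here is a minimal pure-Python sketch. It is not my original script. The helper names are made up, and the six (TPE, yards/G) points are invented to mimic a saturating curve between the 400-TPE and 1,400-TPE levels mentioned above; they are not league data.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def poly_r2(x, y, degree):
    """Least-squares polynomial fit of y on x; returns the in-sample R square."""
    feats = [[xi ** k for k in range(degree + 1)] for xi in x]
    p = degree + 1
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(f[i] * f[j] for f in feats) for j in range(p)] for i in range(p)]
    xty = [sum(f[i] * yi for f, yi in zip(feats, y)) for i in range(p)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, f)) for f in feats]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Invented points; TPE is rescaled to thousands to keep the arithmetic stable.
tpe = [0.4, 0.6, 0.8, 1.0, 1.2, 1.4]
yards_g = [30, 55, 72, 85, 93, 100]
r2_linear = poly_r2(tpe, yards_g, degree=1)
r2_quadratic = poly_r2(tpe, yards_g, degree=2)
```

Because the quadratic model contains the linear one, its in-sample R square can never be lower on the same data; the size of the gap is what hints at saturation.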
Q2 – Wise WR Decisions with Regression and Optimization
The question about how to wisely allocate the TPEs needs to be answered with regressions and optimizations. We need to understand how each attribute contributes to the performance, and how to combine the benefits and the costs to figure out the most efficient use of TPEs. The regressions use the four evaluation metrics as outputs, and other independent variables as inputs. Here are the four results.
Yards/G ~ Input variables
Significant Vars: Strength (-0.5 negative?), Speed (+2.8), Hand (+0.8), Endurance (+1.44).
I obtained quite reasonable results except for strength. A one-unit improvement in speed, hands, or endurance leads to an increase of 2.8, 0.8, or 1.44 yards/G, respectively. Hence, to improve yards/G, players should prioritize speed > endurance > hands. The ratio of the three coefficients (speed : hands : endurance) is, roughly speaking, 3 : 1 : 2. This ratio represents the benefit of the attributes, which needs to be combined with the costs to identify the optimum TPE allocation strategy. I cannot believe that strength has a negative impact, although -0.5 is not very large and the P-value is not very small. The negative coefficient might relate to the differences among the three archetypes – speed receivers tend to get high yards/G, and they probably do not develop their strength much.
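[TECH notes] The archetype explanation for the negative strength coefficient can be illustrated with a toy example. The numbers below are invented, not league data: inside each archetype, more strength slightly helps, but the speed receivers are both weaker and more productive, so pooling the two groups without controlling for archetype flips the sign of the slope. This is only a sketch of the mechanism, not evidence that it is what happens in my regressions.

```python
def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Invented toy data: low-strength speed receivers with high yards/G,
# high-strength possession receivers with low yards/G.
speed_wr = {"strength": [50, 55, 60], "yards_g": [78, 80, 82]}
possession_wr = {"strength": [70, 75, 80], "yards_g": [48, 50, 52]}

# Within each archetype, strength has a small positive slope.
within_speed = ols_slope(speed_wr["strength"], speed_wr["yards_g"])
within_possession = ols_slope(possession_wr["strength"], possession_wr["yards_g"])

# Pooling both archetypes, the slope on strength turns negative.
pooled = ols_slope(
    speed_wr["strength"] + possession_wr["strength"],
    speed_wr["yards_g"] + possession_wr["yards_g"],
)
```

If this is the mechanism at work, adding archetype dummies to the regression would be one way to check whether the negative sign survives.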
Receiving/G ~ Input variables
Significant vars: Strength (-0.05 negative again?), Agility (+0.05), Intelligence (-0.04), Speed (+0.09), Hand (+0.11), Endurance (+0.11).
Receptions per game depend positively on agility, speed, hands, and endurance. The results are also very intuitive. Agility does not matter for yards per game, but here it starts to matter. Similar to yards, the three main attributes (speed, hands, and endurance) are still the most important variables. The relative importance of speed, hands, endurance, and agility is, roughly speaking, 2 : 2 : 2 : 1. I see the significant negative coefficients again: strength and intelligence have a negative impact on receptions per game. Although the effects seem relatively small, they are not negligible. If we take the negative values seriously, then a one-unit gain in speed, hands, or endurance can be cancelled out by about two extra units of strength or intelligence. Let’s be cynical about the sim engine – do you believe that the engine might be penalizing overly trained players?
Touchdowns/G ~ Input variables
Significant vars: Strength (-0.006 negative again and again?), Intelligence (-0.006 negative again?), Speed (+0.02), Endurance (+0.01).
The way to read the coefficients is the same as above. Roughly speaking, the speed vs. endurance effect has a 2 : 1 ratio.
Yards/catch ~ Input variables
Significant vars: Speed (+0.35). Only speed matters.
Okay, let me give a summary table here. First, the three attributes (speed, hands, and endurance) are certainly the most important factors for a great WR. They matter for the majority of the four evaluation metrics, with different importance ratios for each metric, as summarized in the table.
Regression Coefficients Table
Second, the impacts of many variables that I expected to be useful turn out to be minor. The impacts of the purchased traits (deep threat, athlete, slot receiving) are insignificant in all four regressions. If we relax the P-value threshold a bit, from 10% to 20%, these traits may be seen as significant factors, but only for touchdowns per game. Third, the negative impacts of strength and intelligence are confusing. Maybe it is a model misspecification problem (a potential limitation that I will discuss later). Or does it mean that the best strategy, once a player earns more than 1,000 TPEs, is to put them in the bank instead of squandering them on the muscles and the brain of a WR? Maybe a lean and stupid WR is the ultimate WR that can penetrate any defensive line in this sim league.
Actions & Optimizations – The regression coefficients reveal the importance of attributes; however, they cannot tell WRs the optimum allocation of the TPEs to attributes. The TPE investment is essentially a benefit-cost analysis (BCA), which means that any general BCA framework can be applied. We need to combine the regression coefficients (benefit side) with the cost side to find the optimum.
A simple rule of thumb is whatever we learnt from economics/optimization 101: marginal benefits = marginal costs. Without any complicated math formulation, we should expect the best TPE allocation to happen at the point where the ratio of marginal benefits across attributes equals the ratio of their marginal costs, that is, where the next TPE buys the same gain no matter which attribute receives it. Okay, it is too abstract. Let me provide an example.
If your only target is yards per game, and if (big if) you trust the regression coefficients, then your optimum solution should always maintain a marginal cost ratio of 3 : 1 : 2 for speed, hands, and endurance.
e.g. 290 TPEs – speed (90), hands (70), endurance (80). At this point, the marginal cost ratio of developing the three attributes is 15 TPE : 5 TPE : 10 TPE (3 : 1 : 2), the same as the marginal benefit ratio. Therefore, the attributes are at the optimum, conditional on the 290 TPEs.
e.g. 485 TPEs - speed (94), hands (79), endurance (89). Again, at this point, the marginal cost ratio of the three attributes is still 15 TPE : 5 TPE : 10 TPE (3 : 1 : 2). But if you now get another 100 TPEs, how should you spend them? You get to 95, 80, and 90 first, and then check the ratio between marginal costs and benefits again. The marginal costs become 25 TPE : 10 TPE : 15 TPE (5 : 2 : 3). Since endurance is capped at 90, the optimum strategy is to keep developing speed, targeting speed (100), hands (80), and endurance (90).
If your only target is receptions per game, and if (again, big if) you trust the regression coefficients, then your optimum solution should always maintain a marginal cost ratio of 2 : 2 : 2 : 1 for speed, hands, endurance, and agility.
e.g. 270 TPEs – speed (80), hands (80), endurance (80), agility (70). At this point, the marginal cost ratio of developing the four attributes is 10 : 10 : 10 : 5 (2 : 2 : 2 : 1), so we have reached the optimum.
e.g. 570 TPEs - speed (90), hands (90), endurance (90), agility (80). The development from (80, 80, 80, 70) to (90, 90, 90, 80) follows an optimum trajectory, but the cost ratio of further developing them changes to 15 TPE : 15 TPE : 15 TPE : 10 TPE. Then what to do? The principle is to start with the attribute that has the lowest cost relative to its benefit – in this case, you can develop any one of speed, hands, and endurance, but not agility.
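[TECH notes] The rule used in the examples above can be sketched as a greedy loop: keep buying the attribute point with the best benefit per TPE until the budget runs out. This is a simplification under stated assumptions. The cost tiers (5, 10, 15, 25 TPE per point) are inferred from the marginal costs quoted in my examples and are not an official cost table, the caps are the speed receiver limits mentioned earlier plus the endurance cap of 90, and the benefits are the yards/G regression coefficients.

```python
def marginal_cost(level):
    """TPE cost of the next point, inferred from the examples (assumption)."""
    if level < 70:
        raise ValueError("costs below 70 are not covered by the examples")
    if level < 80:
        return 5
    if level < 90:
        return 10
    if level < 95:
        return 15
    return 25


def allocate(build, budget, benefit, caps):
    """Greedy allocation: always buy the point with the best benefit per TPE."""
    build = dict(build)
    while True:
        options = [
            (benefit[attr] / marginal_cost(build[attr]), attr)
            for attr in build
            if build[attr] < caps[attr] and marginal_cost(build[attr]) <= budget
        ]
        if not options:
            break  # everything is capped or unaffordable
        _, best = max(options)
        budget -= marginal_cost(build[best])
        build[best] += 1
    return build, budget


# Yards/G coefficients from the regression; caps assumed for a speed receiver.
benefit = {"speed": 2.8, "hands": 0.8, "endurance": 1.44}
caps = {"speed": 100, "hands": 85, "endurance": 90}
start = {"speed": 94, "hands": 79, "endurance": 89}
final_build, leftover = allocate(start, 100, benefit, caps)
```

With these assumptions, the extra 100 TPEs first take the build to (95, 80, 90), then go into speed until the next speed point is no longer affordable, and the remainder goes into hands, ending at speed 97, hands 82, endurance 90 with no TPEs left.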
There is no uniquely optimum trajectory of attribute development for a WR, since the four evaluation metrics relate to the player attributes with different payoff structures. However, if I really push myself to make a decision, then the relative importance should rank as: Speed > Hands ~ Endurance > Agility. The impacts of the other attributes seem quite ambiguous to me, at least based on the analysis with the four performance metrics.
To be honest, this principle of matching the ratio of marginal benefits and costs is already done in an intuitive way by most players. For example, every WR seeks to maximize the speed first, and then develops the relatively cheap attributes at the same time. It is essentially the same as the optimality principle (marginal benefits = marginal costs). Here my results just provide clear quantitative values to guide the exact development trajectory for an optimum WR build.
[TECH notes] The story above is super simplified. Here are the caveats:
• The linear regression assumption seems unrealistic. The attributes’ influence on WR performance might be nonlinear: one extra point of speed from 99 to 100 might be much more effective than one from 49 to 50.
• The three WR archetypes should have different development trajectories. It may be wise for a speed receiver to keep developing speed, while a slot receiver should prioritize hands. The work above assumes that the optimum builds of the three archetypes are not significantly different, which might not be true.
• Many variables are ignored. For example, different teams have different styles: some teams lean heavily on rushing plays, and some lean heavily on passing plays. The highest WR performance tends to happen on the latter teams, not the former.
• The regressions ignore the quality of a WR’s teammates. If you are blessed with a great QB and a great offensive line, then there is no doubt that your WR performs much better than the average.
• The sample size is limited, so I can only show the average results and focus on the most important factors.
• Even on the evaluation metrics side, I have to ignore many other variables, e.g. fouls, incomplete catches, drops, etc.
Q3 - SIM Randomness
If we use all the variables as inputs, we can gauge how important SIM randomness is for WR performance. The regression is: performance ~ TPEs + attributes + … + all the variables I have collected. The R square tells us the share of variance explained by the systematic part (players’ TPEs and efforts); the remainder is the random residual (SIM randomness). The R square values are:
Regressions using linear variables
Yards per game
Linear reg (linear terms) - R square is: 0.561
Linear reg (linear terms) - R square is: 0.505
Receivings per game
Linear reg (linear terms) - R square is: 0.593
Linear reg (linear terms) - R square is: 0.542
Touchdowns per game
Linear reg (linear terms) - R square is: 0.402
Linear reg (linear terms) - R square is: 0.327
Yards per catch
Linear reg (linear terms) - R square is: 0.299
Linear reg (linear terms) - R square is: 0.211
Regressions using quadratic transformation
Yards per game
Linear reg (quadratic terms) - R square is: 0.604
Linear reg (quadratic terms) - R square is: 0.516
Receivings per game
Linear reg (quadratic terms) - R square is: 0.681
Linear reg (quadratic terms) - R square is: 0.610
Touchdowns per game
Linear reg (quadratic terms) - R square is: 0.450
Linear reg (quadratic terms) - R square is: 0.329
Yards per catch
Linear reg (quadratic terms) - R square is: 0.358
Linear reg (quadratic terms) - R square is: 0.216
Conclusion: the systematic part explains about 30%~70% of the total variance, which means that SIM randomness accounts for the remaining 30%~70%. I am utterly shocked by the beautiful symmetry in the numbers. Taking the midpoint, all our efforts combined (hard training, wise decisions of allocating TPEs, choosing teams, etc.) account for about 50% of our final achievements, while pure SIM randomness contributes the other 50%. Lesson – it seems fair for a player to blame the SIM about 50% of the time. But a player who blames the SIM on a daily basis should note that the other 50% of the fate is in the player’s own hands.
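[TECH notes] One way to turn the printed R squares into a rough three-way split (TPEs, decisions, randomness) is simple averaging across the four metrics. The sketch below is only one reading of the numbers, under assumptions: it uses the TPE-only fits with linear terms, takes the first R square printed for each metric in the full regressions, and treats the gap between the two as the "wise decisions" share. The function and variable names are made up.

```python
from statistics import fmean

# R squares printed in this article (first value listed for each metric).
tpe_only = {"yards_g": 0.340, "receivings_g": 0.241, "td_g": 0.133, "yards_catch": 0.117}
full_linear = {"yards_g": 0.561, "receivings_g": 0.593, "td_g": 0.402, "yards_catch": 0.299}
full_quadratic = {"yards_g": 0.604, "receivings_g": 0.681, "td_g": 0.450, "yards_catch": 0.358}


def split_shares(tpe_r2, full_r2):
    """Average R squares across metrics and split variance into three shares."""
    tpe = fmean(tpe_r2.values())
    systematic = fmean(full_r2.values())
    return {
        "tpe": tpe,                     # explained by TPEs alone
        "decisions": systematic - tpe,  # extra share explained by the other inputs
        "randomness": 1 - systematic,   # unexplained residual
    }


shares_linear = split_shares(tpe_only, full_linear)
shares_quadratic = split_shares(tpe_only, full_quadratic)
```

Under these assumptions the TPE share averages about 0.21, the decisions share lands between about 0.26 and 0.32, and the randomness share lands between about 0.48 and 0.54, depending on which full model is used. Attributing R square gaps this way is rough, since TPEs and attributes are correlated, so the split should be read as an order of magnitude.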
High-Level Take-Aways
Well, I babbled on and on about data, techniques, numbers, and models. In case some readers got lost, here are the high-level takeaways:
• Training efforts (TPEs) determine about 20% of the performance, wise decisions (allocation of TPEs, etc) account for about 30%, and SIM randomness controls the last 50%.
• Hard training efforts (TPEs) pay off. But the TPEs’ effects start to plateau after 1,000 TPEs.
• Hard to provide a uniquely optimum development trajectory for a WR. But roughly speaking, Speed > Hands ~ Endurance > Agility.
• No evidence to suggest the importance of the purchased traits, competitiveness, intelligence, or strength, although I cannot rule out some modeling errors.
• A big question – do you believe that a player can be over-trained? E.g. the sim penalizes a super wise and strong WR. If so, then we should save the TPEs in the bank rather than spend them on useless attributes. Target a fast, lean, and stupid WR; maybe that is the ultimate WR in the league.
(Words: about 3600)
Main Questions - A WR’s (or any player's) performance can be contributed by roughly speaking, three groups of factors: (1) training efforts, (2) wise decisions, and (3) sim randomness. Training efforts are measured by TPEs – if a player has more TPEs, it is much more likely for the player to deliver great performance. The obstacle on this front is our laziness – we need to motivate ourselves to work harder to earn TPEs. Conditioning on the same level of TPEs, wise decisions matter, including the allocation of TPEs, decisions of moving to different teams, etc. The obstacle to wise decisions is not hard working, but information asymmetry – the sim engine knows the whole system while individual players do not have enough information to make wise decisions. The last thing is about sim randomness, which everyone rightfully criticizes when they lose a game (although people tend to show less gratefulness to the sim randomness when they win a game…). Okay then, this structure leads to the main questions of this article:
• To what extent, do the hard work (TPEs) itself determine a WR’s performance?
• What is the optimum allocation of TPEs to attributes to maximize a WR’s performance?
• To what extent, is a single WR’s performance determined by the SIM randomness?
Aha, sometimes I like to babble on and on about statistical principles, modeling caveats, and many other technical things. I will mark them as TECH notes, so feel free to skip them.
Data sources – I downloaded the user attributes and receiving data sets at four different time points (S27 Week 2, S27 Week 5, S27 Week 16, and S28 Week 8). Putting them together, the sample size is relatively large (about 120 observations). The data set provides the information that is after S27, when the sim league changed its simulation engine. Hence all the insights are useful for this new sim environment. I downloaded players’ attributes and WRs’ performances from two different web pages, and as a result, I did some coding to match the player names of the two web pages. Somehow I was quite lazy with matching players’ names precisely, so the matching process loses some data points. (Well, it is not my fault because the two data sources use different formats for players’ names. Matching English words is a boring task.) [TECH Notes] - The ratio between model and data complexity determines the reliability of the model results. A classical result follows the central limit theorem with a root N convergence rate. The rule-of-thumb in practice is roughly 1/10 ratio – meaning that 1 variable requires about 10 observations. Intuitively, the league has around 14*3=42 WRs. When we have only 40 observations, the number of parameters can only be limited to about four, which is extremely limited. But the potential determinants are huge: about ten different build attributes, three archetypes, fourteen teams, teams’ formation and strategies, among many factors. A small data set cannot afford an in-depth analysis of so many factors. The solution is to collect multiple cross-sectional data from the Sim league, expanding the cross-sectional data to a panel. It is not a perfect solution, but slightly better.
Descriptive data
Here is the nutshell of data summary:
Total number of WRs: 42
Total number of observations: 117
Distribution of three Archetypes
The data set covers 42 unique WRs, and in total it has 117 observations. Out of the 117 observations, there are 58 speed receivers, 39 slot receivers, and 20 possession receivers. The speed receivers are the most popular archetype in the WR category. I guess it is because every WR looks forward to the happiness of running past the whole defensive lines and winning TDs.
Distribution of key variables: The variables can be divided into two categories: inputs and outputs. There are four output variables to measure the performance of WRs: yards per game (Yards_G), receivings per game (No_G), touchdowns per game (TD_G), and yards per receiving (Yards_No). The average values of the four variables are around 65 yards per game, 5 receivings per game, 0.35 TD per game, and 12 yards per catch. The mean values are close to the 50% percentiles, and the distributions are close to a symmetric one (like Gaussian/Binomial type).
Description of four evaluation metrics
Description of four evaluation metrics
The input variables are a long list of things: current TPEs, strength, agility, intelligence, speed, hands, endurance, competitiveness, deep threat (trait – 0/1 variable), athlete (trait – 0/1 variable), and slot receiving (trait – 0/1 variable). Just reading through the table below – the average values of the variables are: TPE – 803, Strength – 64, agility – 81, intelligence – 65, speed – 95, hand – 86, endurance – 77, competitiveness – 63, deep threat (27% purchased), athlete (14% purchased), and slot receiving (29% purchased). These average values do not suggest anything a good WR should or should not do. It just implies that if you have not earned 803 TPEs, you are not an average WR yet (like me) : (
Description of eleven input variables
The distributional information of the variables reveals richer details about the sim and cost structure. For example, the speed attribute has clear three-modal distributions, peaking at 90, 95, and 100. The three peaks are the maximum speed limits for the three archetypes (speed receiver at 100, slot receiver at 95, and possession receiver at 90). Hence every WR has either already reached the max speed limit or on the way to maximizing the speed. Hands is the same thing: clear peaks at 80, 85, and 90. It is because the speed WR’s hand limit is at 85 and the slot receiver’s limit is at 90. Sometimes the attribute distribution reveals the cost structure. Many attributes peak at 70 or 80, since the cost of investing in TPEs follows a piecewise linear form. In fact, the analysis below will show that all these practices are quite reasonable. Crowd-sourced wisdom through folklores already converges to the optimum strategy. A cool thing to note.
Distribution of eleven input variables
I ignored many factors. On the performance evaluation side, many other metrics can be valuable: completion rates, # of drops, fouls, etc. For example, Slate told me that fouls might heavily depend on players’ intelligence, and fouls have a negative impact on the whole team. Many input variables are also ignored, particularly those related to teams. Since the teams have a very different balance between running vs. rushing plays, individual WRs’ performance must heavily depend on the team strategies. However, I need to ignore the 14 discrete variables (teams), because of modeling concerns. Even if I can get some conclusions about the team strategies’ impacts on WR performance, the conclusions are typically not generalizable - Teams can change strategies from season to season, so it does not make sense to suggest all the WRs go to one team (e.g. Yeti, with the highest throwing attempt this season).
Okay. Let’s start to answer the three questions.
Q1 - TPE Impacts
The first thing is about the effectiveness of TPEs on WR performance. The conclusion is, well, hard work pays off. The figure below shows that TPEs are highly and positively correlated with TPEs. The upward trends are very clear when the TPEs are plotted with all four evaluation metrics (yards per game, receivings per game, TDs per game, and yard per receiving). Suppose a player grows from 400 TPE to 1400, the person’s performance is expected to increase from 30 yards/G to 100 yards/G, and from 4 receives/G to 7 receives/G, etc. It is not a surprise that hard work pays off. However, there are two notes.
One – the orange curves fit better than the linear ones. There is a clear saturation effect after the TPE reaches around 1000 TPEs. It seems quite reasonable, since all the most useful attributes already reach the maximum using 1000 TPEs,. The additional TPEs beyond 1000 seem to have a limited payoff in performance.
Four evaluation metrics with TPEs
Second, to what extent the hard work itself determines the performance? This question relies on the R square of the fitting functions (printed below). The R squares range between 10% to 35%, depending on the exact metric. Overall, we can conclude that hard training (TPEs) itself can determine around 10% to 35% of a WR’s performance. Is it high or low? Well…hard to judge, but it suggests that other 65%~90% of the performance depends on players’ wise decisions of allocating TPEs, choosing right teams, etc. as well as the SIM randomness.
------
Current TPEs fitting yards per game
Linear reg (linear terms) - R square is: 0.340
Linear reg (quadratic terms) - R square is: 0.360
------
Current TPEs fitting receivings per game
Linear reg (linear terms) - R square is: 0.241
Linear reg (quadratic terms) - R square is: 0.278
------
Current TPEs fitting touchdowns per game
Linear reg (linear terms) - R square is: 0.133
Linear reg (quadratic terms) - R square is: 0.132
------
Current TPEs fitting yards per receiving
Linear reg (linear terms) - R square is: 0.117
Linear reg (quadratic terms) - R square is: 0.121
Q2 – Wise WR Decisions with Regression and Optimization
The question about how to wisely allocate the TPEs needs to be answered with regressions and optimizations. We need to understand how each attribute contributes to the performance, and how to combine the benefits and the costs to figure out the most efficient use of TPEs. The regressions use the four evaluation metrics as outputs, and other independent variables as inputs. Here are the four results.
Yards/G ~ Input variables
Significant Vars: Strength (-0.5 negative?), Speed (+2.8), Hand (+0.8), Endurance (+1.44).
I obtained quite reasonable results except for strength. One unit of improvement in speed, hands, and endurance leads to 2.8, 0.8, and 1.44 units increase of yards/G. Hence to improve yards/G, players should prioritize speed > endurance > hands. The ratio of the three coefficients are roughly speaking: 3 : 1 : 2. This ratio represents the benefit of the attributes, which needs to be combined with the costs to identify the optimum TPE allocation strategy. I cannot believe the strength has a negative impact, although the -0.5 is not very large and P-value is not very small. The negativity of the coefficient might relate to the differences of the three archetypes – speed receivers tend to get high yards/G, and probably they won’t develop their strength much.
Receiving/G ~ Input variables
Significant vars: Strength (-0.05 negative again?), Agility (+0.05), Intelligence (-0.04), Speed (+0.09), Hand (+0.11), Endurance (+0.11).
Receivings per game depend positively on agility, speed, hands, and endurance. The results are also very intuitive. Agility does not seem to matter for yards per game, but here it starts to matter. Similar to yards, the three main attributes (speed, hands, and endurance) are still the most important variables. The relative importance of speed, hands, endurance, and agility is, roughly speaking, 2 : 2 : 2 : 1. I see the significant negative coefficients again: strength and intelligence have a negative impact on receivings per game. Although the effects seem relatively small, they are not negligible. If we take the negative values seriously, then a one-unit gain in speed, hands, or endurance can be counteracted by about two MORE units of strength or intelligence. Let’s be cynical about the sim engine: do you believe that the engine might be penalizing overly trained players?
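To make the trade-off concrete, here is a small sketch that plugs the coefficients reported above into the linear model. The function name predicted_change is made up for illustration, and the numbers are only as trustworthy as the regression itself.

```python
# Coefficients reported above for receivings per game.
RECEIVINGS_COEF = {
    "strength": -0.05,
    "agility": 0.05,
    "intelligence": -0.04,
    "speed": 0.09,
    "hands": 0.11,
    "endurance": 0.11,
}


def predicted_change(coefficients, attribute_changes):
    """Change in the metric implied by a linear model for the given attribute changes."""
    return sum(coefficients[name] * delta for name, delta in attribute_changes.items())


# One extra point of speed on its own...
speed_only = predicted_change(RECEIVINGS_COEF, {"speed": 1})
# ...versus one extra point of speed plus two extra points of strength.
speed_plus_strength = predicted_change(RECEIVINGS_COEF, {"speed": 1, "strength": 2})

print(f"+1 speed: {speed_only:+.2f} receivings per game")
print(f"+1 speed, +2 strength: {speed_plus_strength:+.2f} receivings per game")
```

Under these coefficients, the gain from one point of speed is roughly cancelled by two points of strength, which is the counteracting effect described above.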
Touchdowns/G ~ Input variables
Significant vars: Strength (-0.006 negative again and again?), Intelligence (-0.006 negative again?), Speed (+0.02), Endurance (+0.01).
The way to read the coefficients is the same as above. Roughly speaking, the speed vs. endurance effect has a 2 : 1 ratio.
Yards/catch ~ Input variables
Significant vars: Speed (+0.35). Only speed matters.
Okay, let me give a summary table here. First, the three attributes (speed, hands, and endurance) are certainly the most important factors for a great WR. They matter for most of the four evaluation metrics, with different importance ratios, as summarized in the table.
Regression Coefficients Table
Second, the impacts of many variables that I expected to be useful turn out to be minor. The impacts of the purchased traits (deep threat, athlete, slot receiving) are insignificant in all four regressions. If we relax the P-value threshold a bit from 10% to 20%, these traits may be seen as significant factors only for touchdowns per game. Third, the negative impacts of strength and intelligence are confusing, although they may come from a model misspecification problem (a potential limitation that I will discuss later). Does it mean that the best strategy, once the player earns more than 1,000 TPE, is to put them in the bank instead of squandering them on the muscles and the brain of a WR? Maybe a lean and stupid WR is the ultimate WR that can penetrate any defensive line in this sim league.
Actions & Optimizations – The regression coefficients reveal the importance of attributes; however, they cannot tell WRs the optimum allocation of the TPEs to attributes. The TPE investment is essentially a benefit-cost analysis (BCA), which means that any general BCA framework can be applied. We need to combine the regression coefficients (benefit side) with the cost side to find the optimum.
A simple rule-of-thumb is whatever we learnt from economics/optimization 101: marginal benefits = marginal costs. Without any complicated math formulation, we should expect the best TPE allocation to happen at the point where the ratio of marginal benefits equals the ratio of marginal costs. Okay, that is too abstract. Let me provide an example.
If your target is to optimize only yards per game, and if (big if) you trust the regression coefficients, then your optimum solution should always maintain a marginal cost ratio of 3 : 1 : 2 for speed, hands, and endurance.
e.g. 290 TPEs – speed (90), hands (70), endurance (80). At this point, the marginal cost ratio of developing the three attributes is 15 TPE : 5 TPE : 10 TPE (3 : 1 : 2), the same as the marginal benefit ratio. Therefore, the attributes reach the optimum conditional on the 290 TPEs.
e.g. 485 TPEs - speed (94), hands (79), endurance (89). Again, at this point, the marginal cost ratio of the three attributes is still 15 TPE : 5 TPE : 10 TPE (3 : 1 : 2). But at this moment, if you get another 100 TPEs, how should you spend them? You get to 95, 80, and 90 first, and then check the ratio between the marginal costs and benefits. The marginal cost becomes 25 TPE : 10 TPE : 15 TPE (5 : 2 : 3). Since endurance is capped at 90, the optimum strategy is to keep developing speed, targeting speed (100), hands (80), and endurance (90).
If your target is to optimize only receivings per game, and if (again big if) you trust the regression coefficients, then your optimum solution should always maintain a marginal cost ratio of 2 : 2 : 2 : 1 for speed, hands, endurance, and agility.
e.g. 270 TPEs – speed (80), hands (80), endurance (80), agility (70). At this point, the marginal cost ratio of developing the four attributes is 10 : 10 : 10 : 5 (2 : 2 : 2 : 1), so we have reached the optimum.
e.g. 570 TPEs - speed (90), hands (90), endurance (90), agility (80). The development from (80,80,80,70) to (90,90,90,80) follows an optimum trajectory, but the marginal cost structure of further developing them changes to 15 TPE : 15 TPE : 15 TPE : 10 TPE. Then what to do? The principle is to start with the attribute that has the lowest cost relative to its benefit: in this case, you can develop any one of speed, hands, and endurance, but not agility.
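The examples above can be checked with a short sketch. The cost bands are the ones implied by the marginal costs quoted in the examples (5, 10, 15, and 25 TPE per point); costs below 70 are not stated in the text, so the sketch does not cover them. The function names, the greedy rule, and the caps of 100 for speed and hands are my assumptions for illustration, not the league's official update tool.

```python
# Marginal TPE cost per point, as implied by the examples above.
# Bands are (low, high, cost) for attribute values low <= value < high.
COST_BANDS = [(70, 80, 5), (80, 90, 10), (90, 95, 15), (95, 100, 25)]


def marginal_cost(value):
    """TPE needed to raise an attribute from `value` to `value + 1`."""
    for low, high, cost in COST_BANDS:
        if low <= value < high:
            return cost
    raise ValueError(f"no cost known for attribute value {value}")


def benefit_per_tpe(build, benefits):
    """Benefit weight divided by marginal cost, for each attribute in the build."""
    return {name: benefits[name] / marginal_cost(build[name]) for name in build}


def greedy_allocate(build, benefits, caps, budget):
    """Repeatedly buy the affordable point with the highest benefit per TPE."""
    build = dict(build)
    while True:
        options = [
            (benefits[name] / marginal_cost(build[name]), name)
            for name in build
            if build[name] < caps[name] and marginal_cost(build[name]) <= budget
        ]
        if not options:
            return build, budget  # nothing affordable is left
        _, best = max(options)
        budget -= marginal_cost(build[best])
        build[best] += 1


# Rounded benefit ratios from the regressions above.
YARDS = {"speed": 3, "hands": 1, "endurance": 2}
RECEIVINGS = {"speed": 2, "hands": 2, "endurance": 2, "agility": 1}

balanced = benefit_per_tpe({"speed": 90, "hands": 70, "endurance": 80}, YARDS)
unbalanced = benefit_per_tpe(
    {"speed": 90, "hands": 90, "endurance": 90, "agility": 80}, RECEIVINGS
)

# 70 TPE left after reaching (95, 80, 90); endurance is capped at 90.
final_build, leftover = greedy_allocate(
    {"speed": 95, "hands": 80, "endurance": 90},
    YARDS,
    caps={"speed": 100, "hands": 100, "endurance": 90},
    budget=70,
)
print(balanced, unbalanced, final_build, leftover)
```

When all the benefit-per-TPE values are equal, the build is balanced in the sense described above. When they differ, the next point should go to the attribute with the highest value, which is speed in the 95/80/90 case.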
There is no uniquely optimum trajectory of attribute development for a WR, since the four evaluation metrics relate to the player attributes with different payoff structures. However, if I really push myself to make a decision, then the relative importance should rank as: Speed > Hands ~ Endurance > Agility. The impacts of the other attributes seem quite ambiguous to me, at least based on the analysis with the four performance metrics.
To be honest, this principle of matching the ratio of marginal benefits and costs is already done in an intuitive way by most players. For example, every WR seeks to maximize the speed first, and then develops the relatively cheap attributes at the same time. It is essentially the same as the optimality principle (marginal benefits = marginal costs). Here my results just provide clear quantitative values to guide the exact development trajectory for an optimum WR build.
[TECH notes] The story above is super simplified. Here are the caveats. First, the linear regression assumption seems unrealistic. The attributes’ influence on WR performance might be nonlinear: one extra point of speed from 99 to 100 should be much more effective than an increase from 49 to 50. Second, the three WR archetypes should have different development trajectories. It may be wise for a speed receiver to keep developing speed, while a slot receiver should prioritize hands. The work above assumes that the three archetypes’ optimum builds are not significantly different, which might not be true. Third, many variables are ignored. For example, different teams have different styles: some teams favor running plays, and some teams favor passing plays. The highest WR performance tends to happen on the latter teams, not the former. Fourth, the regressions ignore the capacity of a WR’s teammates. If you are blessed with a great QB and a great offensive line, then there is no doubt that your WR’s performance will be much better than average. Again, the sample size is limited, so I can only show the average results and focus on the most important factors. Even on the evaluation metrics side, I have to ignore many other variables, e.g. fouls, incomplete catches, drops, etc.
Q3 - SIM Randomness
Using all the variables as inputs, we can gauge how much the SIM randomness influences WR performance. The regression is: performance ~ TPEs + attributes + … + all the variables I have collected. The R square tells us the share of the variance explained by the systematic part (players’ TPEs and efforts), while the remainder is the random residual (SIM randomness). The R square values are:
Regressions using linear variables
Yards per game
Linear reg (linear terms) - R square is: 0.561
Linear reg (linear terms) - R square is: 0.505
Receivings per game
Linear reg (linear terms) - R square is: 0.593
Linear reg (linear terms) - R square is: 0.542
Touchdowns per game
Linear reg (linear terms) - R square is: 0.402
Linear reg (linear terms) - R square is: 0.327
Yards per catch
Linear reg (linear terms) - R square is: 0.299
Linear reg (linear terms) - R square is: 0.211
Regressions using quadratic transformation
Yards per game
Linear reg (quadratic terms) - R square is: 0.604
Linear reg (quadratic terms) - R square is: 0.516
Receivings per game
Linear reg (quadratic terms) - R square is: 0.681
Linear reg (quadratic terms) - R square is: 0.610
Touchdowns per game
Linear reg (quadratic terms) - R square is: 0.450
Linear reg (quadratic terms) - R square is: 0.329
Yards per catch
Linear reg (quadratic terms) - R square is: 0.358
Linear reg (quadratic terms) - R square is: 0.216
Conclusion: the systematic part explains about 30%~70% of the total variance, which means that the SIM randomness accounts for the remaining 30%~70%. I am utterly shocked by the beautiful symmetry in the numbers. All our efforts combined (hard training, wise decisions of allocating TPEs, choosing teams, etc.) account for roughly 50% of our final achievements, while pure SIM randomness contributes the other 50%. Lesson: it seems fair for a player to blame the SIM 50% of the time. But if the player blames the SIM on a daily basis, that person should note that at least the other 50% of his fate is in his own hands.
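The arithmetic behind the "about 50%" figure can be sketched as follows. The sketch uses only the first R square value listed under each metric in the output above (linear terms, then quadratic terms); the second value under each metric is left out because the text does not say what it measures. Treating everything the model does not explain as SIM randomness is the simplification used in this article, not a proven fact.

```python
# First R square value listed under each metric: (linear terms, quadratic terms).
R_SQUARE = {
    "yards per game": (0.561, 0.604),
    "receivings per game": (0.593, 0.681),
    "touchdowns per game": (0.402, 0.450),
    "yards per catch": (0.299, 0.358),
}

values = [v for pair in R_SQUARE.values() for v in pair]

explained_low = min(values)                   # weakest fit
explained_high = max(values)                  # strongest fit
explained_mean = sum(values) / len(values)    # average explained share
unexplained_mean = 1 - explained_mean         # share attributed to randomness

print(f"Explained share ranges from {explained_low:.0%} to {explained_high:.0%}")
print(f"Average explained share: {explained_mean:.1%}")
print(f"Average unexplained share: {unexplained_mean:.1%}")
```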
High-Level Take-Aways
Well, I babbled on and on about data, techniques, numbers, and models. I hope the readers are not lost, so some high-level takeaways help:
• Training efforts (TPEs) determine about 20% of the performance, wise decisions (allocation of TPEs, etc) account for about 30%, and SIM randomness controls the last 50%.
• Hard training efforts (TPEs) pay off. But the TPEs’ effects start to plateau after 1,000 TPEs.
• Hard to provide a uniquely optimum development trajectory for a WR. But roughly speaking, Speed > Hands ~ Endurance > Agility.
• No evidence to suggest the importance of the purchased traits, competitiveness, wisdom, strength, although I cannot rule out some modeling errors.
• A big question – do you believe that a player can be over-trained? E.g. the sim penalizes a super wise and strong WR. If so, then we should save the TPEs in the bank rather than spend them on useless attributes. Target a fast, lean, and stupid WR; maybe that is the ultimate WR in the league.
(Words: about 3600)