(07-08-2020, 04:29 PM)sakrosankt Wrote: I based it solely on what you have as a starting point from the previous week. I'm not sure how the teams start their testing round, but if you take the sim-file from the week before and run the test for the next game - without changing any strategy or tempo - you get a baseline for both teams facing each other the next week. I took this win% as the baseline, then took the final sim-file and ran the matchup again. Comparing the two values gives the final result.
The only thing I find doubtful here is relying on a single run of 500 tests, which leaves plenty of room for variance. I could have run each matchup multiple times and averaged the win%, but in the end that was more time than I wanted to spend on the sim, so I settled for one 500-run batch with the sim-file from the week before and one 500-run batch with the final sim-file for the week.
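To put a rough number on that variance concern: treating each of the 500 simulated games as an independent coin flip (an assumption - the sim engine may correlate runs), the binomial standard error tells you how noisy one 500-run win% is. This is a hypothetical sketch, not anything from the actual sim tool:

```python
import math

def win_pct_standard_error(p: float, n: int) -> float:
    """Standard error of an observed win% from n independent simulated games."""
    return math.sqrt(p * (1 - p) / n)

# At a true 50% win rate, a single 500-run batch is only good to about
# +/- 2.2 percentage points (one standard error), so small week-to-week
# deltas between the baseline run and the final run can easily be noise.
se = win_pct_standard_error(0.5, 500)
print(round(se * 100, 1))  # ~2.2 percentage points
```

Averaging several independent 500-run batches shrinks this by the square root of the number of batches, which is why multiple runs per matchup would have tightened the estimate.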
I mean there are larger methodological issues than that. For example, some teams face tough matchups - not just from a pure roster perspective, which your week-to-week comparison might partially normalize, but more so in a testing capacity. If you consistently face teams that are making smart changes, the impact of your own changes will look smaller in raw % terms but may matter more on the margins. That’s the first big issue I see.
The second is that adding raw % is an inherently volatile way of looking at the data. If you want the best perspective, you typically want an average - and one that normalizes for outliers at that, e.g. dropping a high and a low and averaging over a large enough data set and time horizon. The logic being that even the best teams have bad performances but may still be the best. The Redskins could come out and beat the 49ers or Chiefs one game and skew the metrics in a small enough sample, but you’d hardly say the Redskins are consistently (i.e. on average) better than those teams. And when comparing the Chiefs and the 49ers against each other or against other quality teams, you’d weight the Redskins game differently in the analysis depending on how much of an outlier it was from a normal performance - i.e. from how a coaching staff, team, and players do on average, not just in that isolated matchup.
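The drop-the-high-and-low averaging described above is a trimmed mean. A minimal sketch, with made-up win% numbers (the 0.90 standing in for the fluke Redskins-beat-the-Chiefs game):

```python
def trimmed_mean(win_pcts, trim=1):
    """Drop the `trim` highest and lowest values, then average the rest."""
    if len(win_pcts) <= 2 * trim:
        raise ValueError("need more samples than values trimmed")
    kept = sorted(win_pcts)[trim:len(win_pcts) - trim]
    return sum(kept) / len(kept)

# Six weekly win% figures; one fluke blowout no longer dominates the
# average once the extremes are dropped.
weekly = [0.48, 0.52, 0.50, 0.90, 0.47, 0.51]
print(trimmed_mean(weekly))           # hovers near 0.50
print(sum(weekly) / len(weekly))      # raw mean, dragged up by the 0.90
```

The untrimmed mean here lands around 0.56, while the trimmed mean stays near 0.50 - exactly the "even the best teams have bad (or lucky) games" intuition in the paragraph above.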