Using a CNN and LSTM Neural Network to Predict Wins Based on Rosters
Hey all, it’s wonderful_art again and I’m back with another statistical analysis project focused on leveraging machine learning to gain insights on sim team performance. Consider this the flip-side to decompiling the sim itself and instead reverse-engineering how we can predict sim results from sim inputs. The inputs this time? Full rosters.
After a lot of work and data wrangling, this project is finally ready for publishing here. I’m excited to showcase it, as it provides a really interesting understanding of the sim through the power of neural networks and machine learning models. As I discussed in the last piece, this is part of an ongoing series of projects to predict team performances using data science and better analyze, with statistics and other advanced means, how player’s performance and builds impact the results.
To this point, I’ve been mostly focused on statistics – understanding the output of player’s performance to reverse engineer how likely it is their team would’ve succeeded. The last model took player statistics to predict wins, and it focused just on QB statistics. It was fun, but limited – it didn’t consider the rest of the team, the other team, etc.
Now, since we have player builds, it opens up a whole new avenue of interest in that we can attempt to predict values with independent variables, and more importantly, variables that should have a high correlation to performance – after all, the sim is just a computer program that uses these variables to generate results.
This project uses these player builds for a whole team, and their opponent, to predict the likelihood of a win. Through a much more advanced, multi-input machine learning model using convolutional neural network layers and long short-term memory (LSTM) layers, it predicts with roughly 80% ‘accuracy’ (more on that below) who will win a given sim game based solely on the rosters of the two teams competing.
It’s not perfect, as real-world data rarely is, but everyone should be glad it performs as well as it does. It means there is a real correlation between the player you built, their teammates’ builds, their opponents’ builds, and win success. The ‘better’ team still loses plenty of games, but the model is able to reverse-engineer a pretty good estimate of who should win any given matchup.
Cool, right?
How does it work?
First, it gathers all the builds for a team. That is done through the index for that season, using the roster tab of the team page. Fortunately, that data is accurate for the season in question; unfortunately, it only reflects builds as of the last game. But, as it goes, it is a good enough approximation. Our first assumption, then, is that a player’s build at the end of the season is a rough sketch of their build across the whole season.
It only uses the top 22 builds (as determined by overall). Why top 22? Football has 11 active players on the field at any given time. With 11 on offense and 11 on defense, that should give us the top 22 players who are ‘active’ and not mostly on the bench. Now, overall isn’t the best means of deciding who to count, but the top 22 should be a pretty good approximation. Those top 22 player builds are stored in a matrix of size (22, 15): 22 players with 15 attributes each (position is included as a categorical variable).
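The roster-to-matrix step can be sketched roughly like this (the column names and attribute counts here are illustrative assumptions, not the actual index schema):

```python
import numpy as np
import pandas as pd

# Hypothetical scraped roster: one row per player, 14 numeric attributes
# plus an integer-encoded position column.
rng = np.random.default_rng(0)
attr_cols = [f"attr_{i}" for i in range(14)]
roster = pd.DataFrame(rng.integers(40, 100, size=(45, 14)), columns=attr_cols)
roster["position"] = rng.integers(0, 12, size=45)  # encoded position
roster["overall"] = roster[attr_cols].mean(axis=1)

def team_matrix(roster: pd.DataFrame) -> np.ndarray:
    """Top 22 players by overall, as a (22, 15) attribute matrix."""
    top22 = roster.nlargest(22, "overall")
    return top22[attr_cols + ["position"]].to_numpy(dtype="float32")

mat = team_matrix(roster)
print(mat.shape)  # (22, 15)
```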
Now that every team’s builds are constructed, it also pulls the schedule for each team. The schedule is stored in a pandas dataframe, and some key information is extracted. First, who they play each week is cleaned to match the opponent’s acronym. Second, the result is cleaned into a simple 0 for a loss and 1 for a win. Thankfully, this means we are using a binary classification model that simply predicts whether each game results in a 0 or a 1.
There is one more feature, and if you have hung around sim testing for any length of time you can probably guess it: whether the game was played at home or away. Some nice and simple cleaning of the schedule checks whether each game is an away game (1) or a home game (0), and that categorical column is passed into the model as a third input.
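The schedule cleaning might look something like the following (the opponent and result string formats here are assumptions for illustration, not the index’s exact output):

```python
import pandas as pd

# Hypothetical schedule scrape: opponent strings as they might appear
# on the index, results as score strings.
sched = pd.DataFrame({
    "opponent": ["@ OCO", "vs COL", "@ SJS"],
    "result":   ["W 30-27", "L 9-17", "W 21-20"],
})

# Away game (1) vs home game (0), inferred from the '@' prefix.
sched["away"] = sched["opponent"].str.startswith("@").astype(int)
# Opponent acronym with the venue marker stripped.
sched["opp"] = (sched["opponent"]
                .str.replace("@ ", "", regex=False)
                .str.replace("vs ", "", regex=False))
# Binary target: 1 for a win, 0 for a loss.
sched["win"] = sched["result"].str.startswith("W").astype(int)

print(sched[["opp", "away", "win"]])
```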
Now, the model itself is more complex than the last one I presented, no longer relying solely on dense neural network layers. The model has three input layers: one for the team we wish to predict on, one for their given opponent, and one for the home/away flag. Each roster input is a 3D matrix of shape (1, 22, 15), which means we can pass it into a 2D convolutional neural network layer. Essentially, the convolutional layer looks for patterns: it slides a small filter across the cells and returns, in a new matrix, how strongly each region matches that filter. In an image, this is used to detect edges; in natural language processing, perhaps to detect phrases. Convolutional neural nets have enabled some incredible advances in what machine learning can do, and here I am using them to look for patterns in the team’s player builds.
Next, the outputs of the convolutional layers are passed into a long short-term memory (LSTM) layer, which uses some mathematical tricks to hold onto a ‘memory’ (i.e. the important values) of the data that has already passed through. In this way it can ‘remember’ important features that explain the data.
At the end, the two channels for the team’s roster and the opponent’s roster are flattened, then concatenated (joined end-to-end, not summed) with the away/home designation. These pass through a dense layer with a single output: a sigmoid activation for a binary class of either 0 (loss) or 1 (win).
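Put together, the architecture described above can be sketched in Keras roughly as follows. This is a minimal sketch under my own assumptions: the filter counts, kernel size, and LSTM width are illustrative, not the actual hyperparameters used.

```python
from tensorflow import keras
from tensorflow.keras import layers

def roster_branch(name):
    """One conv + LSTM channel for a (22, 15) roster matrix."""
    inp = keras.Input(shape=(22, 15, 1), name=name)  # 22 players x 15 attrs
    x = layers.Conv2D(16, kernel_size=(3, 3), activation="relu")(inp)
    # Collapse the (20, 13, 16) conv output into a sequence the LSTM
    # can read: one timestep per remaining player row.
    x = layers.Reshape((20, 13 * 16))(x)
    x = layers.LSTM(32)(x)
    return inp, x

team_in, team_feat = roster_branch("team")
opp_in, opp_feat = roster_branch("opponent")
away_in = keras.Input(shape=(1,), name="away")  # 1 = away, 0 = home

# Join both roster channels with the venue flag, then a sigmoid head.
merged = layers.Concatenate()([team_feat, opp_feat, away_in])
out = layers.Dense(1, activation="sigmoid")(merged)  # P(win)

model = keras.Model([team_in, opp_in, away_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", keras.metrics.AUC()])
print(model.output_shape)  # (None, 1)
```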
Sound complicated? There are a ton of resources you can check out to learn more about these layers and how they work. I suggest just searching YouTube for anything that caught your eye above, and you’ll get a nice 5-10 minute video explaining it.
The results
As you can see above, the model did a fairly good job correctly predicting results from all seasons (S5 – S24 at least). The box plot, for those unfamiliar, shows the range (the grey bars), the third quartile (top of the colored box), the median (the line in the colored box), and the first quartile (the bottom of the colored box). It’s a useful way to check the difference between categories – in this case, the ‘true wins’ and the ‘true losses’ versus the predictions.
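For anyone who wants to recreate this style of chart, a minimal version looks like the following, with synthetic prediction numbers standing in for the model’s actual output:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

# Illustrative predicted win probabilities grouped by true outcome.
rng = np.random.default_rng(1)
pred_for_losses = rng.beta(2, 4, size=200)  # skewed low
pred_for_wins = rng.beta(4, 2, size=200)    # skewed high

fig, ax = plt.subplots()
ax.boxplot([pred_for_losses, pred_for_wins])
ax.set_xticklabels(["true losses", "true wins"])
ax.set_ylabel("predicted win probability")
fig.savefig("boxplot.png")
```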
How about just for this season? Let’s take a look at both the overall results, and how each team has performed. Note that the result of -1 means the game is yet to be played.
Much less clean – although it’s done an admirable job predicting losses, it is underpredicting wins – tending to allow more false negatives than false positives. That said, the median of the true win category is still roughly at the third quartile of the true loss category, so it is not completely inaccurate.
This breaks the results down by team this season instead, where the green dots are true wins, the yellow dots true losses, and the blue dots the remaining games for each team.
What about the games that went the complete opposite way of the prediction? Here is a breakdown of the league’s biggest steals (low prediction, ended up as a win) and biggest fumbles (high prediction, ended up as a loss).
Biggest steals:
- Philadelphia Liberty @ Austin Copperheads (Week 2). PHI predicted to win at 0.1969. PHI wins by a score of 30-27.
- Sarasota Sailfish @ Arizona Outlaws (Week 9). SAR predicted to win at 0.2057. SAR wins by a score of 37-29.
- Arizona Outlaws @ Orange County Otters (Week 13). AZ predicted to win at 0.2094. AZ wins by a score of 29-19.
- Berlin Fire Salamanders @ Philadelphia Liberty (Week 4). BER predicted to win at 0.2130. BER wins by a score of 33-31.
Biggest fumbles:
- Yellowknife Wraiths @ Baltimore Hawks (Week 11). BAL predicted to win at 0.8408. YKW ends up winning by a score of 9-3.
- Yellowknife Wraiths @ Philadelphia Liberty (Week 6). PHI predicted to win at 0.8359. YKW ends up winning by a score of 28-24.
- Yellowknife Wraiths @ Berlin Fire Salamanders (Week 13). BER predicted to win at 0.8351. YKW ends up winning by a score of 33-30.
- Baltimore Hawks @ Honolulu Hahalua (Week 3). HON predicted to win at 0.7904. BAL ends up winning by a score of 30-24.
- Baltimore Hawks @ Berlin Fire Salamanders (Week 5). BER predicted to win at 0.7069. BAL ends up winning by a score of 23-14.
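Pulling out the steals and fumbles is just a filter-and-sort over the per-game predictions, something like this (matchups and probabilities below are a small illustrative sample, taken loosely from the lists above):

```python
import pandas as pd

# Hypothetical per-game frame: model's predicted win probability for
# the listed side, and whether that side actually won.
games = pd.DataFrame({
    "matchup":  ["PHI @ AUS", "YKW @ BAL", "SAR @ AZ", "CHI @ NYS"],
    "pred_win": [0.1969, 0.8408, 0.2057, 0.55],
    "won":      [1, 0, 1, 1],
})

# Steals: games won despite a low predicted win probability.
steals = games[games["won"] == 1].nsmallest(3, "pred_win")
# Fumbles: games lost despite a high predicted win probability.
fumbles = games[games["won"] == 0].nlargest(3, "pred_win")

print(steals["matchup"].tolist())
print(fumbles["matchup"].tolist())
```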
So who has the best coaches and the worst sim luck? We can compare actual wins versus predicted wins.
Team – Actual Wins – Predicted Wins
Colorado Yeti – 12 – 7.93
Orange County Otters – 11 – 8.11
Sarasota Sailfish – 10 – 6.72
San Jose Sabercats – 9 – 6.02
Honolulu Hahalua – 8 – 7.68
Arizona Outlaws – 8 – 6.52
Baltimore Hawks – 8 – 6.42
Chicago Butchers – 8 – 6.07
Philadelphia Liberty – 7 – 7.04
Austin Copperheads – 6 – 7.89
New Orleans Second Line – 6 – 7.68
Yellowknife Wraiths – 6 – 6.12
Berlin Fire Salamanders – 3 – 6.82
New York Silverbacks – 3 – 5.307
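The predicted-wins column is simply the sum of each team’s per-game win probabilities, so the over/under-performance is a one-line delta. A quick sketch using a few rows from the table above:

```python
import pandas as pd

# A few teams from the table above: actual wins vs summed predicted
# win probabilities.
table = pd.DataFrame({
    "team": ["Colorado Yeti", "Sarasota Sailfish", "Berlin Fire Salamanders"],
    "actual": [12, 10, 3],
    "predicted": [7.93, 6.72, 6.82],
})
# Positive delta = outperforming the model; negative = bad sim luck.
table["delta"] = table["actual"] - table["predicted"]
print(table.sort_values("delta", ascending=False))
```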
The best coaches, according to the model, are Sarasota, with roughly three more wins than the model projected, and San Jose likewise; and of course the league leaders Colorado and Orange County, who, despite having some of the highest predictions (right around Honolulu), still outperformed them. A reasonable outcome given that the model tends to view everyone as roughly equal.
The worst sim luck goes to our expansion teams, and specifically Berlin, who – according to the model – should have fared much better this season (at least 4 wins better). Austin and New Orleans were also predicted to do better than they have so far.
We can actually do the same analysis for all teams from S5 through S24: a historical look at who was predicted to do better than they actually have, across all seasons.
So you can see that despite Orange County having more wins than predicted, they’re still predicted to have the most wins by a pretty safe margin. Philadelphia is the oldest team with a predicted win count greater than their actual win count. And Austin and Chicago are a tale of two completely different stories – Austin (slightly) overperforming the model’s predictions and Chicago significantly underperforming how good the model expected them to be.
And, finally, we can review tonight’s games and ask the model who is predicted to win each one.
- Orange County Otters (0.3016) @ Austin Copperheads (0.5965). The model favors Austin at home by roughly 29%.
- San Jose Sabercats (0.3574) @ New York Silverbacks (0.4818). Slim margin here as the model favors New York at home by just 13%.
- Honolulu Hahalua (0.2959) @ Arizona Outlaws (0.5002). A 21% margin favors the home team in Arizona.
- New Orleans Second Line (0.4484) @ Berlin Fire Salamanders (0.5057). Berlin vs NOLA is the closest game in the slot, with just 6% differentiating their win probability.
- Colorado Yeti (0.2514) @ Philadelphia Liberty (0.4500). An interesting mix of probabilities, as Colorado only is predicted to have a 25% chance of winning, but the Liberty are under 50% themselves. A 20% margin for Philly. It’s interesting to note that, for whatever reason, this Philadelphia game was predicted as Colorado’s lowest predicted win probability for the entire season.
- Sarasota Sailfish (0.4243) @ Baltimore Hawks (0.5866). An interesting result for the opposite reason – the Sailfish have a decent shot at winning (42%) but it is still a 16% margin for Baltimore.
- Chicago Butchers (0.5726) @ Yellowknife Wraiths (0.6498). The second closest margin between teams, both are technically predicted to win – but Yellowknife holds a slight edge by 7%.
Some future ideas for the model include testing roster composition, and I plan on incorporating it into some draft analysis – how player additions and subtractions might impact overall win probability. And while I’m happy with a 75-80% area under the curve (AUC – a more robust cousin of plain accuracy, hence the quotes around ‘accuracy’ earlier), there is certainly room for growth in the model, including testing new layers and machine learning models. Hope you enjoyed!
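For the curious, AUC measures how often a randomly chosen true win is scored higher than a randomly chosen true loss, independent of any 0.5 cutoff. A quick illustration with synthetic labels and scores (not the model’s actual predictions):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Synthetic example: noisy scores that lean toward the true label.
rng = np.random.default_rng(2)
y_true = rng.integers(0, 2, size=500)
y_score = np.clip(y_true * 0.4 + rng.normal(0.3, 0.25, size=500), 0, 1)

print(round(roc_auc_score(y_true, y_score), 3))
```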
Words: 2159 and research, data vis, etc