Prediction Evaluation

Mena Solomon

2024/11/13

Post-election Reflection

It has now been almost two weeks since Donald Trump won the 2024 election, so the time has come to evaluate my prediction model and attempt to understand where I fell short. To begin, here is another look at my prediction:

My 2024 Prediction Model

My prediction model utilized an elastic-net regularized regression of four key predictor variables: the interaction effect between Q2 GDP growth and incumbency, the latest FiveThirtyEight weighted poll averages, two-party vote share lagged one cycle, and two-party vote share lagged two cycles. In choosing these variables, I hoped to build a relatively simple model which could apply to an election as unusual as this one. For a full explanation of each variable as well as my justification for its inclusion, refer to my week nine blog post linked here.

I utilized an elastic-net regularized regression to control for multicollinearity and overfitting while maintaining my model’s simplicity, transparency, and generalizability. Once again, my week nine blog post provides a more in-depth explanation for my choice of model.
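For concreteness, here is a minimal sketch of what such a fit might look like. It uses Python’s scikit-learn rather than my actual pipeline (the real code is on GitHub), and the file and column names are placeholders standing in for my data and four predictors:

```python
import pandas as pd
from sklearn.linear_model import ElasticNetCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical column names standing in for my four predictors.
FEATURES = [
    "gdp_q2_x_incumbent",  # Q2 GDP growth x incumbency interaction
    "poll_avg_latest",     # latest FiveThirtyEight weighted poll average
    "vote_share_lag1",     # two-party vote share, lagged one cycle
    "vote_share_lag2",     # two-party vote share, lagged two cycles
]

df = pd.read_csv("state_level_training_data.csv")  # placeholder path

# Elastic net mixes the L1 (lasso) and L2 (ridge) penalties; cross-validation
# selects the overall penalty strength for a fixed mixing parameter.
model = make_pipeline(
    StandardScaler(),
    ElasticNetCV(l1_ratio=0.5, cv=5, random_state=42),
)
model.fit(df[FEATURES], df["dem_two_party_vote_share"])
```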

Finally, since my predicted Democratic and Republican two-party vote shares summed to slightly more than 100%, I normalized the two predictions against each other.
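A minimal sketch of that normalization step, assuming the raw per-state predictions sit in a pandas DataFrame `preds` with hypothetical columns `dem_raw` and `rep_raw`:

```python
# Rescale the raw party predictions so they sum to exactly 100% in each state.
total = preds["dem_raw"] + preds["rep_raw"]
preds["dem_norm"] = 100 * preds["dem_raw"] / total
preds["rep_norm"] = 100 * preds["rep_raw"] / total
```

My normalized results are summarized in the table below: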

| State | Democratic Prediction | Republican Prediction | Winner |
|---|---|---|---|
| Arizona | 49.01264 | 50.98736 | Trump |
| California | 63.23344 | 36.76656 | Harris |
| Colorado | 55.64653 | 44.35347 | Harris |
| Florida | 47.18219 | 52.81781 | Trump |
| Georgia | 49.32191 | 50.67809 | Trump |
| Indiana | 41.54380 | 58.45620 | Trump |
| Maryland | 64.87031 | 35.12969 | Harris |
| Massachusetts | 64.61116 | 35.38884 | Harris |
| Michigan | 50.52614 | 49.47386 | Harris |
| Minnesota | 52.82845 | 47.17155 | Harris |
| Missouri | 42.63210 | 57.36790 | Trump |
| Montana | 40.92078 | 59.07922 | Trump |
| Nebraska | 40.98905 | 59.01095 | Trump |
| Nevada | 50.10748 | 49.89252 | Harris |
| New Hampshire | 52.52822 | 47.47178 | Harris |
| New Mexico | 53.92110 | 46.07890 | Harris |
| New York | 59.57050 | 40.42950 | Harris |
| North Carolina | 49.18136 | 50.81864 | Trump |
| Ohio | 45.70131 | 54.29869 | Trump |
| Pennsylvania | 49.96810 | 50.03190 | Trump |
| South Carolina | 43.84510 | 56.15490 | Trump |
| Texas | 46.29662 | 53.70338 | Trump |
| Utah | 38.53670 | 61.46330 | Trump |
| Virginia | 53.59095 | 46.40905 | Harris |
| Washington | 59.57236 | 40.42764 | Harris |
| Wisconsin | 50.20683 | 49.79317 | Harris |

My model predicted that Harris would win Michigan, Nevada, and Wisconsin by slim margins, while Trump would win Arizona, Pennsylvania, Georgia, and North Carolina by similarly slim margins. This would have resulted in a Trump victory with 281 electoral votes to Harris’ 257.

The 2024 Election Results

Early Wednesday morning the 2024 election was called for Former President Donald Trump after he managed to clinch Georgia, North Carolina, and Pennsylvania. In the hours that followed, the rest of the swing states — Michigan, Wisconsin, Arizona, and Nevada — were called for Donald Trump as well. This means that Trump won the election with 312 electoral votes compared to Vice President Harris’ 226 electoral votes.

The overwhelming shift toward Donald Trump and the Republican Party was seen across the United States, as outlined in the map below:

Comparing My Predictions with the Actual Results

I predicted that Trump would narrowly beat Harris with 281 electoral votes by winning Georgia, North Carolina, and Pennsylvania by incredibly slim margins. Instead, Trump managed to win every one of the seven swing states. The table below compares my predicted vote margins with the reality in each swing state.

| State | 2024 Republican Vote Share | Republican Prediction | 2024 Democratic Vote Share | Democratic Prediction | Winner |
|---|---|---|---|---|---|
| Georgia | 51.12155 | 50.67809 | 48.87845 | 49.32191 | Trump |
| Pennsylvania | 51.01765 | 50.03190 | 48.98235 | 49.96810 | Trump |
| Nevada | 51.62171 | 49.89252 | 48.37829 | 50.10748 | Trump |
| Michigan | 50.69844 | 49.47386 | 49.30156 | 50.52614 | Trump |
| Arizona | 52.84747 | 50.98736 | 47.15253 | 49.01264 | Trump |
| Wisconsin | 50.46844 | 49.79317 | 49.53156 | 50.20683 | Trump |
| North Carolina | 51.70111 | 50.81864 | 48.29889 | 49.18136 | Trump |

To better understand the magnitude and direction of this error, I have outlined my model’s prediction error clearly in this table:

| State | 2024 Democratic Vote Share | Democratic Prediction | Democratic Vote Share Error |
|---|---|---|---|
| Georgia | 48.87845 | 49.32191 | -0.4434620 |
| Pennsylvania | 48.98235 | 49.96810 | -0.9857550 |
| Nevada | 48.37829 | 50.10748 | -1.7291875 |
| Michigan | 49.30156 | 50.52614 | -1.2245852 |
| Arizona | 47.15253 | 49.01264 | -1.8601160 |
| Wisconsin | 49.53156 | 50.20683 | -0.6752730 |
| North Carolina | 48.29889 | 49.18136 | -0.8824625 |

As the tables above indicate, my model overpredicted for Harris in every swing state.

Calculating My Model’s Error

To better understand the error in my prediction model, I will utilize both graphics and calculations of bias and error.

The graphic above indicates that in every state except Washington, Utah, and Colorado, Trump outperformed the model. This points to a systematic model bias in favor of Harris, which I will attempt to understand and explore below.
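While the map itself was produced elsewhere, here is a minimal sketch of how such a bias map can be drawn, using the seven swing-state errors from the error table above (plotly is an assumption on my part; the actual plotting code is on GitHub):

```python
import pandas as pd
import plotly.express as px

# Democratic vote-share errors (actual minus predicted) from the table above.
errors_df = pd.DataFrame({
    "state_abbrev": ["GA", "PA", "NV", "MI", "AZ", "WI", "NC"],
    "dem_vote_share_error": [-0.44346, -0.98576, -1.72919, -1.22459,
                             -1.86012, -0.67527, -0.88246],
})

fig = px.choropleth(
    errors_df,
    locations="state_abbrev",
    locationmode="USA-states",
    color="dem_vote_share_error",
    color_continuous_scale="RdBu",  # red: Trump outperformed the prediction
    range_color=(-2, 2),            # keep the diverging scale centered at zero
    scope="usa",
    title="Democratic Two-Party Vote Share Prediction Error",
)
fig.show()
```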

Table 1: Model Evaluation Metrics for 2024 Democratic Vote Share Predictions

| Metric | Value |
|---|---|
| Bias | -1.4362 |
| Mean Squared Error (MSE) | 3.3204 |
| Root Mean Squared Error (RMSE) | 1.8222 |
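These metrics follow their standard definitions: bias is the mean prediction error (actual minus predicted), MSE is the mean of the squared errors, and RMSE is its square root. A minimal sketch, assuming `actual` and `predicted` are arrays of Democratic two-party vote shares:

```python
import numpy as np

def evaluation_metrics(actual, predicted):
    """Bias, MSE, and RMSE, with errors defined as actual minus predicted."""
    errors = np.asarray(actual) - np.asarray(predicted)
    bias = errors.mean()          # negative => the model overpredicted Harris
    mse = np.mean(errors ** 2)
    rmse = np.sqrt(mse)
    return bias, mse, rmse
```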

Correcting my Model to be Fully Saturated

After completing my prediction model, I realized that when including the interaction effect between Q2 GDP growth and incumbency, it is essential to also include the individual variables (Q2 GDP growth and incumbency) to ensure that the model is fully saturated. A fully saturated model accounts for the main effects of each variable alongside their interaction, which is a standard practice in data science to avoid omitting key predictors. I have decided to address this oversight by updating my model and re-evaluating its errors. This will ensure that any hypotheses I make regarding the model’s inaccuracies are based on a version of the model that adheres to proper data science principles and standards.
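Concretely, the correction simply adds the two main effects alongside their product. Continuing the earlier sketch (same hypothetical DataFrame, column names, and pipeline):

```python
# A fully saturated specification: both main effects plus their interaction.
df["gdp_q2_x_incumbent"] = df["gdp_q2_growth"] * df["incumbent"]

FEATURES_SATURATED = [
    "gdp_q2_growth",       # main effect: Q2 GDP growth
    "incumbent",           # main effect: incumbency indicator (0/1)
    "gdp_q2_x_incumbent",  # interaction between the two
    "poll_avg_latest",
    "vote_share_lag1",
    "vote_share_lag2",
]
model.fit(df[FEATURES_SATURATED], df["dem_two_party_vote_share"])
```

The corrected swing-state predictions are summarized below: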

| State | Democratic Prediction | Republican Prediction | Winner |
|---|---|---|---|
| Arizona | 47.93258 | 52.06742 | Trump |
| Georgia | 48.32418 | 51.67582 | Trump |
| Michigan | 49.37494 | 50.62506 | Trump |
| Nevada | 48.76123 | 51.23877 | Trump |
| North Carolina | 47.95686 | 52.04314 | Trump |
| Pennsylvania | 48.74113 | 51.25887 | Trump |
| Wisconsin | 48.92938 | 51.07062 | Trump |

As displayed above, had I correctly included Q2 GDP growth, incumbency, and the interaction effect between them in a fully saturated version of my model, I would have accurately predicted a Trump victory in all seven swing states.

Re-Calculating Model Error

To evaluate the performance of this adjusted model, I will once again run through the steps I took above when initially evaluating my model. First, I have included a comparison of my predictions and the actual 2024 election results.

| State | 2024 Republican Vote Share | Republican Prediction | 2024 Democratic Vote Share | Democratic Prediction | Winner |
|---|---|---|---|---|---|
| Georgia | 51.12155 | 51.67582 | 48.87845 | 48.32418 | Trump |
| Pennsylvania | 51.01765 | 51.25887 | 48.98235 | 48.74113 | Trump |
| Nevada | 51.62171 | 51.23877 | 48.37829 | 48.76123 | Trump |
| Michigan | 50.69844 | 50.62506 | 49.30156 | 49.37494 | Trump |
| Arizona | 52.84747 | 52.06742 | 47.15253 | 47.93258 | Trump |
| Wisconsin | 50.46844 | 51.07062 | 49.53156 | 48.92938 | Trump |
| North Carolina | 51.70111 | 52.04314 | 48.29889 | 47.95686 | Trump |

To better understand the magnitude and direction of this error, I have outlined my model’s prediction error clearly in this table:

| State | 2024 Democratic Vote Share | Democratic Prediction | Democratic Vote Share Error |
|---|---|---|---|
| Georgia | 48.87845 | 48.32418 | 0.5542656 |
| Pennsylvania | 48.98235 | 48.74113 | 0.2412200 |
| Nevada | 48.37829 | 48.76123 | -0.3829450 |
| Michigan | 49.30156 | 49.37494 | -0.0733865 |
| Arizona | 47.15253 | 47.93258 | -0.7800495 |
| Wisconsin | 49.53156 | 48.92938 | 0.6021822 |
| North Carolina | 48.29889 | 47.95686 | 0.3420382 |

The corrected version of my model accurately predicted a Trump sweep of the swing states, with the error in Democratic vote share under one percentage point in every state. To visually represent my model’s bias in each state, I plotted the errors on a map similar to the one pictured above.

As can be seen in the map above, my model does not appear to be biased in either direction: around half of the predicted states were biased toward Trump and the other half toward Harris. This is very different from my less-saturated model above, which systematically underpredicted Trump’s support. Furthermore, the lighter shades of both colors represent the model’s enhanced accuracy in predicting the actual Democratic two-party vote share in each state.

To better understand the bias and error of this adjusted model mathematically, I conducted the same calculations as above on the fully saturated version of my model:

Table 2: Model Evaluation Metrics for 2024 Democratic Vote Share Predictions

| Metric | Value |
|---|---|
| Bias | -0.2736 |
| Mean Squared Error (MSE) | 1.2188 |
| Root Mean Squared Error (RMSE) | 1.1040 |

Understanding Why My Initial Model Failed (and Why My Corrected Model Succeeded)

Directly after the election, scholars, journalists, forecasters, and politicians scrambled to make sense of Trump’s overwhelming success in what was expected to be an incredibly close election. I will now attempt to understand why my initial model fell short of accurately predicting the results, why I believe my corrected model had more success, and how my model’s accuracy could be improved in the future.

Why did my Initial Model Fail?

As I mentioned previously, my initial model fell short of accurately predicting the results of the 2024 election because it was not fully saturated. While I included the interaction effect between Q2 GDP growth and incumbency, I failed to include these two variables independently as part of the model. This omission is problematic because, without the individual terms for Q2 GDP growth and incumbency, the interaction effect cannot be properly interpreted. By excluding these base variables, my model lacked the saturation needed to accurately capture the unique influence of the interaction effect on the predictions. This inevitably distorted the relationships between my variables and ultimately contributed to inaccuracies in my results.

This error stemmed partly from a lack of understanding, but I also have to admit that my decision to code Q2 GDP growth and incumbency solely as an interaction effect was influenced by the model’s initial predictions. When I included both variables independently, the model predicted a landslide victory for Donald Trump—a result that I found difficult to reconcile with my personal expectations and assumptions. In an effort to make the model’s predictions align more closely with what I believed to be plausible, I restructured it to focus on the interaction effect alone. This adjustment produced results where Harris won key states like Michigan, Wisconsin, and Nevada—an outcome I found easier to accept. In hindsight, this decision compromised the model’s integrity, as it prioritized aligning with my beliefs over adhering to principles of data science which I claimed to be committed to.

I believe that this is a learning opportunity for me as I move forward in the field of data science and forecasting. Remaining unbiased is incredibly difficult; however, without understanding and removing my personal bias, I may end up ruining what would have been an accurate prediction.

Why was my Fully Saturated Model so Accurate?

Predictive Variables: When deciding on which variables I wanted to include in my final model, I relied heavily on three understandings —

  1. Elections are often determined by fundamental variables; among these, Q2 GDP growth, interacted with incumbency, captures the most salient economic trend for which voters hold an incumbent accountable (Blog Post 2).
  2. Polls taken right before the election are the most accurate way to take the temperature of the American electorate. Given that my final prediction was to be made the day before Election Day, I assumed that focusing most intensely on the latest poll averages would allow the model to capture how Americans felt immediately before the election (Blog Post 3).
  3. Donald Trump’s impact on American politics has been so unprecedented and volatile that, for the past two election cycles, it has repeatedly confounded pollsters, forecasters, and political scientists. Traditional models often seem unable to fully capture the psychology and motivations of Trump voters, leading to a consistent underprediction and underpolling of his support across the United States. In the context of my predictive election model, I believe that including lagged vote share—which, for the 2024 election, incorporates data from the 2016 and 2020 elections—offers a way to partially account for this effect. By reflecting historical patterns of voter behavior, particularly the surge in support Trump garnered in those years, lagged vote share provides an important anchor in history.

By focusing on a few, carefully chosen variables—each supported by a clear theoretical rationale—I reduced the risk of overfitting and avoided unnecessary complexity. This approach allowed the model to focus on the most critical factors while maintaining a degree of generalizability across states and election cycles. Additionally, the integration of variables like incumbency, GDP growth, and lagged vote share offered a layered understanding of voter behavior, blending long-term trends with immediate dynamics.

Procedure: I believe my model’s accuracy in predicting the 2024 results lies in its simplicity, transparency, and reliance on fundamental variables. By using a regularized regression model instead of more complex ensemble or machine learning methods, I minimized opportunities for model bias and ensured that my predictions were directly interpretable and grounded in clear assumptions. The simplicity of this approach not only reduced overfitting but also enhanced transparency and generalizability, making the model accessible to broader audiences and its predictions robust despite limited data. In an era of skepticism about election integrity, this straightforward approach underscores the importance of transparency and interpretability in fostering public confidence in election forecasts.

How Can I Improve my Model for the Future?

1. Harris Specific Bias: One of my primary assumptions when creating my model was that Harris was a fundamentally different candidate than Biden. Based on this assumption, I specifically excluded Biden’s presidential approval ratings and aggregate polling data from when Biden was the candidate. Furthermore, I avoided over-including economic fundamental interaction effects, which I believed were more attributable to Biden’s presidency than to Harris’s candidacy.

However, in the aftermath of the election, I have come to realize that many voters in the American electorate cast their votes as a referendum on the Biden administration and the Democratic Party as a whole. This dissatisfaction, particularly with Biden’s handling of key issues, appears to have influenced voter decisions in ways that my model failed to account for (New York Times, 2024). My model did not sufficiently incorporate variables that could capture how dissatisfaction with a sitting president shapes voting behavior, such as June presidential approval ratings or economic indicators that voters might attribute to Biden’s policies. This led to a blind spot in understanding how incumbency effects—typically driven by voter perceptions of the incumbent administration—would influence the election results.

Were I to revise this model, I would prioritize including June presidential approval ratings as well as key economic indicators like inflation, wage growth, and unemployment. These variables could help capture voter dissatisfaction with the incumbent administration and better account for the complex role of incumbency and presidential approval in shaping election outcomes.

2. Polling: After three consecutive Trump elections, scholars can now confidently conclude that Trump underpolls. As the Brookings Institution concluded days after the election, “today’s polling instruments and techniques are not well designed to measure the kinds of voters for which Trump has a distinctive appeal” (Brookings, 2024). While I am not a pollster and thus cannot fix these biases for the sake of my forecast, I believe there are two ways to account for this bias.

First, in my model I utilized FiveThirtyEight’s weighted poll averages. FiveThirtyEight allocates weights based on the historical credibility and accuracy of each pollster, thereby attempting to exclude biased and untrustworthy polls. While this is important in theory, in this past election aggregators like Real Clear Politics, which do not weight their polls, performed better (Real Clear Politics, 2024). I hypothesize that by relying heavily on FiveThirtyEight’s weighted averages, my model may have unintentionally incorporated some of the biases introduced by poll herding, ultimately narrowing the scope of information used to predict voter sentiment. To better understand and, if needed, correct this bias, I would have liked to test a model using the raw latest poll averages. Unfortunately, this data is not readily available, so I cannot perform that test.

Second, I believe that unless this systematic error is corrected, it may be important in the future to rely less heavily on polling data. In a model which only includes four variables, polling plays an important role in determining my prediction. If this systematic error persists, however, it may be important to underweight the latest polls in future models so as to minimize the bias transferred through this variable. This can be done either by including more fundamental variables, which I have emphasized elsewhere, or by using a weighting technique, as sketched below.
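As one illustration of such a weighting technique (which I have not tested), one could fit a polls-inclusive model and a fundamentals-only model separately and blend their predictions, shrinking the weight on the polling model. All names below are hypothetical:

```python
# Hypothetical ensemble that underweights the polling signal.
# model_with_polls and model_fundamentals are assumed to be fitted regression
# pipelines (e.g., the elastic nets sketched earlier), the first trained with
# the poll-average feature and the second without it.
POLL_WEIGHT = 0.3  # illustrative choice; a real value would be cross-validated

def blended_prediction(X_polls, X_fundamentals):
    """Weighted average of a polls-inclusive and a fundamentals-only model."""
    return (POLL_WEIGHT * model_with_polls.predict(X_polls)
            + (1 - POLL_WEIGHT) * model_fundamentals.predict(X_fundamentals))
```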

3. The Economy: To prioritize simplicity and account for the Harris effect (as described above), I refrained from including economic fundamentals beyond the interaction between Q2 GDP growth and incumbency. While this decision helped my model avoid overfitting and multicollinearity, it may have overlooked other critical economic factors that shaped voter behavior. For instance, Q2 GDP growth captures macroeconomic trends but might fail to reflect the tangible, personal economic pressures voters experience. This raises the question of how strongly individuals feel GDP growth compared to more immediate and visible economic indicators like inflation, wage growth, and/or unemployment.

One of the dominant narratives in this election has been how Biden-era inflation eroded Harris’s prospects, highlighting the importance of personal economic circumstances in shaping voter decisions (New York Times, 2024). To refine my model, I would be curious to re-run it with the inclusion of personal economic indicators such as inflation, wage growth, and unemployment—potentially incorporating these as interaction effects. These variables may provide a more nuanced understanding of voter temperament and how economic realities translate into electoral outcomes.

Final Reflections

I am incredibly proud of the performance of my final model, which, when fully saturated, was able to correctly predict the 2024 election results. By incorporating key variables such as incumbency, GDP growth, lagged vote share, and polling data, I was able to provide a robust prediction that aligned closely with the actual outcome. This success underscores the importance of including fundamental variables, minimizing bias, and maintaining a transparent and interpretable approach to forecasting.

However, in reflecting on this process, I believe there is still room for improvement. Moving forward, it will be important to better understand and account for polling biases and the role of economic indicators in shaping voter sentiment. Given the complexities of modern elections, refining how I incorporate these factors will be essential to improving the accuracy and reliability of my predictions in future models.

Notes

All code above is accessible via GitHub.

Sources

“Trump’s Win: Why It Wasn’t a Surprise.” The New York Times, November 7, 2024. https://www.nytimes.com/2024/11/07/us/politics/trump-win-election-harris.html.

“The Polls Underestimated Trump’s Support Again.” Brookings, November 2024. https://www.brookings.edu/articles/the-polls-underestimated-trumps-support-again/.

“2016 Presidential Election Results.” 270toWin, 2024. https://www.270towin.com/2016_Election/.

“2020 Presidential Election Results.” 270toWin, 2024. https://www.270towin.com/.

“The States with the Most Accurate Polls in the 2024 Election.” ABC News, 2024. https://abcnews.go.com/538/states-accurate-polls/story?id=115108709.

Data Sources

US Presidential Election Popular Vote Data from 1948-2020 provided by the course. Economic data from the U.S. Bureau of Economic Analysis, also provided by the course. Polling data sourced from FiveThirtyEight.