Data Aggregation, Simpson's Paradox, and Bad Causal Analysis

Last week I noticed this tweet from Chris Hayes:

At first glance, this looks like a remarkable result. According to it, during the last 35 years Democratic presidents have “created” about 25x the jobs that Republican presidents have. Chris goes on to insinuate that while this is a “small sample size,” the effect must not be random; there has to be some impact from the president. Unfortunately for Chris, we do not live in an elected monarchy. There are three branches of government, and Simon, whom he cites, failed to include the one that controls tax policy and budgets in his analysis. Additionally, Democratic presidents had a few more years to add those jobs: the period covers 16 years of Republican presidencies and 19 years of Democratic presidencies.

So how could we take this framework and make it more useful by accounting for the whole of elected government as well as the length of time each arrangement lasted? I went ahead and did that. Below you’ll see a chart that shows annualized net jobs created under each form of elected government control. You’ll see Democratic and Republican trifectas, where one party controlled each arm of elected government. You’ll see Democratic and Republican Congresses, where a party controlled the legislature but not the presidency. And you’ll see what happened when Congress was under split House/Senate control. I’ve annualized the figures by dividing the net jobs created under each arrangement by the number of years that arrangement existed between 1989 and the end of 2023.
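The annualization step is simple arithmetic, and a small sketch shows why it matters. The job totals below are made-up placeholders, not real employment data, and the `Arrangement` class and the labels “A” and “B” are hypothetical names used only for illustration. Only the 19-year and 16-year spans come from the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class Arrangement:
    """One form of government control and what happened under it."""
    name: str
    net_jobs_millions: float  # placeholder total, NOT real data
    years: float              # how long the arrangement lasted


def annualized(arrangement: Arrangement) -> float:
    """Net jobs per year: total net jobs divided by years in effect."""
    return arrangement.net_jobs_millions / arrangement.years


# Hypothetical totals; only the year counts (19 and 16) come from the post.
a = Arrangement("A", net_jobs_millions=38.0, years=19)
b = Arrangement("B", net_jobs_millions=16.0, years=16)

raw_ratio = a.net_jobs_millions / b.net_jobs_millions
annualized_ratio = annualized(a) / annualized(b)

print(f"raw ratio of totals: {raw_ratio:.3f}")
print(f"ratio of annualized rates: {annualized_ratio:.3f}")
```

With these placeholder numbers the gap shrinks once the extra three years are accounted for. Dividing by years does not explain away a large gap on its own, but it removes one source of distortion before any comparison is made.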

If you wanted to draw specious conclusions from this chart, you’d say that the best-case scenario for the American economy is a Democratic president with full Republican control of the House and Senate. You’d also say the situation we want to avoid at all costs is a split Congress with a Republican president (for what it’s worth, the only time this situation happened in this sample was when the world shut down for COVID). If you threw away divided Congress as a dysfunctional government with little control over anything, you’d conclude that Democrats should never control Congress unless they control the presidency as well. If you explained away the poor performance of a split Congress with a Republican president, or of a Democratic Congress, because both were hit by massive economic shocks, you’d conclude there is not much difference between Republican control and Democratic control.

None of these are great conclusions, but they are better informed than the conclusions Chris drew from Simon’s analysis. The problem with Simon’s analysis, which I corrected slightly, though inadequately, is that when we aggregate data like this we lose fidelity and context. Chris’s tweet is an excellent illustration of how Simpson’s Paradox can cause you to draw false conclusions about data. The Wikipedia article gives far more detail on Simpson’s Paradox, but ultimately it is about how a pattern in aggregated data can weaken, disappear, or even reverse once we account for more confounding variables, and it illustrates how we can draw incorrect conclusions by failing to account for important ones. When analyzing data, we need to account for the many variables that could impact our outcome. Chris looks at only one variable within the government. I look at a couple more, all still within the government, and still far too few to account for, broadly, the state of the economy.

The conspiracy-minded will try to tell you that the media does this on purpose, and there certainly are outlets that are, effectively, propaganda operations. However, I think most big media miss the mark on this kind of analysis because of the way they deliver information. The format demands bite-size segments of information, which makes it impossible to explain a large, interconnected set of variables that could impact an outcome. The audience itself demands these simplistic conclusions. Fans of Chris Hayes got exactly what they wanted from Simon’s chart: proof positive that a Democratic president is the surefire path to job growth.
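To make the Simpson’s Paradox point concrete, here is a minimal sketch using entirely synthetic numbers. The party labels “P” and “Q”, the “normal” versus “shock” split, and every jobs figure are invented for illustration and are not drawn from any real dataset. The only thing it is meant to show is the mechanism: an aggregate comparison can point one way while every subgroup points the other.

```python
from collections import defaultdict
from statistics import mean

# One record per year: (party, economic_condition, net_jobs_millions).
# All values are synthetic. P mostly governs in normal years,
# Q mostly governs in years hit by an economic shock.
records = (
    [("P", "normal", 2.0)] * 8 + [("P", "shock", -4.0)] * 2
    + [("Q", "normal", 2.5)] * 2 + [("Q", "shock", -3.0)] * 8
)


def mean_by(rows, key):
    """Average yearly net jobs for each group defined by `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {group: mean(values) for group, values in groups.items()}


# Aggregated view: one number per party.
overall = mean_by(records, key=lambda r: r[0])

# Stratified view: one number per (party, condition) pair.
stratified = mean_by(records, key=lambda r: (r[0], r[1]))

print("overall:", overall)
for group in sorted(stratified):
    print(group, stratified[group])
```

In the aggregated view P looks far better than Q, yet within normal years and within shock years Q does better than P. The reversal comes purely from who happened to be in office when the shocks landed, which is the confounding variable the aggregate hides.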

Political media and politicians themselves believe that political acts are the primary drivers of economic outcomes. The media must believe it because that’s what they report on, and the stakes have to be high. Politicians must believe it because their egos are immense and they need to constantly think about, and be reminded of, their own importance. But this is clearly untrue. The broad history of the world shows us that economic growth is mostly a matter of resource access and innovation.

The reality is that, under our system, the government has fairly limited control over the month-to-month or year-to-year state of the economy. As of right now, the key economic policies that the Biden administration passed, the CHIPS Act and the Jobs Act, have barely generated any economic activity. If they are successfully implemented, whoever wins in November will surely reap the effects of those laws in the monthly jobs figures, whether they were involved in creating the laws or not. Government can certainly work to avoid or react to catastrophe, and there were meaningful lessons learned from our response to the financial crisis in 2007 and our response to the COVID disruption in 2020. In the latter case, aid was deployed much more rapidly and widely than the targeted aid that was used in 2007-2008. The results speak for themselves, and it’s clear we’ll learn even more lessons going forward about how to handle similar situations in ways that might mitigate the kind of inflationary impact our COVID response had.

I am not suggesting that the current government plays no part in long-term economic growth, but the real impact of government on the economy has a substantial lag. Decisions the government makes can create conditions where innovation can thrive or fail, but the government’s most proximate, day-to-day impact on the economy is as a proactive or reactive backstop for economic failures. Rather than attributing the latest jobs data to the current president, we should recognize that those results are an amalgamation of thousands of variables that drive the economy, some of which were set in motion years and years ago. There are plenty of other outcomes presidents deserve credit or criticism for.

I hope we can, someday, move away from these simplistic and false causal analyses of the monthly jobs report. It would create more space for richer conversations about the philosophy behind the way we govern and organize to create economic value for everyone.
