
Welcome back. We're still in week six. This is lecture two of five.

In this section, we're going to talk about the idea of counterfactuals, a critically important idea for understanding causal inference in social epidemiology. I want to point out that there's a lot of math one can do, an algebra, if you wish, of counterfactual thinking. I'm not going to go into that in this course. If you want to Google it, you can look at some of the citations and some of the literature in the readings for this course, but here I want to be as straightforward and simple as possible. Don't forget about the reading, by the way.

So first, some history. Where does this idea of the counterfactual come from? Most people say it comes from the early work of the philosopher David Hume. Writing in the 1700s, Hume was working on the problem of scientific induction: how do we know that anything causes something in the world? And, relatedly, how do we predict things in the world? Does the sun rising yesterday, and the day before, tell us anything about the sun rising tomorrow? Hume said no. Elsewhere in his work, while working on the very same problem, he described the key element of causation as the so-called "but for" condition. But for the virus, you would not have gotten the chest cold. But for the car accident, you would not have been injured. But for this course, you wouldn't learn a lot about social epidemiology.

The idea of "but for" went largely unaddressed for a couple of hundred years, until it was picked up again by the philosopher David Lewis and developed in his important 1973 text, Counterfactuals. Now, this is an intense text by a logician, someone who works on logic. You can read it if you wish; it's very difficult. But the first line of this important text always struck me as interesting.

He writes: if kangaroos had no tails, would they topple over? That is, but for the tail, would the kangaroo topple? From there he goes on to lots of mathematics and other things. But here is the philosopher Lewis saying, hey, counterfactuals are important, and here's how to think about them.

These ideas were picked up and advanced in a statistical framework, a research framework, by the statistician Don Rubin, who was at Harvard. Rubin was one of the first people in modern statistics to address causation and causality. He talked about these ideas in terms of potential outcomes, which is the same idea as counterfactuals, at least for our purposes here. Sometimes people call this the Rubin causal model. So there's lots of jargon, but it's really the same idea.

Recent work, particularly in social epidemiology, addresses what's called the closest-possible-world assumption. That is, how different a world can you imagine? Is it worth imagining a world with kangaroos and no tails, or is that just philosophical navel-gazing? When we talk about comparing things, or looking at the world in a different way, how far afield do we want to go? Is it worth, scientifically, imagining a world where everyone loved each other and took care of one another? That's a far place to go, and not very close to the world we live in. It might be better to imagine a closer world, one where we imagine different policies, say a change to Obamacare, the Affordable Care Act, which seems quite realistic, and then think about its impact on our health outcomes.

Now let me offer a pictorial, illustrative demonstration of counterfactuals. The first step is to imagine a person. Here is a simple black stick figure. This person is exposed to McDonald's, eats McDonald's, and after some period of time, this person gets a little thicker. His body mass index might have gone up.

The important point about counterfactuals is that we take that very same person, the black stick figure, at the very same time; we roll back the universal clock and imagine him in a world without McDonald's, and then we see what his BMI, or body mass index, is in that world without, or but for, McDonald's. In the imagined world, he's thinner. This suggests that in a world with McDonald's, the stick figure gets thick; in a world without McDonald's, the stick figure stays thin. The causal effect of McDonald's, since everything else is the same, is the difference in body mass index between these two diagrams. So what we end up doing when we do causal inference is compare the stick figure with McDonald's, thicker, to the stick figure without McDonald's, thinner. If the difference in BMI is 10 units, then we say the causal effect of McDonald's on BMI is 10 units.

Of course, it is not possible to observe the world with the black stick figure with McDonald's and, at the same time and in the same place, the world with the black stick figure without McDonald's. That's why we say counter to fact, or counterfactual. Only one of those scenarios is true: the black stick figure had McDonald's, or he didn't. What we do in the real world, where we can observe things, is find another person, in this case a blue stick figure. What we want to do is compare the outcome of the black stick figure with the outcome of the blue stick figure. One is exposed to McDonald's; one is not. The black stick figure is heavy; the blue stick figure is not. So now we're at the critical junction: is McDonald's causing the change in obesity? The answer is yes only if the black and blue stick figures are exchangeable, or otherwise identical. If black and blue are flip-flappable, exchangeable, identical, you pick the word, then we can say, yes, McDonald's is causing the change in obesity. But if, say, the blue stick figure exercises differently or has a different metabolism, then there's another explanation for the observed change in obesity. And that is a competing explanation, what epidemiologists call confounding.
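To make the arithmetic above concrete, here is a minimal sketch of the potential-outcomes bookkeeping in Python. The BMI values are invented for illustration; only the 10-unit difference comes from the example in the lecture.

```python
# Minimal sketch of the potential-outcomes ("Rubin") arithmetic.
# The BMI values are invented for illustration.

bmi_with_mcdonalds = 32     # Y(exposed): the world we actually observe
bmi_without_mcdonalds = 22  # Y(unexposed): the counterfactual world

# Individual causal effect = difference between the two potential outcomes.
causal_effect = bmi_with_mcdonalds - bmi_without_mcdonalds
print(causal_effect)  # 10 BMI units

# The catch: for any one person, only ONE of the two values is ever
# observable. The other is counter to fact, which is why a substitute
# (the blue stick figure) is needed.
```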

So, push comes to shove, this game of causal inference is all about comparisons. What we observe is, say, the black stick figure under one scenario. What we'd like is the same black stick figure under an alternative scenario, which we can't observe. That's the counterfactual. So we seek a blue stick figure to substitute for the unobservable black stick figure. To the extent that the blue and black are exchangeable, we can draw causal conclusions; to the extent they are not exchangeable, or comparable, or flip-flappable, we have confounding, or other sources of bias. And that's the rub of epidemiologic research: causal inference, when done right, finds the best substitute for the unobservable counterfactual scenario. We seek comparison groups, or persons, who are like our treated or exposed groups in every way except for the exposure: the virus, the McDonald's, the policy. The best substitutes, as I've mentioned, are exchangeable. Lack of exchangeability is what epidemiologists call confounding.

How do we achieve exchangeability in the real world? The best way is through randomization, as sketched below. We flip a coin: some people are treated or exposed, others are not. On average, in the long run, those two groups are statistically equivalent. That's why randomization is so important, and that's why the work of Ronald Fisher is so important. In the end, it's all about the comparison group. Epidemiologists should always ask, "compared to what?" The virus had this effect, compared to what? The policy had that effect, compared to what?
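Here is a small simulation, a sketch with invented numbers, of why coin-flip randomization delivers exchangeability on average: the unmeasured traits that could confound the comparison end up balanced across the two groups.

```python
import random
from statistics import mean

random.seed(1)

# Each person has a baseline BMI driven by things we never measure
# (metabolism, exercise habits). Values are invented for illustration.
population = [random.gauss(25, 4) for _ in range(100_000)]

# Flip a coin for each person: heads -> exposed group, tails -> unexposed.
exposed, unexposed = [], []
for baseline in population:
    (exposed if random.random() < 0.5 else unexposed).append(baseline)

# Before any exposure happens, the groups are statistically equivalent,
# so a later difference in outcomes can be attributed to the exposure.
print(round(mean(exposed), 2), round(mean(unexposed), 2))  # both ~25.0
```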

Of course, there are some limitations of counterfactuals. Just a few, and I know this is a busy slide.

The whole approach of counterfactuals tends to focus our attention on a single cause, but we all know that most health outcomes arise from multiple causes. Counterfactuals also tend to focus on the manipulation of things, which makes the question of a gender or race effect problematic, because we cannot change someone's race or gender. Counterfactuals often keep us from thinking about levels of exposure: it's either exposed or unexposed. What about degrees of exposure, more of a medicine or more of the carcinogen, or less? Counterfactuals are less apt to help us with that kind of problem. Counterfactuals often don't illuminate the mechanism by which the change occurs. So we may know that McDonald's is causing obesity, but we don't know exactly how that happens. Counterfactuals also don't fit very well with the idea of necessary and sufficient conditions, which has long been part of the literature and thinking on causal effects. So that's a little troubling.

Counterfactuals don't help us much with effect heterogeneity. That is, the effect of the virus on me has this outcome; the effect of the virus on you has that outcome; on someone else, yet another outcome. The virus is still having an effect, but those effects vary. That's effect heterogeneity, and counterfactuals struggle with that kind of question. It's not impossible, it's just more advanced work and thinking. Relatedly, counterfactuals are useless at the individual level. With the data alone, I can't tell whether the virus brought to me is causing my cold, or whether it was something else. But at a population level, we can say, yes, the virus in the group creates an increase of 40% in colds or some other condition. Counterfactual thinking works at the group level; at least that's what we can observe. The sketch after this paragraph makes that point concrete.

Finally, counterfactuals aren't very good for an important area of social epidemiology, and that has to do with dynamics in groups and feedback loops. What do I mean? Well, does socioeconomic status cause bad health, or does bad health cause lower socioeconomic status? The answer is both, because there's a circular, feedback-loop phenomenon. Counterfactuals aren't very helpful there.
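A minimal sketch of the group-level point: individual causal effects vary and are never directly observable, yet their average is estimable for a group. The numbers are invented to match the lecture's illustrative 40% figure.

```python
# Hypothetical individual causal effects of a virus on the risk of a cold.
# Each entry is a difference between two potential outcomes, so no single
# entry can ever be observed directly for a real person.
individual_effects = [0.60, 0.10, 0.55, 0.35, 0.40]  # effect heterogeneity

# With randomization, the AVERAGE of these unobservable effects is
# nonetheless estimable, which is why counterfactual claims are
# population-level claims.
average_effect = sum(individual_effects) / len(individual_effects)
print(f"{average_effect:.0%}")  # 40%
```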

Some parting thoughts on causal inference. We always bring some model, some theory of causation, to our data; it's not in the data itself, and it's important to remember that. As a result, we have to be cautious, because we need to reflect on our own cognitive frameworks and exercise our own scientific self-scrutiny. The fact is, we're bombarded by data, all kinds of stimuli, every moment of every day, and we filter it out. Scientists filter those things out in useful ways. That's the key.
