“Statistics and Causal Inference” by Paul Holland is a great paper!
He writes:
No causation without manipulation. Not everything can be a cause; in particular, attributes of units are never causes. Only variables that can be manipulated, referred to as “treatments” or “interventions”, can be considered true causes. Attributes that cannot be manipulated, such as inherent characteristics (e.g., gender, race, or personality traits), cannot be causes within the causal inference framework.
[…]
To return to the question of what can be a cause let me consider three examples of statements that involve the word cause but that vary in its exact usage.
- (A) She did well on the exam because she is a woman.
- (B) She did well on the exam because she studied for it.
- (C) She did well on the exam because she was coached by her teacher.
These statements, even though they are perfectly understandable English sentences, vary in the meaning of the “because” in each. In each, the effect, using the term loosely, is the same: doing well on an exam. The causes, again using the term loosely, are different. In (A) the “cause” is ascribed to an attribute she possesses. In (B) the “cause” is ascribed to some voluntary activity she performed, and in (C) it is ascribed to an activity that was imposed on her.
An attribute cannot be a cause in an experiment, because the notion of potential exposability does not apply to it. The only way for an attribute to change its value is for the unit to change in some way and no longer be the same unit. Statements of “causation” that involve attributes as “causes” are always statements of association between the values of an attribute and a response variable across the units in a population. In (A) all that is meant is that the performance of women on the exam exceeds, in some sense, that of men.
Examples of the confusion between attributes and causes fill the social science literature. Saris and Stronkhorst (1984) gave the following example of a causal hypothesis: “Scholastic achievement affects the choice of secondary school” (p. 13). These authors clearly intended for this hypothesis to state that an attribute of a student (i.e., scores on tests, performance in primary school) can cause (i.e., affect) the student’s choice of a particular type of secondary school. It is difficult to conceive of how scholastic achievement could be a treatment in an experiment and, therefore, be a “cause” in the sense used in this article. A somewhat stronger statement of my point was given by Kempthorne (1978, p. 15): “It is epistemological nonsense to talk about one trait of an individual causing or determining another trait of the individual.”
At the other extreme is Example (C). This is easily interpreted in terms of the model. The interpretation is that had she not been coached by her teacher she would not have done as well as she did. It implies a comparison between the responses to two causes, even though this comparison is not explicitly stated.
Example (B) is just one of many types of examples in which the applicability of the model is not absolutely clear, and it shows one reason why arguments over what constitutes a proper causal inference can rage without any definitive resolution.
In (B) the problem arises because of the voluntary aspect of the supposed cause: studying for the exam. It is not clear that we could expose a person to studying or not in any verifiable sense. We might be able to prevent her from studying, but that would change the sense of (B) to something much more like (C). We could operationally define studying as so many hours of “nose in book,” but that just defines an attribute we could measure on a subject. In my opinion the application of the model to statement (B) is problematical and not easily resolved. The voluntary nature of much of human activity makes causal statements about these activities difficult in many cases.
The voluntary aspect of the “cause” in (B) is not the only source of difficulty in deciding on the applicability of Rubin’s model to specific problems. It is, however, a common source of difficulty.
The general problem, I think, is in deciding when something is an attribute of units and when it is a cause that can act on units. In the former case all that can be discussed is association, whereas in the latter case it is possible, at least, to contemplate measuring causal effects.
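Holland’s contrast between attributes and causes can be made concrete with a toy simulation in the potential-outcomes style. Everything below is invented for illustration (sample size, effect sizes, variable names): each unit carries two potential exam scores, one under coaching and one without, and only one is ever observed. Because coaching, unlike an attribute, can be randomly assigned, the observed difference between groups estimates the average causal effect.

```python
import random

random.seed(0)

# Each unit has TWO potential outcomes: the exam score if coached,
# and the score if not coached. Only one of the two is ever observed.
n = 10_000
units = []
for _ in range(n):
    baseline = random.gauss(70, 10)  # score without coaching
    effect = random.gauss(5, 2)      # unit-level causal effect of coaching
    units.append((baseline, baseline + effect))

# Randomized "exposure": coaching is something we can assign, so the
# average observed difference between groups estimates the average
# causal effect across units.
coached, not_coached = [], []
for y0, y1 in units:
    if random.random() < 0.5:
        coached.append(y1)        # we observe Y(1); Y(0) is the counterfactual
    else:
        not_coached.append(y0)    # we observe Y(0); Y(1) is the counterfactual

ate_estimate = sum(coached) / len(coached) - sum(not_coached) / len(not_coached)
print(f"estimated average causal effect of coaching: {ate_estimate:.2f}")  # ~5
```

An attribute like gender cannot be assigned this way: there is no second potential outcome to contemplate for the same unit, so the analogous comparison between groups is only an association across units.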
Holland’s argument is relevant in the debate about the impacts of social media on mental health (and many other debates in social science).
If social media use is to be considered a cause of mental health outcomes, it would need to be something we can manipulate, an intervention. This would mean we can systematically vary exposure to social media. However, just as with the studying example (B) in Holland’s argument, there are limits to “exposing” individuals to varying degrees of social media. Since social media engagement is voluntary, researchers can’t fully control the ways individuals use it. We might be able to prevent someone from using social media, but then we are changing a lot of other things in her life, from introducing an “authoritarian” (you can’t use it) element to changing the way she might socialize. We might prevent everyone in a society, or in a group (e.g., a natural experiment in a town or a classroom), from using social media, but then we are also changing the fundamental rules of the game in that society.
If social media use is treated as an attribute, we’re recognizing it as a characteristic or tendency. For instance, some people may have an inherent predisposition to engage in social media, shaped by personality traits or life circumstances (like extroversion or social isolation). Social media engagement becomes an expression of these pre-existing attributes rather than a cause of changes in mental health, and attributes, in Holland’s framework, cannot be causes.
Social media use is voluntary and depends on individual motivations, habits, and contexts. If a teenager uses social media heavily due to feelings of loneliness or social anxiety, is social media use the cause of poor mental health, or is it merely a response to pre-existing issues? What we see is a cycle of association influenced by both pre-existing attributes and the social media experience. Identifying the effect of social media use on mental health is hopeless.
Again, in Holland’s words: “We could operationally define studying as so many hours of ‘nose in book,’ [or time spent on social media] but that just defines an attribute we could measure on a subject. In my opinion the application of the model to statement (B) is problematical and not easily resolved. The voluntary nature of much of human activity makes causal statements about these activities difficult in many cases.”
A lot of people like to point to social science as the kind of activity that helps us estimate this effect, even though it’s hard to isolate causation. They’ll say something like: “sure it’s thorny but it’s still worth pursuing because understanding potential relationships between social media use and mental health can inform public health initiatives, shape digital platform policies, and guide individuals in making more mindful choices. Despite the limitations in establishing causation, social science research can highlight patterns and associations that reveal potential risk factors or protective factors. By identifying these trends, researchers and policymakers can still offer practical recommendations or interventions to mitigate possible negative impacts of social media on mental health, even if the exact causal mechanisms remain complex or uncertain.”
In a series of three excellent posts, Stephen Wild says:
The answer here isn’t to throw out those RCTs. They do tell us something, even if it’s not what we think it is (or, at the very least, what we seem to be arguing over). But it does mean that we need to think carefully about our estimand when we conduct RCTs, and it means designing RCTs that target the estimand we want. I want to be clear that I don’t know what those RCTs should look like. But there are lots of clever and competent people who do. Let’s fund them and let them do their job.
The posts are excellent because they clarify and make explicit assumptions that have been obscured by people like Jon Haidt. I more or less agree. I agree we should fund RCTs and we should fund social science. It’s better to have something than nothing. It’s good, for example, to study what happens to a classroom when its students don’t use TikTok for a month. But I disagree that we can actually estimate the estimand we care about, which is the effect of social media on mental health.
There are other difficulties:
- It’s very unclear that social media is a monolith. Listening to EDM music on YouTube is different than watching videos of Andrew Tate on TikTok.
- There’s obviously massive heterogeneity in the effects (e.g., teenagers vs. a director of sales at Walmart using social media extensively for work).
- It’s all mental. If you are a Buddhist monk, social media has no effect. You are the kind of person who has developed the ethos not to be affected by all the BS you see. Of course it’s an empirical question; maybe I’m naive and deluded and Buddhist monks also have a tendency to measure their self-worth against curated, idealized portrayals of others’ lives. I’d be surprised, but you get the point.
Another point on mental causal mechanisms. Observational research is often justified by the success story of cigarette research (we all know Fisher attacked those who used the association between smoking and lung cancer as evidence of a “causal” link). This is very different. With physical causal mechanisms, there are no ideas (e.g., measure self-worth against curated ideal) involved; the “ego” is not involved. In the case of social media on mental health, it’s all about ideas. Causation is in the logical realm of reasons. The “ego” is involved. There’s nothing physical involved (I’ll believe physicalist reductionism when I see studies that show large and robust effects, like for cigarettes and cancer, which in my view means never). We have a cycle of associations of attributes and voluntary use. Any talk about “causal effects” is, I’m very sorry to say, hopeless.
I also have problems with self-reports. Personally, I genuinely have no idea how to answer questions like those on 10-item depression indexes. I don’t have much energy? Feeling unhappy, sad, or depressed? I worry a lot? I could put myself anywhere on those scales; I mean, whatever. It just depends. Life is life. When I see this, I want to leave the survey and go read a novel. But to be clear, ultimately, if pushed, I guess I believe in the within-person validity and reliability of these 10-item depression scales. If there’s a big change in a respondent, longitudinally, something is probably going on. Anything non-longitudinal has to be excluded. You need to control for time-invariant characteristics of individuals.
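That last point, controlling for time-invariant characteristics, can be sketched with a synthetic two-wave panel. All the numbers here are made up for illustration: each person’s depression score mixes a stable disposition with wave-specific noise, plus a common shift of 0.5 between waves. A single cross-section is dominated by the stable trait, but differencing within person cancels the disposition and recovers the shift.

```python
import random
import statistics

random.seed(1)

# Toy two-wave panel: each person's score = stable disposition
# (never observed) + wave-specific noise, with a common shift of
# 0.5 added at wave 2.
n = 5_000
wave1, wave2 = [], []
for _ in range(n):
    disposition = random.gauss(0, 3)  # time-invariant trait
    wave1.append(disposition + random.gauss(0, 1))
    wave2.append(disposition + random.gauss(0, 1) + 0.5)

# Differencing within person removes the time-invariant disposition.
changes = [b - a for a, b in zip(wave1, wave2)]
avg_change = statistics.mean(changes)

print(f"average within-person change: {avg_change:.2f}")                         # ~0.5
print(f"cross-sectional variance (wave 1): {statistics.pvariance(wave1):.1f}")   # ~10, trait-dominated
print(f"variance of within-person change: {statistics.pvariance(changes):.1f}")  # ~2
```

The design choice this illustrates: between-person comparisons at one point in time mostly reflect who people stably are, while longitudinal within-person change is the only quantity these scales can plausibly speak to.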
Social media use has become fully embedded in the way we live. Experimental manipulation with genuine external validity isn’t possible. Measuring the causal effect of social media use on mental health is as hopeless as measuring the effect of participating in capitalism on mental health.
It’s obvious that social media can be toxic, like it’s obvious that neoliberal capitalism can be toxic and dehumanizing. But there’s no causal effect to measure scientifically. Toxic comparison culture on Instagram and TikTok, that tendency to measure one’s self-worth against curated, idealized portrayals of others’ lives, fostering feelings of inadequacy, envy, and distorted perceptions of reality, can be seen as bad. But you can also just celebrate comparison culture by emphasizing success stories, creating a narrative of exceptionalism and urgency, and focusing on visibility, prestige, and outpacing peers, sometimes at the expense of sustainable growth or well-being. Similarly with neoliberal capitalism. Social media surely makes a lot of people miserable. Some people can be jealous of all the glittering consumerist conspicuous stuff they see. It surely makes a lot of other people happy and rich. Other people are deluded and rich but unhappy, whatever. But there’s no causal science here. There’s nothing to assign. Social media use is part of a bigger whole, unlike smoking, the effect of which on health can be isolated because the causal mechanism by which it acts on people is physical.
Social science can describe, using descriptive statistics or qualitative analysis, some of the good and bad things that come up with, that are associated with, social media use. Like journalism generalized, with a sample and guidelines. This is very valuable! This is, in my view, what current studies trying to get at causality are actually doing. Social science can then theorize, but then you enter Pierre Bourdieu and Mark Fisher territory. It becomes speculative, sociological, philosophical. I think about it this way: No causation without manipulation (instrumental variables come with fundamentally problematic assumptions). What I can manipulate is not at bottom what I care about. There’s a fundamental, irresolvable tension.