
01-02-2009 | Scientific Contribution

Causality in complex interventions

Author: Dean Rickles

Published in: Medicine, Health Care and Philosophy | Issue 1/2009


Abstract

In this paper I look at causality in the context of intervention research, and discuss some problems faced in the evaluation of causal hypotheses via interventions. I draw attention to a simple problem for evaluations that employ randomized controlled trials. The common alternative to randomized trials, the observational study, is shown to face problems of a similar nature. I then argue that these problems become especially acute in cases where the intervention is complex (i.e. one that involves intervening in a complex system). Finally, I consider and reject a possible resolution of the problem involving the simulation of complex interventions. The conclusion I draw from this is that we need to radically reframe the way we think about causal inference in complex intervention research.
Footnotes
1
Testing this involves splitting a sample into (at least) two groups: ‘treatment’ and ‘control’. The treatment group receives the intervention and the control does not, though it may receive a placebo or an alternative (‘standard’) treatment. The two groups have to be ‘well-matched’ (at least with respect to relevant variables) at baseline in order to allow inferences to be made regarding the causal effects of the intervention. The variables that are viewed as relevant to the outcome, and so measured before treatment commences, are called ‘covariates’. Matching is carried out with respect to these.
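A minimal sketch of the allocation and baseline check just described (the data, variable names, and sample size are hypothetical, purely for illustration):

```python
import random
import statistics

random.seed(42)

# Hypothetical experimental units with one measured covariate (here, age).
units = [{"id": i, "age": random.gauss(50, 10)} for i in range(200)]

# Random allocation into 'treatment' and 'control' groups.
random.shuffle(units)
treatment, control = units[:100], units[100:]

# Baseline check: covariate means should be similar if the groups are well-matched.
print("treatment mean age:", round(statistics.mean(u["age"] for u in treatment), 1))
print("control mean age:  ", round(statistics.mean(u["age"] for u in control), 1))
```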
 
2
In what follows I borrow, in parts, from the excellent presentation of Holland (1986).
 
3
The experimental units, however, do not have to be individual people; they might be groups of people or even entire populations of people. If we can find variables to measure on these systems too, then we will often find that they vary over \({\mathcal{U}}\); cf. Hertzman et al. (1994, p. 67).
 
4
As Evans et al. (1994) put it in the title of their book: Why Are Some People Healthy and Others Not?
 
5
The case of health H(u) and wealth W(u) provides a nice intuitive example here, since one finds that units u with high W(u)-values often have high H(u)-values too. The problem is, of course, that we can’t tell from this correlation (that is, from the joint distribution of H(u) and W(u)) whether high H(u)-values cause high W(u)-values or vice versa, or indeed whether some other ‘hidden variable’ (i.e. a ‘confounder’) is the cause of both high H(u)-values and high W(u)-values.
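The worry can be made vivid with a toy simulation (my own illustration, not from the paper): a hidden variable drives both W(u) and H(u), producing a strong correlation even though neither causes the other.

```python
import random

random.seed(0)

def simulate(n=10_000):
    """Hidden confounder c raises both wealth W(u) and health H(u); no arrow between them."""
    pairs = []
    for _ in range(n):
        c = random.gauss(0, 1)            # unobserved confounder
        w = 2.0 * c + random.gauss(0, 1)  # wealth depends only on c
        h = 1.5 * c + random.gauss(0, 1)  # health depends only on c
        pairs.append((w, h))
    return pairs

def correlation(pairs):
    n = len(pairs)
    mw = sum(w for w, _ in pairs) / n
    mh = sum(h for _, h in pairs) / n
    cov = sum((w - mw) * (h - mh) for w, h in pairs) / n
    sw = (sum((w - mw) ** 2 for w, _ in pairs) / n) ** 0.5
    sh = (sum((h - mh) ** 2 for _, h in pairs) / n) ** 0.5
    return cov / (sw * sh)

# A strong W-H correlation, entirely generated by the hidden variable.
print(round(correlation(simulate()), 2))
```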
 
6
Note that Pearl’s investigations aim to provide algorithms for prediction from interventions specified as alterations of the joint distribution of some variables in a system in which the causal structure remains invariant—hence, complex systems in which this stationarity is violated will immediately present difficulties.
 
7
N = 1 (longitudinal) trials (such as Interrupted Time Series methods) might appear to offer an escape from this problem. Here exactly one person takes the treatment at t = 1 and an observation is made, whereupon the person ceases the treatment at t = 2 and an observation is made again; the person then takes the treatment again at t = 3, and so the cycle continues up to t = n. But this misses the point of the fundamental problem: after the first complete cycle, at t = 3, the conditions are necessarily different from those at t = 1. At the start of each new period in the cycle the conditions will differ (at least) by the addition of a further treatment.
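A toy illustration of this carry-over worry (the simple additive model and all numbers are my own assumptions, not the author's): each completed treatment period leaves a residue, so the conditions at later ‘on’ periods differ from those at t = 1.

```python
# Toy N-of-1 (on/off/on/...) schedule with a carry-over effect.
baseline, effect, carryover = 10.0, 3.0, 0.5
accumulated = 0.0
for t, on in enumerate([1, 0, 1, 0, 1, 0], start=1):
    outcome = baseline + effect * on + accumulated
    print(f"t={t} treatment={'on' if on else 'off'} outcome={outcome:.1f}")
    if on:
        accumulated += carryover  # residue left by each treatment period
```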
 
8
Causal graph advocates refer to this nice feature of interventions as “arrow-breaking”. See Pearl (2000), for example.
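A minimal structural-model sketch of arrow-breaking (the linear model and its coefficients are my own toy assumptions): intervening on X severs the arrow from the confounder into X, so the interventional contrast recovers the true coefficient where the observational regression does not.

```python
import random

random.seed(3)

def sample(do_x=None):
    """Linear structural model C -> X, C -> Y, X -> Y; do_x severs the arrow into X."""
    c = random.gauss(0, 1)
    x = c + random.gauss(0, 0.5) if do_x is None else do_x
    y = 2.0 * x + c + random.gauss(0, 0.5)
    return x, y

# Observational slope of Y on X mixes the causal effect with confounding.
obs = [sample() for _ in range(20000)]
mx = sum(x for x, _ in obs) / len(obs)
my = sum(y for _, y in obs) / len(obs)
slope = sum((x - mx) * (y - my) for x, y in obs) / sum((x - mx) ** 2 for x, _ in obs)

# The interventional contrast recovers the true causal coefficient (2.0).
do1 = sum(sample(do_x=1.0)[1] for _ in range(20000)) / 20000
do0 = sum(sample(do_x=0.0)[1] for _ in range(20000)) / 20000

print(f"observational slope: {slope:.2f}")
print(f"E[Y|do(x=1)] - E[Y|do(x=0)]: {do1 - do0:.2f}")
```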
 
9
As a prime case of overt bias requiring adjustment, consider the following example described by Cochran (1968). It involves data from a study of mortality rates among three categories of men: non-smokers, cigarette smokers, and pipe and cigar smokers. The rates for these groups, per 1000 men per year, are 20.2, 20.5, and 35.5 respectively. Prima facie this suggests that pipe and cigar smoking is extremely harmful, but that it doesn’t make much difference whether one smokes cigarettes or not. However, inspecting the mean ages of the groups soon reveals significant differences: 54.9, 50.5, and 65.9 respectively. Hence, the data need to be reinterpreted: pipe and cigar smokers are older, and so we should expect (independently of the smoking issue) a higher rate of death amongst this group relative to the others. Moreover, the non-smokers are older than the cigarette smokers and yet the smokers have a higher mortality rate nonetheless. To control for this bias we must further stratify the groups so that only men within the same age-range are compared. How do we know we have got it right once we have adjusted for age? We don’t. There may be other variables that have a similar biasing effect that aren’t controlled for, perhaps because they aren’t known or measurable.
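One standard way to carry out such an adjustment is direct age-standardisation. The stratum-specific rates and weights below are invented for illustration (Cochran reports only the crude rates and mean ages quoted above); the point is the weighting logic, which compares groups within common age strata.

```python
# Reference weights for a chosen standard population (hypothetical).
standard_weights = {"<50": 0.4, "50-64": 0.4, "65+": 0.2}

# Hypothetical deaths per 1000 men per year, by age stratum and smoking group.
stratum_rates = {
    "non-smokers": {"<50": 5.0, "50-64": 20.0, "65+": 55.0},
    "cigarettes":  {"<50": 8.0, "50-64": 28.0, "65+": 70.0},
    "pipe/cigar":  {"<50": 6.0, "50-64": 22.0, "65+": 58.0},
}

for group, rates in stratum_rates.items():
    adjusted = sum(standard_weights[s] * rates[s] for s in standard_weights)
    print(f"{group:12s} age-adjusted rate: {adjusted:.1f} per 1000 per year")
```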
 
10
There are, of course, ways that this can be subverted, and for this reason one often adds ‘blinding’ conditions to the experimental setup. Also, randomized experiments often are not ‘really’ random; for example, assignment is sometimes carried out on the basis of such things as birthdays. Since these methods are predictable they are not truly random.
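The contrast between such quasi-random assignment and genuinely concealed random assignment can be put in a few lines (function names are mine, purely illustrative):

```python
import secrets

def birthday_allocation(birth_day_of_month: int) -> str:
    """Quasi-random allocation by birthday: predictable, and so open to subversion."""
    return "treatment" if birth_day_of_month % 2 == 0 else "control"

def concealed_allocation() -> str:
    """Allocation from a cryptographic RNG: unpredictable at the point of recruitment."""
    return "treatment" if secrets.randbelow(2) == 0 else "control"

print(birthday_allocation(14))  # anyone who knows the birthday can predict this
print(concealed_allocation())   # cannot be predicted in advance
```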
 
11
Even if randomization does eliminate selection bias, there are also biasing effects that can emerge during the intervention (i.e. after randomization has been done)—hence differences between the arms of the experiment may not reflect the operation of the intervention alone. In other words, even if we allow for a perfect distribution of inhomogeneities at the start, once subjects are allocated to some particular arm there is the potential for biasing effects to interfere (cf. Cox 1992, p. 299). Hence, the initially strong experimental control over the ‘baseline’ properties of the groups quickly deteriorates. This is one of the main reasons underlying Peter Urbach’s rejection of randomization (1985, pp. 262–264).
 
12
This value is known as a ‘p-value’; it gives a measure of the confidence with which we can reject the null hypothesis. In other words, it helps us to rule out certain apparent correlations: correlations that are really due to chance. By the same token, it allows us to gauge whether we have a genuine link. Hence, probabilities play a guiding role in the determination of causal links. This is connected to the notion of a level of significance of some result. Following Fisher (ibid.), this level is usually set at the value 0.05. What this means is that a result at least as extreme as the one you got could be expected to arise by pure chance in about 5 out of 100 runs of the same experiment, even if the null hypothesis were true. In other words, it functions as a demarcation criterion for separating fluky from non-fluky correlations.
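A quick way to see what the 0.05 threshold does is to simulate experiments in which the null hypothesis is true (the simulation below is my own illustration; a permutation test stands in for whatever test is actually used): roughly 5 in 100 such experiments still come out ‘significant’.

```python
import random
import statistics

random.seed(1)

def permutation_p_value(a, b, n_perm=200):
    """Two-sided permutation p-value for a difference in group means."""
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = a + b
    count = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            count += 1
    return count / n_perm

# Both 'arms' are drawn from the same distribution, so the null hypothesis is true.
runs, significant = 200, 0
for _ in range(runs):
    a = [random.gauss(0, 1) for _ in range(20)]
    b = [random.gauss(0, 1) for _ in range(20)]
    if permutation_p_value(a, b) < 0.05:
        significant += 1
print(f"{significant}/{runs} null experiments 'significant' at the 0.05 level")
```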
 
13
Note, however, that Lipsey and Cordray are merely surveying current views here; they recognize that difficulties in causal inference can arise even given perfect randomization prior to treatment.
 
14
Papineau draws attention to these on the grounds that there are often ethical problems facing RCTs. Thus, he writes that “medical enthusiasm for randomization is dangerous and needs to be dampened ... not because it is worthless ... but rather because it is often unethical, and because the conclusions it helps us reach can usually be reached by alternative routes of greater economic cost and less epistemological security” (Papineau 1994, p. 438). In this he concurs with Worrall (2002). However, the discussion appears to indicate that Papineau thinks this is pretty much the only problem with randomization.
 
15
As philosophers of science will notice, there is more than a hint of the ‘no miracles argument’ present here.
 
16
A more formal (and fundamental) model of this characteristic of a complex system is the Ising model, a two-dimensional lattice of interacting ‘spin’ systems (or just ‘spins’), with spin components s = +1 or s = −1 and interaction Hamiltonian \(H = -J \sum_{\langle i, j \rangle} s_{i} s_{j}\) (with coupling constant J). This provides an idealized model of an iron magnet: the Hamiltonian describes the interactions between nearby spins (a further term can be added for the interaction with an external magnetic field). There is a phase transition in this system when the temperature is tuned to a certain ‘critical point’, separating order (spins pointing in the same direction) from disorder (spins pointing in different directions). The point of this technical detour, and the Schelling example, is to exhibit the sensitivity of complex systems to small disturbances.
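For readers who want to see the ordered and disordered phases directly, here is a minimal Metropolis simulation of the model just described (the lattice size, temperatures, and number of sweeps are arbitrary choices of mine):

```python
import math
import random

random.seed(2)

L, J = 20, 1.0  # lattice size and coupling constant

def sweep(spins, temperature):
    """One Metropolis sweep: attempt L*L single-spin flips."""
    beta = 1.0 / temperature
    for _ in range(L * L):
        i, j = random.randrange(L), random.randrange(L)
        neighbours = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                      + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        dE = 2.0 * J * spins[i][j] * neighbours  # energy cost of flipping this spin
        if dE <= 0 or random.random() < math.exp(-beta * dE):
            spins[i][j] = -spins[i][j]

def magnetisation(temperature, sweeps=300):
    spins = [[1] * L for _ in range(L)]  # start fully ordered
    for _ in range(sweeps):
        sweep(spins, temperature)
    return abs(sum(sum(row) for row in spins)) / (L * L)

# Below the critical temperature (about 2.27 in units of J) the order persists;
# above it the spins disorder.
for T in (1.5, 2.27, 3.5):
    print(f"T={T}: |m| = {magnetisation(T):.2f}")
```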
 
17
Think of ‘cooperative phenomena’ like magnetization, in which one tunes the control parameter to the Curie temperature. Interventions in this system spread over the whole of it because the correlation length (between spins) becomes infinite.
 
18
This has already been noticed by Rosenbaum (2005, p. 147)—see also Holland (1986).
 
19
This complexity is a result of the large numbers of economic agents in markets, the interactions between economic agents, and the feedback loops between the agents and the global patterns their interactions determine.
 
Literature
Altman, D.G. 1985. Comparability of randomised groups. The Statistician 34 (1): 125–136.
Campbell, D.T. 1969. Artifact and control. In Artifact in behavioural research, ed. R. Rosenthal and R. Rosnow, 351–382. New York: Academic Press.
Campbell, D.T. and J. Stanley. 1963. Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
Cartwright, N. 2007. Hunting causes and using them. Cambridge University Press.
Cartwright, N. 1989. Nature’s capacities and their measurement. Oxford: Clarendon Press.
Cartwright, N. 2002. Against modularity, the causal Markov condition, and any link between the two: comments on Hausman and Woodward. British Journal for the Philosophy of Science 53: 411–453.
Cochran, W.G. 1965. The planning of observational studies of human populations. Journal of the Royal Statistical Society, Series A (Statistics in Society) 128 (2): 234–266.
Cook, T.D. and D.T. Campbell. 1979. Quasi-experimentation: design and analysis issues for field settings. Chicago: Rand McNally.
Cox, D.R. 1992. Causation: some statistical aspects. Journal of the Royal Statistical Society, Series A (Statistics in Society) 155 (2): 291–301.
Dodge, Y. ed. 2003. The Oxford dictionary of statistical terms. Oxford University Press.
Eaton, D. and K. Murphy. 2000. Statistics and causal inference: comment: which ifs have causal answers. Journal of Machine Learning Research 1: 1–48.
Epstein, J.M. 2007. Generative social science: studies in agent-based computational modeling. Princeton University Press.
Evans, R.G., M.L. Barer, and T.R. Marmor. 1994. Why are some people healthy and others not? Aldine de Gruyter.
Giere, R. 1979. Understanding scientific reasoning. New York: Holt, Rinehart, and Winston.
Hartmann, S. 1996. The world as process: simulations in the natural and social sciences. In Modelling and simulation in the social sciences from the philosophy of science point of view, ed. R. Hegselmann, U. Mueller, and K.G. Troitzsch, 77–100. Dordrecht: Kluwer Academic Publishers.
Hausman, D.M. and J. Woodward. 2004. Manipulation and the causal Markov condition. Philosophy of Science 71: 846–856.
Hertzman, C., J. Frank, and R.G. Evans. 1994. Heterogeneities in health status and determinants of public health. In Why are some people healthy and others not?, ed. R.G. Evans et al., 62–92. Aldine de Gruyter.
Hill, A.B. 1965. The environment and disease: association or causation? Proceedings of the Royal Society of Medicine 58: 295–300.
Holland, P. 1986. Statistics and causal inference. Journal of the American Statistical Association 81: 945–960.
Hsieh, J.-L., C.-T. Sun, G. Y.-M. Kao, and C.-Y. Huang. 2006. Teaching through simulation: epidemic dynamics and public health policies. Simulation 82 (11): 731–759.
Kleinbaum, D.G., L.L. Kupper, and H. Morgenstern. 1982. Epidemiologic research. Belmont, CA: Lifetime Learning.
Le Baron, B. 2000. Agent-based computational finance: suggested readings and early research. Journal of Economic Dynamics and Control 24 (5): 679–702.
Levy, H., M. Levy, and S. Solomon. 2000. Microscopic simulation of financial markets: from investor behavior to market phenomena. Academic Press.
Lewis, D. 1986. Philosophical papers, vol. II. Oxford University Press.
Lipsey, M.W. and D.S. Cordray. 2000. Evaluation methods for social intervention. Annual Review of Psychology 51: 345–375.
MacMahon, B. and T.F. Pugh. 1970. Epidemiology: principles and methods. Boston: Little, Brown.
Mill, J.S. 1864. System of logic, vol. I. London: Longmans, Green, Reader, and Dyer.
Olweus, D. 1997. Bully/victim problems in school: facts and intervention. European Journal of Psychology of Education 12 (4): 495–510.
Papineau, D. 1994. The virtues of randomization. British Journal for the Philosophy of Science 45 (2): 437–450.
Pearl, J. 2000. Causality: models, reasoning and inference. Cambridge: Cambridge University Press.
Pearl, J. 2002. Causal inference in the health sciences: a conceptual introduction. Health Services and Outcomes Research Methodology 2 (3/4): 189–220.
Petticrew, M., S. Cummins, C. Ferell, A. Findlay, C. Higgins, C. Hoy, A. Kearns, and L. Sparks. 2005. Natural experiments: an underused tool for public health? Public Health 119: 751–757.
Pocock, S.R. and D.R. Elbourne. 2000. Randomized trials or observational tribulations? The New England Journal of Medicine 342 (25): 1887–1892.
Rosenbaum, P.R. 2005. Heterogeneity and causality: unit heterogeneity and design sensitivity in observational studies. The American Statistician 59 (2): 147–152.
Rubin, D. 1974. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66: 688–701.
Rubin, D.B. 1986. Statistics and causal inference: comment: which ifs have causal answers. Journal of the American Statistical Association 81 (396): 961–962.
Schaffner, K.F. 1991. Causing harm: epidemiological and physiological concepts of causation. In Acceptable evidence: science and values in risk management, ed. D.G. Mayo and R.D. Hollander, 204–217. New York: Oxford University Press.
Schelling, T.C. 1978. Micromotives and macrobehavior. W.W. Norton and Co.
Suppes, P. 1982. Arguments for randomizing. In PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, vol. 2: Symposia and Invited Papers, 464–475.
Tian, J. and J. Pearl. 2001. Causal discovery from changes: a Bayesian approach. Proceedings of UAI 17: 512–521.
Urbach, P. 1985. Randomization and the design of experiments. Philosophy of Science 52 (2): 256–273.
Woodward, J. 2003. Making things happen: a theory of causal explanation. Oxford University Press.
Worrall, J. 2002. What evidence in evidence-based medicine? Philosophy of Science 69: S316–S330.
Metadata
Publisher: Springer Netherlands
Print ISSN: 1386-7423
Electronic ISSN: 1572-8633
DOI: https://doi.org/10.1007/s11019-008-9140-4
