
The guiding idea behind counterfactual analyses of causation is the thought that – as David Lewis puts it – “We think of a cause as something that makes a difference, and the difference it makes must be a difference from what would have happened without it. Had it been absent, its effects – some of them, at least, and usually all – would have been absent as well” (1973b, 161).
The first explicit definition of causation in terms of counterfactuals was, surprisingly enough, given by Hume, when he wrote: “We may define a cause to be an object followed by another, and where all the objects, similar to the first, are followed by objects similar to the second. Or, in other words, where, if the first object had not been, the second never had existed” (1748, Section VII). It is difficult to understand how Hume could have confused the first, regularity definition with the second, very different counterfactual definition (though see Buckle 2004: 212–13 for a brief discussion).
At any rate, Hume never explored the alternative counterfactual approach to causation. In this, as in much else, he was followed by generations of empiricist philosophers. The chief obstacle in empiricists’ minds to explaining causation in terms of counterfactuals was the obscurity of counterfactuals themselves, owing chiefly to their reference to unactualised possibilities. The true potential of the counterfactual approach to causation did not become clear until counterfactuals became better understood through the development of possible world semantics in the early 1970s (see Beebee 2022).
The best known and most thoroughly elaborated counterfactual theory of causation is David Lewis’s theory in his (1973b). Lewis’s theory was refined and extended in articles subsequently collected in his (1986a). In response to doubts about the theory’s treatment of preemption, Lewis subsequently proposed a fairly radical revision of the theory (2000/2004a). In this section we shall confine our attention to the original 1973 theory, deferring the later changes he proposed for consideration below.
Like most contemporary counterfactual theories, Lewis’s theory employs a possible world semantics for counterfactuals. Such a semantics states truth conditions for counterfactuals in terms of similarity relations between possible worlds. Lewis famously espouses realism about possible worlds, according to which non-actual possible worlds are real concrete entities on a par with the actual world (Lewis 1986e). However, most contemporary philosophers would seek to deploy the explanatorily fruitful possible worlds framework while distancing themselves from full-blown realism about possible worlds themselves (see the entry on possible worlds).
The central notion of a possible world semantics for counterfactuals is a relation of comparative similarity between worlds (Lewis 1973a). One world is said to be closer to actuality than another if the first resembles the actual world more than the second does. In terms of this similarity relation, the truth condition for the counterfactual “If A were (or had been) the case, C would be (or have been) the case” is stated as follows:
(1) “If A were the case, C would be the case” is true in the actual world if and only if either (i) there are no possible A-worlds; or (ii) some A-world where C holds is closer to the actual world than is any A-world where C does not hold.
We shall ignore the first case in which the counterfactual is vacuously true. The fundamental idea of this analysis is that the counterfactual “If A were the case, C would be the case” is true just in case it takes less of a departure from actuality to make the antecedent true along with the consequent than to make the antecedent true without the consequent.
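Truth condition (1) can be illustrated on a small finite model. The following is only a sketch under stated assumptions: the worlds, their facts, and the numeric `distance` values are hypothetical, and Lewis's semantics requires only a comparative ordering of worlds, not a numeric measure. The function name `would` is likewise an illustrative choice.

```python
from typing import Callable, Dict, List

World = str               # worlds are just labels in this toy model
Facts = Dict[str, bool]   # which propositions hold at a world


def would(
    facts: Dict[World, Facts],
    distance: Dict[World, float],
    antecedent: Callable[[Facts], bool],
    consequent: Callable[[Facts], bool],
) -> bool:
    """Evaluate 'If A were the case, C would be the case' at the actual world.

    `distance[w]` is an ASSUMED numeric stand-in for how far world w
    departs from actuality (smaller means closer).
    """
    a_worlds: List[World] = [w for w in facts if antecedent(facts[w])]
    if not a_worlds:
        return True  # clause (i): no A-worlds, so vacuously true
    with_c = [distance[w] for w in a_worlds if consequent(facts[w])]
    without_c = [distance[w] for w in a_worlds if not consequent(facts[w])]
    if not with_c:
        return False  # no A-world where C holds
    # clause (ii): some A-and-C world is closer than every A-and-not-C world
    return all(min(with_c) < d for d in without_c)


# Hypothetical model: A is false in actuality; the closest A-world is a C-world.
facts = {
    "actual": {"A": False, "C": False},
    "w1": {"A": True, "C": True},
    "w2": {"A": True, "C": False},
}
distance = {"actual": 0.0, "w1": 1.0, "w2": 2.0}

a_would_c = would(facts, distance, lambda f: f["A"], lambda f: f["C"])
a_would_not_c = would(facts, distance, lambda f: f["A"], lambda f: not f["C"])
```

In this model `a_would_c` comes out true and `a_would_not_c` false, since the nearest A-world (`w1`) is one where C holds.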
In terms of counterfactuals, Lewis defines a notion of causal dependence between events, which plays a central role in his theory of causation (1973b).
(2) Where c and e are two distinct possible events, e causally depends on c if and only if, if c were to occur e would occur; and if c were not to occur e would not occur.
This condition states that whether e occurs or not depends on whether c occurs or not. Where c and e are events that actually occur, this truth condition can be simplified somewhat. For in this case it follows from a formal condition on the comparative similarity relation, namely that the actual world is closer to actuality than any other world, that the counterfactual “If c were to occur e would occur” is automatically true: this formal condition implies that a counterfactual with true antecedent and true consequent is itself true. Consequently, the truth condition for causal dependence becomes:
(3) Where c and e are two distinct actual events, e causally depends on c if and only if, if c were not to occur e would not occur.
There are three important things to note about the definition of causal dependence. First, it takes the primary relata of causal dependence to be events. Lewis’s own theory of events (1986b) construes events as classes of possible spatiotemporal regions. However, different conceptions of events are compatible with the basic definition (Kim 1973a; for an alternative broadly Lewisian take on events see McDonnell 2016 and Kaiserman 2017). Indeed, it even seems possible to formulate it in terms of facts rather than events (Mellor 1995, 2004).
Second, the definition requires the causally dependent events to be distinct from each other. Distinctness means that the events are not identical, neither is part of the other, and neither implies the other. This qualification is important if spurious non-causal dependences are to be ruled out. (For this point see Kim 1973b and Lewis 1986b.) For while you would not have written ‘Larry’ if you had not written ‘rr’, and you would not have said ‘Hello’ loudly if you had not said ‘Hello’, neither dependence counts as a causal dependence, since the paired events are not distinct from each other in the required sense.
Convinced of the need to make room in his analysis for causation by (and of) absence – as when the gardener’s failure to water the plants causes their death – Lewis later amended his view, holding that causal dependence is a matter of counterfactual dependence between events or their absences (Lewis 2000: §X; 2004b). We shall largely ignore this complication in what follows; for some discussion of causation by absence see Schaffer 2000b, Beebee 2004b, McGrath 2005, Livengood and Machery 2007, and Dowe 2009.
Third, the counterfactuals that are employed in the analysis are to be understood according to what Lewis calls the standard interpretation. There are several possible ways of interpreting counterfactuals; and some interpretations give rise to spurious non-causal dependences between events. For example, suppose that the events c and e are effects of a common cause d. It is tempting to reason that there must be a causal dependence between c and e by engaging in the following piece of counterfactual reasoning: if c had not occurred, then d would not have occurred; and if d had not occurred, e would not have occurred. But Lewis says the former counterfactual, which he calls a backtracking counterfactual, is not to be used in the assessment of causal dependence. The right counterfactuals to be used are non-backtracking counterfactuals that typically hold the past fixed up until the time (or just before the time) at which the counterfactual’s antecedent is supposed to obtain. Thus if c had not occurred, d – which in fact occurred before c – would have occurred anyway; so on the standard interpretation, where backtracking counterfactuals are false, the inference to the claim that e causally depends on c is blocked.
1.2 The Temporal Asymmetry of Causal Dependence
What constitutes the direction of the causal relation? Why is this direction typically aligned with the temporal direction from past to future? In answer to these questions, Lewis (1979) argues that the direction of causation is the direction of causal dependence; and it is typically true that events causally depend on earlier events but not on later events. He emphasises the contingency of the latter fact because he regards backwards or time-reversed causation as a conceptual possibility that cannot be ruled out a priori. Accordingly, he dismisses any analysis of counterfactuals that would deliver the temporal asymmetry by conceptual fiat.
Lewis’s explanation of the temporal asymmetry of counterfactual dependence comes from a combination of his analysis of the similarity relation together with the (alleged) ‘asymmetry of overdetermination’ – a contingent feature of the world. According to this analysis, there are several respects of similarity to be taken into account in evaluating non-backtracking counterfactuals: similarity with respect to laws of nature and also similarity with respect to particular matters of fact. Worlds are more similar to the actual world the fewer miracles or violations of the actual laws of nature they contain. Again, worlds are more similar to the actual world the greater the spatio-temporal region of perfect match of particular fact they have with the actual world. If the laws of the actual world are deterministic, these rules will clash in assessing which counterfactual worlds are more similar to the actual world. For a world that makes a counterfactual antecedent true must differ from the actual world either in allowing some violation of the actual laws (a ‘divergence miracle’), or in differing from the actual world in particular fact. Lewis’s analysis allows a tradeoff between these competing respects of similarity in such cases. It implies that worlds with an extensive region of perfect match of particular fact can be considered very similar to the actual world provided that the match in particular facts with the actual world is achieved at the cost of a small, local miracle, but not at the cost of a big, diverse miracle.
Taken by itself, this account contains no built-in time asymmetry. That comes only when the account is combined with the asymmetry of overdetermination: the (alleged) fact that effects are rarely overdetermined by their causes, but causes are very often overdetermined by their effects. Taking an example from Elga (2000): suppose that Gretta cracks an egg at 8.00 (event c), pops it in the frying pan, and eats it for her breakfast. What would have happened had c not occurred? The right answer (Answer 1) is that the egg would not then have been fried and Gretta would not have eaten it – and not (Answer 2) that she would still have fried and eaten the egg, but these events would somehow have come about despite her failing to crack it in the first place. The question is: how does Lewis’s analysis of the similarity relation deliver Answer 1 and not Answer 2? In particular, consider worlds where there is perfect match of particular fact until just before 8.00, and then a miracle, and then no perfect match of particular fact thereafter. Call the closest such world World 1. Now consider worlds where there is no perfect match of particular fact before 8.00 (and in particular, Gretta does not crack the egg), a miracle just after 8.00, and then perfect match of particular fact thereafter. Call the closest such world World 2. (Intuitively, in the first case we keep the past fixed, insert a miracle just before 8.00 so that c doesn’t occur, and the future unfolds thereafter according to the (actual) laws. In the second case, we keep the future fixed, insert a miracle just after 8.00 so that c doesn’t occur, and the past unfolds according to the (actual) laws.) Why is World 1 closer to actuality than is World 2?
Lewis’s answer to that question comes from the fact that c leaves very many traces: at 8.02, for example, there is the egg cooking in the pan, the cracked empty shell in the bin, traces of raw egg on Gretta’s fingers, her memory of having just now cracked it, and so on. So in World 2, Gretta fails to crack the egg but then, shortly thereafter, seems to remember cracking it, there is the egg in the pan, the empty shell in the bin, and so on. So World 2 – since it contains all of these events without the egg being cracked in the first place – needs to contain not just one miracle but several: one to take care of each of these effects. World 1, by contrast, requires just the one small miracle to stop Gretta cracking the egg. Hence World 2 contains a ‘big, diverse’ miracle while World 1 contains just one small miracle; hence World 1 is closer to actuality than is World 2; hence Lewis’s analysis yields the correct result that had Gretta not cracked the egg, she would not have eaten it.
The result in Gretta’s case generalises to the extent that causes are overdetermined by their effects but effects are not overdetermined by their causes. Overdetermination of effects by causes does of course happen – as when the victim is simultaneously shot by several assassins – but it is relatively rare, and even when it happens the effect is overdetermined only by a handful of events. By contrast, the leaving of traces is ubiquitous – and (or so Lewis needs to think) the extent of overdetermination, in any given case, is much greater than in cases of cause-to-effect overdetermination. Both of these, however, are contingent features of the actual world (or so Lewis claims; but see §2.1 below).
In general, then, the symmetric analysis of similarity and the de facto asymmetry of overdetermination together imply that worlds that accommodate counterfactual changes by preserving the actual past and allowing for divergence miracles are more similar to the actual world than worlds that accommodate such changes by allowing for convergence miracles that preserve the actual future. This fact in turn implies that, where the asymmetry of overdetermination obtains, the present counterfactually depends on the past, but not on the future.
1.3 Transitivity and Preemption
As Lewis notes (1973b), causal dependence between actual events is sufficient for causation, but not necessary: it is possible to have causation without causal dependence. A standard case of ‘pre-emption’ will illustrate this. Suppose that two shooters conspire to assassinate a hated dictator, agreeing that one or other will shoot the dictator on a public occasion. Acting side by side, assassins A and B find a good vantage point, and, when the dictator appears, both take aim (events a and b respectively). A pulls her trigger and fires a shot that hits its mark, but B desists from firing when he sees A pull her trigger. Here assassin A’s actions (such as her taking aim) are causes of the dictator’s death, while B’s actions (such as his taking aim) are merely preempted potential causes. (Lewis distinguishes such cases of preemption from cases of symmetrical overdetermination in which two processes terminate in the effect, with neither process preempting the other. Lewis believes that these cases are not suitable test cases for a theory of causation since they do not elicit clear judgements.) The problem raised by this example of preemption is that both actions are on a par from the point of view of causal dependence: had neither A nor B acted, then the dictator would not have died; and if either had acted without the other, the dictator would have died (but see Northcott 2018 for the claim that pre-emption does not, in fact, undermine identifying causation with counterfactual dependence).
To overcome this problem Lewis extends causal dependence to a transitive relation by taking its ancestral. He defines a causal chain as a finite sequence of actual events c, d, e, … where d causally depends on c, e on d, and so on throughout the sequence. Then causation is finally defined in these terms:
(4) c is a cause of e if and only if there exists a causal chain leading from c to e.
Given the definition of causation in terms of causal chains, Lewis is able to distinguish preempting actual causes (such as a) from preempted potential causes (such as b). There is a causal chain running from a to the dictator’s death, but no such chain running from b to the dictator’s death. Take, for example, as an intermediary event occurring between a and the dictator’s death, the bullet from A’s gun speeding through the air in mid-trajectory. The speeding bullet causally depends on a, since that particular bullet would not have been in mid-trajectory had A not taken aim; and the dictator’s death causally depends on the speeding bullet, since by the time the bullet is in mid-trajectory B has refrained from firing so that the dictator would not have died without the presence of the speeding bullet. (Recall that we are not allowed to ‘backtrack’: it is not true that if the bullet had not been mid-trajectory A would not have taken aim, and hence it is not true that had the bullet not been mid-trajectory B would have fired after all.) Hence, we have a causal chain, and so causation. But no corresponding intermediary can be found between b and the dictator’s death; hence b does not count as a cause of the death.
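Definition (4) treats causation as the ancestral of causal dependence, which on a finite set of events amounts to reachability through one or more steps of dependence. The sketch below makes that concrete for the assassin case. It does not evaluate any counterfactuals: the dependence facts in `dependents` are simply stipulated to match the verdicts described above, and the event labels and function name are illustrative assumptions.

```python
from collections import deque
from typing import Dict, Set


def is_cause(c: str, e: str, dependents: Dict[str, Set[str]]) -> bool:
    """Return True if a chain of stepwise causal dependence leads from c to e.

    `dependents[x]` is the (stipulated) set of events that causally
    depend on x. A chain needs at least one step of dependence.
    """
    frontier = deque(dependents.get(c, ()))
    seen: Set[str] = set()
    while frontier:
        d = frontier.popleft()
        if d == e:
            return True
        if d in seen:
            continue
        seen.add(d)
        frontier.extend(dependents.get(d, ()))
    return False


# Stipulated dependences for the preemption case:
# the speeding bullet depends on a (A's taking aim), and the death depends
# on the speeding bullet. The death does not depend directly on a or on b,
# and nothing on the way to the death depends on b (B's taking aim).
dependents = {
    "a": {"bullet"},
    "bullet": {"death"},
}

a_causes_death = is_cause("a", "death", dependents)
b_causes_death = is_cause("b", "death", dependents)
```

Here `a_causes_death` is true even though the death does not depend directly on a, while `b_causes_death` is false because no chain starts from b.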
Lewis’s definition of causation also delivers the result that causation is a transitive relation: whenever c causes d and d causes e, it will also be true that c causes e. The transitivity of causation fits with at least some of our explanatory practices. For example, historians wishing to explain some significant historical event will trace the explanation back through a number of causal links, concluding that the event at the beginning of the causal chain is responsible for the event being explained. As we shall see later, however, some authors have claimed that causation is not in fact transitive.
1.4 Chancy Causation
So far we have considered how the counterfactual theory of causation works under the assumption of determinism. But what about causation when determinism fails? Lewis (1986c) argues that chancy causation is a conceptual possibility that must be accommodated by a theory of causation. Indeed, contemporary physics tells us the actual world abounds with probabilistic processes that are causal in character. To take a familiar example (Lewis 1986c): suppose that you mischievously hook up a bomb to a radioactive source and Geiger counter in such a way that the bomb explodes if the counter registers a certain number of clicks within ten minutes. If it happens that the counter registers the required number of clicks and the bomb explodes, your act caused the explosion, even though there is no deterministic connection between them: consistent with the actual past and the laws, the Geiger counter might not have registered sufficiently many clicks.
In principle a counterfactual analysis of causation is well placed to deal with chancy causation, since counterfactual dependence does not require that the cause was sufficient, in the circumstances, for the effect – it only requires that the cause was necessary in the circumstances for the effect. The problem posed by abandoning the assumption of determinism, however, is that pervasive indeterminism undermines the plausibility of the idea that – preemption and overdetermination aside – effects generally counterfactually depend on their causes. In the Geiger counter case above, for example, suppose that the chance of the bomb exploding can be altered by means of a dial. (A low setting means the Geiger counter needs to register a lot of clicks in order for the bomb to go off in the next ten minutes, thus making the explosion very unlikely; a high setting means it needs to register very few clicks, thus making the explosion very likely.) The dial is on a low setting; I increase the chance of the bomb exploding by turning it up. My act was a cause of the explosion, but it’s not true that, had I not done it, the bomb would not have exploded; it would merely have been very unlikely to do so.
In order to accommodate chancy causation, Lewis (1986c) defines a more general notion of causal dependence in terms of chancy counterfactuals. These counterfactuals are of the form “If A were the case Pr (C) would be x”, where the counterfactual is an ordinary would-counterfactual, interpreted according to the semantics above, and the Pr operator is a probability operator with narrow scope confined to the consequent of the counterfactual. Lewis interprets the probabilities involved as temporally indexed single-case chances. (See his (1980) for the theory of single-case chance.)
The more general notion of causal dependence reads:
(5) Where c and e are distinct actual events, e causally depends on c if and only if, if c had not occurred, the chance of e’s occurring would be much less than its actual chance.
This definition covers cases of deterministic causation in which the chance of the effect with the cause is 1 and the chance of the effect without the cause is 0. But it also allows for cases of irreducible probabilistic causation where these chances can take non-extreme values, as in the Geiger-counter-with-dial example above. It is similar to the central notion of probabilistic relevance used in probabilistic theories of type-causation, except that it employs chancy counterfactuals rather than conditional probabilities. (See the discussion in Lewis 1986c for the advantages of the counterfactual approach over the probabilistic one. Also see the entry on probabilistic causation.)
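The comparison in (5) can be put in a few lines of code, purely as an illustration. Definition (5) does not say how much less counts as "much less", so the sketch assumes a multiplicative threshold (`factor`) whose default value is arbitrary. The chance values for the dial example are hypothetical numbers, and the chance that e would have had without c is supplied as an input, not derived from any counterfactual semantics.

```python
def chancy_depends(actual_chance: float, chance_without_c: float,
                   factor: float = 10.0) -> bool:
    """Toy test for chancy causal dependence in the sense of (5).

    ASSUMPTION: 'much less' is read as 'smaller by at least `factor`';
    the default of 10 is an arbitrary illustrative choice.
    """
    return (chance_without_c < actual_chance
            and chance_without_c * factor <= actual_chance)


# Deterministic limiting case: chance 1 with the cause, 0 without it.
deterministic = chancy_depends(1.0, 0.0)

# Dial example with hypothetical numbers: turning the dial up raises the
# chance of the explosion from 0.01 to 0.9.
dial = chancy_depends(0.9, 0.01)

# No dependence when removing c would leave the chance unchanged.
unchanged = chancy_depends(0.5, 0.5)
```

On these inputs `deterministic` and `dial` are true and `unchanged` is false; a different choice of `factor` would shift the borderline cases, which reflects the vagueness of "much less" in (5).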
The rest of the theory of chancy causation follows the outlines of the theory of deterministic causation: again, we have causation when we have one or more steps of causal dependence.

