Preventive Stupidity Exists

In the world of Orwell’s 1984,

To the end of suppressing any unorthodoxy, the [ruling] Party inculcates self-deceptive habits of mind to the inner and outer members, thus crimestop (“preventive stupidity”) halts thinking at the threshold of politically-dangerous thought.

Three sayings popular in scientific discussions show that in our world, preventive stupidity exists, and works. In a comment, Kim Øyhus has brought my attention to this.

1. Absence of evidence is not evidence of absence. Øyhus explains why this is wrong. That such an Orwellian saying is popular in discussions of data suggests there are many ways we push away inconvenient data.

2. Correlation does not equal causation. In practice, this is used to mean that correlation is not evidence for causation. At UC Berkeley, a job candidate for a faculty position in psychology said this to me. I said, “Isn’t zero correlation evidence against causation?” She looked puzzled.

3. The plural of anecdote is not data. How dare you try to learn from stories you are told or what you yourself observe!

Orwell was right. People use these sayings — especially #1 and #3 — to push away data that contradicts this or that approved view of the world. Without any data at all, the world would be simpler: We would simply believe what authorities tell us. Data complicates things. These sayings help those who say them ignore data, thus restoring comforting certainty.

Maybe there should be a term (antiscientific method?) to describe the many ways people push away data. Or maybe preventive stupidity will do.

74 Replies to “Preventive Stupidity Exists”

  1. While these sayings are used incorrectly many times, there are good reasons these types of arguments are mentioned in public discourse. Rhetorical strategies in the public space often make use of really bad arguments that are countered by those sayings. Your annoyance is how much of a crutch these arguments are among people who should be arguing in good faith and at a higher level than average.

    1. “Absence of evidence is not evidence of absence.” This was a very useful concept to point out to those people who believed that because nominal national US housing prices had never dropped before, they wouldn’t drop in the future.

    2. Correlations are assumed to be causation in many cases where it is incorrect. People make arguments all the time that point out correlations, think up some mechanism that explains why the correlation is causation, but fail to even consider whether the correlation is actually coming from an underlying factor driving both conditions.

    3. Politicians always like to bring out a few people who would be really helped or harmed by the policies. Obvious examples include the widow in need of medical care for her sick kids or Joe the Plumber who would be caught in the higher tax bracket.

    Obviously there are places where arguments 1 through 3 work. So the main problem isn’t about the arguments themselves, but that people use these responses when they assume the person is either arguing in bad faith or is relatively ignorant when that isn’t the case. Furthermore, they often feel like 1 through 3 are enough to settle the question entirely when more argumentation is needed to make their point.

  2. “This was a very useful concept…” Sure, and 1 + 1 = 3 can also be a useful concept, “useful” in the sense of supporting a conclusion you want to reach. It’s still misleading. If someone thinks housing prices will always go up, don’t reply with a slogan, find some evidence that contradicts that belief. Reply with evidence.

    If you think Data X is misleading, fine. You might be right. If you’re right, you should be able to present Data Y that points more conclusively somewhere else. In other words, misleading data is best answered with more data — not sloganized away. Here’s my anti-slogan: Data, not slogans.

  3. How about this one? “You can’t prove a negative”. That aphorism is false, as long as you assume that to prove something means to establish it to such a high degree of certainty that it would be perverse to withhold at least provisional acceptance.

  4. Kim, thank you for pointing this out. Maybe I will be mentioning it for years to come. Orwell’s book has lasted a long time and this was one of its main points.

    Alex, yes, “you can’t prove a negative” is another example. I haven’t heard it in a long time, though.

  5. Can’t you often convert an absence of evidence into positive contradictory evidence?

    For a hypothesis to have meaning, it must impart some prediction. Using miracle healings as an example, I would expect at least one properly documented departure from general clinical experience. Say, a cancer remitting faster than cancer cells can be absorbed by the body. That there is no such document (that I know of) contradicts the hypothesis.

    Am I missing anything?

  6. But if we invert these we get even wronger maxims:

    1. If something has never happened, it probably never will happen.

    2. When two things happen together, one usually causes the other.

    3. A few examples of something mean it usually happens.

    I can’t help wondering if you are following these principles when you reach conclusions that seem incorrect.

  7. No. 2 is actually true: correlation does not equal causation, but it is a necessary (and insufficient) requirement for it. That is, with only correlation or the lack of it you can disprove causation but not prove it. Thus it isn’t equal to causation, although it is linked to it.

    No. 1 and No. 3 depend on the definition of “evidence” and the number of said anecdotes, as well as the amount of data they contain (most anecdotes contain next to none).

  8. When the general public hears or reads correlation, they are thinking it means a correlation near 1.0. Rightly or wrongly, to them a correlation of 0.0 is just random stuff happening. #2 is correctly used when you have two types of events that can, in fact, both occur because of an underlying cause, but do not directly cause each other. It may be that every time it gets hot I drink a beer and my dog drinks water, but in general, I don’t have a beer every time my dog drinks water (it might be fun, but unproductive), and my dog does not feel the need to drink water whenever I have a beer. I use #3 whenever someone tries to tell me the equivalent of “the cousin of this guy that my friend’s sister met on a bus” makes an assertion that is untestable / unimaginable / unmeasured. Anecdotes are stories, not measurements and yes, “How dare you try to learn from stories you are told or what you yourself observe” is a reasonable response in many cases. For example, hallucinations that are told to me or that I, myself, observe are no basis for the existence of ghosts or gods.

  9. Hal,

    The claim is that the maxims are false, which means that their negation is true. So to properly “invert”, simply negate the sentences and remove double negatives:

    1. Absence of evidence is evidence of absence.
    2. Correlation equals causation.
    3. The plural of anecdote is data.

    #1 is amenable to logical proof as shown. #2 is false on its face, but the obvious implication from the discussion is not that the statement is false per se, but deployed as code for something that is in fact false. If #3 is false I’m not sure why people continue to do case studies.
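
    One standard way to make the argument for #1 precise is Bayesian. The sketch below is not necessarily the form of the proof referred to above, and its symbols are assumptions introduced here: H is the hypothesis that the thing exists, E is the event that evidence of it is observed, 0 < P(H) < 1, and evidence is assumed more likely when H is true, i.e. P(E | H) > P(E | ¬H).

```latex
\begin{align*}
P(\lnot E \mid H) &= 1 - P(E \mid H) < 1 - P(E \mid \lnot H) = P(\lnot E \mid \lnot H), \\
P(\lnot E) &= P(\lnot E \mid H)\,P(H) + P(\lnot E \mid \lnot H)\,P(\lnot H) > P(\lnot E \mid H), \\
P(H \mid \lnot E) &= \frac{P(\lnot E \mid H)\,P(H)}{P(\lnot E)} < P(H).
\end{align*}
```

    Under these assumptions, failing to observe evidence lowers the probability of H. How much it lowers it depends on the gap between P(E | H) and P(E | ¬H), which the derivation leaves open.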

  10. You know, I really appreciate the post because it’s made me think. I have said all 3 of those and I want to rethink why I do. Now, you suggest I’m doing this in order to push away data I don’t like. I don’t think I am (of course, cognitive bias being what it is, I wouldn’t, would I?). In Saying 1, I’m trying to make the point that not having any good data or examples doesn’t necessarily mean it’s so. But, at least, I should say that it’s not PROOF of absence. Except now I have to sort of believe in elves. In Saying 2, I’m talking about the post hoc logical fallacy, though a correlation can at least be suggestive, if not proof. In the last, I think we’re really trying to make the point that just because you had an experience and someone else did too, that doesn’t mean that you’re done. Is that slogan too vague to mean that an n of 2 or 3 ain’t enough?

  11. As a general rule, I like your posts, but this one bothers me.

    There are many excuses and tricks people use to alleviate cognitive dissonance. These three are no worse than any other, and are in fact actually useful logical tools – unlike many of the other ways.

    I REALLY don’t like the criticism of ‘plural of anecdote’ – because it is painfully true, and it points out a MAJOR weakness in human cognition. We ALL tend to overestimate the prevalence of opinions and ideas that we hear often – whether we agree with them or not. We have the same problem in risk assessment. Ask most people whether they are more likely to be struck by lightning or die in a terrorist attack. We hear more about child abductions so they must be on the rise – also not true.

    It has *nothing* to do with learning or not learning from your experiences. It has everything to do with making sure what you do learn is both useful and true. It is a defense against over-generalization, and a reminder that ‘most of the time’ needs only happen 50.1% of the time to be true.

  12. Alrenous, I believe that applying the saying to controlled experiments would rather strain its heuristic value.

    All of these pearls of wisdom would likely have the word “necessarily” in there if not for simplicity’s sake. Even the more universally accepted heuristics produce absurd conclusions when taken to their logical extreme; that applies doubly to these rules of thumb.

  13. A few comments on each point:
    1. This can be usefully deployed in discussion when evidence is lacking and there hasn’t been much (or any) effort to find evidence. If you have evidence of everything, then sure an absence of evidence is evidence of absence. We don’t live in that world. This also really hinges on your ontology and on what kinds of physical traces you expect to find from the thing that you argue exists.

    2. This is true. A correlation between two variables does not necessitate a causal link between them. The fact that causation requires correlation is irrelevant to the previous point. I am shocked that the candidate didn’t realize this. To make this more concrete, umbrellas opening do not cause barometers to dip and low barometer readings do not open umbrellas, even though the correlation between them is close to 1.

    3. The plural of anecdote isn’t data because data assume some standardized way of gathering or collecting the individual datums. If your anecdotes are all comparable, then the plural of those anecdotes is data. In most cases this isn’t true. This, however, does not diminish the role of case studies (as someone above mentioned). They can excel in areas where large-n studies fail, such as handling complex causal processes. Case studies can also use process tracing to get at causality, which is often hard to do in large-n work.

    I agree that everyone should be open to information that contradicts their worldview or their theories, but in making that point you radically misinterpret some very good advice.
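
    The umbrella-and-barometer case in point 2 can be sketched as a small simulation. Everything in it is a hypothetical illustration: the 30% storm rate, the 95% response rates, and the function names are assumptions made up for the sketch, not measurements. It shows a common cause producing a strong correlation, and then shows that setting the umbrellas by hand does nothing to the barometer.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(days, force_umbrellas=None, seed=0):
    """Simulate days on which an approaching storm drives both variables.

    If force_umbrellas is True or False, umbrellas are set by hand
    (an intervention) instead of responding to the weather.
    """
    rng = random.Random(seed)
    barometer_low, umbrellas_open = [], []
    for _ in range(days):
        storm = rng.random() < 0.3  # hypothetical common cause
        # The barometer tracks the storm 95% of the time (assumed rate).
        low = storm if rng.random() < 0.95 else not storm
        if force_umbrellas is None:
            # People track the storm 95% of the time (assumed rate).
            opened = storm if rng.random() < 0.95 else not storm
        else:
            opened = force_umbrellas
        barometer_low.append(int(low))
        umbrellas_open.append(int(opened))
    return barometer_low, umbrellas_open


# Observational data: strong correlation through the common cause.
low, opened = simulate(10_000)
observed_r = pearson(low, opened)

# Intervention: forcing umbrellas open or shut leaves the barometer unchanged.
low_forced_open, _ = simulate(10_000, force_umbrellas=True)
low_forced_shut, _ = simulate(10_000, force_umbrellas=False)
rate_open = sum(low_forced_open) / len(low_forced_open)
rate_shut = sum(low_forced_shut) / len(low_forced_shut)
```

    In this toy model the observed correlation is strongly positive, yet the share of low-barometer days is identical whether umbrellas are forced open or shut. The correlation is still informative here: it points at the storm as something worth looking for.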

  14. Actually, I have to quibble just a little bit on number 3:

    “3. The plural of anecdote is not data. How dare you try to learn from stories you are told or what you yourself observe!”

    This reminds me of a comment attributed to Pauline Kael (erroneously, but the point still stands) when Nixon was elected: “I don’t know how Nixon won. No one I know voted for him.”

    Just because the stories you are told (else how would racism ever end when dealing with racist communities) or what you yourself observe (pick your favorite bigotry here) all say one thing does not necessarily make it true.

    At the same time, if the standard of truth is to be an empirical measurement of all possible data points, nothing will ever be decided. I suppose that’s the failure of aphorism…

  15. Then if (or rather, since) #1 is not true, Orwell is twice right, or rather his evil alter-ego, the character O’Brien & his Inner Party of Ingsoc. I can destroy the evidence, and CREATE the absence. “Collective solipsism, if you like….”

  16. Ryan, you write that these three slogans are “very good advice.” The underlying message of all three is “shut up”. I believe that is almost never good advice. Perhaps you can supply an actual example — not a made up one — where one of these three slogans was said and was “very good advice”?

  17. Comments on Seth’s observations

    1) It is…..

    2) It is not……

    3) It is…..

    ‘Preventive Stupidity’ is a real thing and can be found in all Red states.

  18. Correlation is a necessary but not sufficient condition to indicate a causal relationship. Lack of correlation is pretty good evidence of the null hypothesis.

    Multiple anecdotes do not equal data. Data are the result of experimental observation under carefully controlled, repeatable conditions. Unless you are a social scientist where data consists of careful observation of multiple anecdotes.

  19. It is true that zero correlation is evidence against causation. However, it does not follow from this premise that nonzero correlation is evidence for causation. A implies B does not mean that not A implies not B.

    The plural of anecdote is NOT data. Data is rigorous. Anecdote is random. But in between lies observation, semistructured and containing hints about how to obtain data.

    Kim Øyhus is wrong as he assumes that absence of evidence is the same as nonexistence of evidence. This ignores the possibility that evidence may not have been looked for yet, or may not have been looked for in the proper way.

    When one is trying to prevent stupidity, one should avoid being stupid oneself.
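
    The dependence on how hard anyone has looked can be put in numbers with Bayes' rule. This is a minimal sketch under stated assumptions: the function name, the 50% prior, and the three search-quality figures are all hypothetical values chosen for illustration.

```python
def posterior_given_no_evidence(prior, p_find_if_true, p_find_if_false=0.0):
    """P(H | no evidence found), by Bayes' rule.

    prior           -- P(H) before looking
    p_find_if_true  -- chance the search turns up evidence when H is true
    p_find_if_false -- chance of (spurious) evidence when H is false
    """
    miss_if_true = 1.0 - p_find_if_true
    miss_if_false = 1.0 - p_find_if_false
    p_no_evidence = miss_if_true * prior + miss_if_false * (1.0 - prior)
    return miss_if_true * prior / p_no_evidence


prior = 0.5  # hypothetical starting belief

# Nobody has looked: finding nothing was certain either way.
never_looked = posterior_given_no_evidence(prior, p_find_if_true=0.0)

# A casual look that would find real evidence 20% of the time (assumed).
casual_look = posterior_given_no_evidence(prior, p_find_if_true=0.2)

# A thorough search that would find real evidence 95% of the time (assumed).
thorough_search = posterior_given_no_evidence(prior, p_find_if_true=0.95)
```

    With these made-up numbers, an unsearched-for absence leaves the prior untouched, a casual look lowers it somewhat, and a thorough search lowers it a great deal. On this reading the two positions differ about how good the search was, not about whether absence can count as evidence.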

  20. Seth, it doesn’t sound like you’re listening to what people are saying. The first commentator is absolutely right.

    At least numbers two and three are simple admonitions against poor reasoning. I’m sure that some people are saying ‘shut up’ when they say these things, but certainly some people are saying ‘maybe you should be a little more careful in your thinking.’

  21. I can understand what you’re trying to say with #1 and #2, but your objection to #3 has a serious problem. It’s that there are 2 huge things stacked against our ability to know the truth through simple personal experience: Flawed perceptions and selection bias.

    Before you dismiss flawed perception, go research the reliability of eye-witness testimony and the malleability of memory. We absolutely SUCK at understanding and processing what we experience and constructing accurate models based on them. Really. It’s more likely that any memory you have is more wrong than it is right, no matter how accurate you THINK it is.

    Selection bias forces us to think that what happens to us is somehow common, even if we know it isn’t.

    Our brains are in no way required to get things right, only just right enough to not die (poison plant, man-eating beast, location of cliff etc.). That includes an enormous amount of false information. We just aren’t adapted to create correct assumptions.

    These glib (and occasionally inaccurate) sayings are just shorthand ways of pointing out common errors in the ways we judge evidence. They are only useful if the people using them actually understand what they expand to mean. They only begin to sound Orwellian if repeated like a mantra or a magic spell, devoid of meaning and thought.

  22. Sorry Seth, I disagree.

    These are useful heuristics, along with Hume’s and Occam’s razors. I think of them as a primary level BS filter.

    For example: so you’ve invented a perpetual motion machine, Seth? Interesting. Sounds like complete BS but I’m prepared to entertain the very slight possibility that you have actually discovered something unusual. Knock yourself out – show me the data. Get your results replicated. Mix, repeat, collect your Nobel. But until you do, don’t expect me to believe you just because the fact that nobody else has managed to do it somehow implies that your solution must be the correct one.

    The underlying message of these things is not about ‘shutting up’, it’s about not fooling oneself – with yourself being the easiest person to fool, to paraphrase Mr Feynman.

    Humans are fallible, unreliable and biased. This is why scientific method is used and why science is superior to ‘other ways of knowing’ in trying to understand how the universe works.

  23. Seth, I’ve never really interpreted these slogans as saying “shut up.” They just seem like things that you should keep in mind when you are working. Sometimes people run a regression (to find a correlation), see the result they expect, and then make a leap to assume that they have found a causal relationship. I can’t think of a specific example right now, but this kind of mistake shows up frequently in undergrad and grad student work and I know I have noticed instances of it in major journals. If you truly find that hard to believe I can dig up a specific example.

    I can’t help but notice that you didn’t actually respond to any of my three critiques. If your claim is that people can use these slogans to silence other people, or reduce the prestige of competing work, or block out information that doesn’t fit their worldview, then I agree. These slogans can be used poorly and their effect can be negative. Maybe I am just young, but I haven’t had these phrases used against me (yet) and when I have heard them I really have found them useful. I have learned from having people point out if I jump from correlation to causation. I have also learned from people challenging me on the generalizability of my findings (#3). I think that, like so many things, in the right hands these slogans can be used positively and in the wrong hands they can be used negatively. However, logically, I think that they are far more neutral than you are letting on.

  24. Ryan, you write: “Sometimes people run a regression (to find a correlation), see the result they expect, and then make a leap to assume that they have found a causal relationship. . . I know I have noticed instances of it in major journals. If you truly find that hard to believe I can dig up a specific example.” Yes, please dig up a specific example. I’m guessing that when you look closely at your example, you will see that the researchers did not “assume they have found a causal relationship” but rather “assumed [usually reasonably] they have found evidence that makes more plausible a causal relationship.” There’s a big difference.

    Ben, you say these are “useful heuristics”. Could you give an actual example? I gave two examples that supported my points. One was Kim’s experience, where “absence of evidence . . . ” was used to ignore a perfectly good point. The other was the case of the job candidate, where “correlation does not equal causation” had led a very smart and accomplished person into misunderstanding how correlation and causality relate.

    To other critical commentators: let me repeat, I gave examples. Not hypothetical ones, stuff that actually happened. None of you has done so. Of course I might be wrong but data-free arguments aren’t going to convince me of that, nor should they. Where’s the data?

    That’s another problem with these slogans: They encourage data-free argumentation.

  25. Okay, here is one quick example of how someone could have used a reminder that correlation is not causation (I just googled “example of correlation and causation mixup” and then started reading). In 1999, Nature published an article by Quinn, Shin, Maguire, and Stone entitled “Myopia and ambient lighting at night”. The general claim was that children that have lights on at night are more likely to develop myopia.

    Other later studies found that what was actually happening was that parents who were nearsighted were passing that on to their children. By coincidence, parents who were nearsighted also were more likely to increase the light in rooms at night, because they have poor eyesight. You can read about the later studies here:

    This is an excellent example of how, while ambient light and myopia in children correlate, it is wrong to assume causation. The original result was published in Nature.

    You do seem rather aggressive on this, so I can’t help but push back and ask you to respond to my original points. Do you really think that, given the world of imperfect information that we live in, absence of evidence is consistently evidence of absence? Do you actually think that your rebuttal to the job candidate said anything about the logically necessary relationship between correlation and causation? Do you really think that data is nothing more than a collection of (unsystematically gathered) anecdotes?

    Again, I understand how these slogans can be used negatively, but I think you are very wrong to call them “preventative stupidity” and I feel bad that you have obviously spent a lot of time dealing with closed-minded researchers. These slogans contain important lessons and they can be used well.

  26. Another weird point that I noticed as I skimmed the comments is that you are demanding data as evidence for arguments that are often based on pure logic. I would like to see you give me a data-based proof for 1+1=2. It isn’t going to happen. I’m starting to worry that you are intentionally misrepresenting people’s comments in your responses.

  27. Seth, the utility of each one of those heuristics is demonstrated by counter-examples:

    The anti-vaxers who are convinced that autism is caused by vaccinations.

    The AIDS denialists who believe that HIV does not cause AIDS.

    The 911-truthers who think the US government blew up the twin towers as part of some arcane conspiracy.

    AGW contrarians who think that blog science trumps 150 years of physics.

  28. Hi Seth,

    I do really like the general argument you make here but I have to chime in with some more support for ‘correlation does not equal causation’ being not only good advice with a proper understanding of its true meaning, but also logically true.

    In your example the main problem seems to be that the person has an educational level which should allow them to differentiate between “absence of correlation IS absence of causation” and “correlation does not (always) equal causation”. The implied ‘always’ is the big clue here. She should have known better.

    By contrast a young child could draw the conclusion that birdsong brings the newspaper which is a clear example of causation and correlation being different. By pointing out that the same cause (the sun rising) brings both effects you encourage them to think more deeply about problems.

    In fact ‘correlation is not causation’ should be used to encourage greater data gathering rather than a cessation. If there is a correlation it is a great idea to determine if it’s causal or resulting from an additional extra cause.

    The third paragraph of the relevant wikipedia article gives a very good clinical example of why correlation should be further examined to discover if it is or is not causation.


    PS. I’m also forced to point out that your ‘example’ is an anecdote and not data. By your own third point it is merely a chance to go explore new data 😉

  29. Pjcamp, you have misunderstood my proof a little.
    Even though absence of evidence and nonexistence of evidence are different, they look the same, and give the same conclusion in my proof. This is good, because it is more general. Since it is a proof, it is true.

    There are only 2 things to do to get knowledge:
    1. Gather more data.
    2. Get better explanations, models, theories. (These are fundamentally the same)

    When people do neither, I take this absence as evidence they have nothing to contribute.

    This may be the sanest discussion I have seen involving my proof. The usual reaction is to arrogantly misunderstand it, claim faith is better than proof, etc.

  30. I have used #2 on multiple occasions and always as a response to conclusion-jumping. Mostly in response to arguments such as (and I do not quote directly) “People of specified ethnic group(s) are overrepresented in the statistics of criminal act X; therefore persons of said ethnic group(s) are by nature more prone to commit said act.” Persons using this line of argument almost always have a political agenda attached to it that is fundamentally not based in fact but in emotion. They focus on one factor that suits that agenda and ignore all other data. They are not interested in other correlations that disprove the importance of the ones that support their agenda. And because they already know what results they wish to find, their logic will always be corrupt.

    One augmentation one could propose is “Correlation does not equal causation but is a cause for further open minded inquiry.”

    To use a completely different example where correlation = causation leads to faulty conclusions I would like to mention the phenomenon of the Cargo Cult.

  31. I believe all three of these statements can be valid as arguments in themselves. They are absolutely not proofs of one thought, nor do they read as such:

    1. Absence of evidence is not evidence of absence. Just because you haven’t seen evidence of something doesn’t mean it doesn’t exist – it still might. This is a valid argument.

    2. Correlation does not equal causation. Just because a particular scenario exists in some circumstances alongside another scenario does not mean that the one caused the other – although this is possible, its mere correlation does not warrant proof of causality. This is a valid argument.

    3. The plural of anecdote is not data. By definition, anecdotal evidence cannot be used as data, at least as far as any scientific study is concerned because by the very definitions (which may vary slightly by source, this set gathered from wikipedia for expedience although more traditional sources will still support my argument):
    – (1) Evidence in the form of an anecdote or hearsay is called anecdotal if there is doubt about its veracity; the evidence itself is considered untrustworthy.
    – (2) Evidence, which may itself be true and verifiable, used to deduce a conclusion which does not follow from it, usually by generalizing from an insufficient amount of evidence.
    If your argument is that anecdotal evidence is just evidence that has not been set in a laboratory environment then perhaps enough of it could be used as data but the term data means groups of information that represent the qualitative or quantitative attributes of a variable or set of variables. Being ‘anecdotal’ means colloquially that the data has no compensated or measured qualitative or quantitative measurements to put with other ‘data’ in order to reach a conclusion.

    I believe you have a very valid argument that these three errors in logic are misused to dismiss data that could very well prove valid. I also believe that your argument against them swung the pendulum too far in the other direction by basically stating (unless I misread your intentions) that anyone who uses these as fallacies of logic has missed the point and is incorrect.

  32. I’ve been thinking about this all night, and I’ve finally come to a recognition of what you’re experiencing:

    The missing negative.

    Suppose, if you will, that you have a thousand hypotheses guided by absence of evidence. When you stop to collect data, will you see evidence of absence at a statistically significant rate?

    Suppose, if you will, that you recognize a thousand correlations. Upon further analysis, will you see that a statistically significant number of them are causations?

    Finally, listen to a thousand n=3 unstructured and uncontrolled stories. Upon actually generating solid experimental data behind all one thousand, how many of the stories will be significantly shown to have reflected reality?

    You’ll note that I’m not telling you the answers to these questions. But this is the path you should be considering. The problem with following absence of evidence, correlation, and anecdote is that when you actually do get data, the agreement rates are simply not high enough. Prediction rates are too low.

    I may be the first person ever to tell you you just weren’t sufficiently meta 🙂

    (That being said, all three are very good tools for forming questions and hypotheses.)

  33. On “correlation is not causation”: as many others have noted, correlation CAN be a necessary but insufficient condition of causality. However, correlation only reliably detects linear relationships, so there are good reasons to deny that correlation = causation.

    Over on Boing Boing someone brought up the example of the relationship between “human ability to live” and “levels of oxygen”. At low levels of oxygen the ability to live is low, at medium levels it is high, and at high levels it is again low. There is a perfectly causal relationship between levels of oxygen and ability to live, but no linear relationship. And the same goes for every non-linear relationship you can care to think about.
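
    The oxygen example can be checked with a few lines of arithmetic. This is a sketch under assumptions: the inverted-U "viability" curve and its units are invented for illustration, and the only thing it borrows from the example is the shape (low, then high, then low again).

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical curve: viability peaks at a mid-range oxygen level and
# falls off symmetrically on either side. Units are arbitrary.
oxygen = list(range(0, 21))  # levels 0, 1, ..., 20
viability = [100 - (level - 10) ** 2 for level in oxygen]

# Across the full range the rising and falling halves cancel out.
r_full_range = pearson(oxygen, viability)

# Restricted to the rising half, the same relationship looks strongly linear.
r_low_half = pearson(oxygen[:11], viability[:11])
```

    Here viability is completely determined by oxygen, yet the linear correlation over the full range is zero up to rounding error, while over the lower half alone it is strongly positive. So "no correlation" in the Pearson sense rules out a linear dependence over the sampled range, not every kind of dependence.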

  34. It took me a while to untangle the negatives to make sense of this post (i.e. “‘correlation does not equal causation’ is not true”.) What I found was that spinning them into simpler truisms made more sense (although my logic terminology is very very rusty):

    1. Absence of evidence is evidence of absence.
    2. Causation implies correlation. (That is, causation -> correlation; therefore (no correlation) -> (no causation) but correlation does not imply causation.)
    3. Data (deliberately collected information) is stronger evidence than anecdotes (arbitrary information).

    They are all intertwined as well, which I think confuses the lay person.

    Anecdotes can refute causation but not correlation: causation requires stronger evidence than correlation. Thus, anecdotal evidence does not refute evidence of absence. Because data is stronger evidence than anecdotes, absence of anecdotes alone is poor evidence of absence.

  35. Kim, very well put about the two ways to increase knowledge. I completely agree.

    Ryan, I looked up your example — the paper where you say the authors “could have used reminding” that correlation does not equal causation. Here’s what that paper concluded:

    Although it does not establish a causal link, the statistical strength of the association of night-time light exposure and childhood myopia does suggest that the absence of a daily period of darkness during early childhood is a potential precipitating factor in the development of myopia.

    The authors didn’t need reminding.

  36. Seth:

    “Ryan, I looked up your example — the paper where you say the authors “could have used reminding” that correlation does not equal causation. Here’s what that paper concluded: […] Perfectly good conclusion.”

    But this is still an example where correlation did not imply causation, and the authors’ conclusions, however hedged, turned out to be wrong. A more scientific approach would have been to publish the paper saying that there was a correlation, nothing else, and then start a study to find out whether the link was actually causal, or whether there was a separate underlying cause.

    Your “gotcha” question to the job candidate still isn’t remotely valid. What the job candidate *should* have said, if she had been on her toes, was

    “No correlation may imply no causation, but that doesn’t prove that correlation implies causation. Correlation may be a necessary feature of causation, but it is not a sufficient feature. Since it’s necessary, we can say that the absence of correlation implies the absence of causation, but since it’s not sufficient we can’t say that correlation implies causation.”

    To put it another way, being able to read does not imply that one is a tenured professor. “But,” you say, “doesn’t *not* being able to read imply that one must *not* be a tenured professor?” Yes… but that still doesn’t suggest that being able to read implies one is a tenured professor.

  37. I fear the post I just typed was lost in the aether. I’ll try again.

    Dambisa Moyo’s book Dead Aid is another good example of mixing up correlation and causation. Here are two reviews from across the political spectrum that mention the issue:

    Her basic problem is that aid can cause poverty (or low savings, or low economic growth) or respond to poverty. An OLS regression won’t tell you what is going on, and she makes a very forceful argument for causation from this correlation. It is interesting to note that in the introduction Niall Ferguson distances himself from Moyo, “The correlation is certainly suggestive, even if the causation may be debated.”

    Moyo should have been reminded that correlation≠causation. For a contrasting take, here is a responsible scholar reviewing the state of the literature:

  38. Ryan, I believe Moyo knows quite well that “correlation does not equal causation”. To convince me that the saying is helpful, please give an example — a case where it was used and seemed helpful. In my experience, someone makes a good point, well aware that a correlation doesn’t prove causation but does increase its plausibility (e.g., the ambient lighting paper), and then someone makes this data-denying comment. No help at all. If I say “1 + 1 = 2” and you say “1 + 1 doesn’t equal 3” it isn’t helpful.

  39. “Absence of evidence is evidence of absence.”

    Reading the proof involving this has had me going for the past few hours. Let me provide two personal academic experiences, one in favor of the claim and one against it.

    When I was in law school, absence of evidence was evidence of absence. This is also true for legal practice. If you do not have evidence to back up your accusation, then your accusation is false. The reasoning behind it is obvious.

    But when I was an undergrad in Philosophy, we were taught to think that absence of evidence was, well, (usually) absence of evidence. Take the question of evolution vs. creation. The last class I took as an undergrad dealt with ethics, religion, and the human experience in general as evolutionary constructs. We read David Sloan Wilson (Darwin’s Cathedral), Kim Sterelny (Thought in a Hostile World/The Evolution of Human Cognition), that sort of thing. The instructor was absolutely clear that lack of evidence did not exclude possibility. Yes, it is technically possible that we were created in seven days by a divine being. Similarly, it is technically possible that giant, purple monkeys are floating around the dark side of Pluto. But is either probable? Probably not, when you look at the body of evidence, so we will assume, for the sake of the class, that evolution is correct. We were taught not to exclude either as bona fide fact, but to “place our chips” on probability, not possibility. This was especially true for Epistemology.

    Perhaps the instructor was just making a PC move to make the class sound more palatable to creationists. I am not an instructor, but I am sure there was also some conditioning going on in both cases. Law school professors condition their students to think one way, because it’s how the profession works. Same thing with Philosophy professors. I can also see the benefits of the practical approach of the law vs. the speculative approach of Philosophy within each discipline. The former approach efficiently cuts out all the junk in a no-nonsense fashion, while the latter opens the gate for all sorts of wild speculation, much of which could be false. The former approach also distills the most “useful stuff” out of the “information sludge.” So if one of the goals of science is to provide useful, applicable information, it is preferable for scientists to think like lawyers rather than philosophers.

    I can also see the truth to the law school approach. If someone says there is a cat in my room, and I search my room thoroughly and find no cat, then there is no cat in my room. But conversely, there are also these things called externally unverifiable truths, questions that can’t, by their very definition, be currently proven outside of any individual, questions like life on other planets, God, etc. All externally unverifiable claims are false under the law school approach, even if they aren’t false, regardless of how rational they may be.

    So, doesn’t the truth of “Absence of evidence is evidence of absence,” or “Absence of evidence is not evidence of absence,” boil down to the goals of the actor? Don’t you adopt one stance or the other depending on what you are trying to accomplish? Isn’t it just a matter of functionality?

  40. Buster above gives real world examples of how mistakenly thinking that correlation implies causation leads to real human misery. People usually use statement #1 to justify their belief in God, people may sometimes use statement #3 to ignore useful data (though it seems more like a snarky response to someone pointing out that their grandparents all smoked and never got cancer), but honestly statement #2 is the proper response to a logical mistake that people make all the time and is very literally true. Examples in real life abound; watch FOX News for five minutes.

    You say that it is important not to apply these statements liberally because we use them to ignore data that does not fit our world view. Statement #2, however, does not say anything about data, it says something about the conclusions being drawn from data.

    The comments here contain multiple anecdotes of real life mistakes being made and harm being done by believing that a correlation was a causation. If these multiple anecdotes are data (not #3) then should this be evidence that the statement is a useful reminder? Or is this data too inconvenient for your world view?

  41. “Buster above gives real world examples.” Buster’s second example, about Cargo Cults, is irrelevant. Of course people sometimes draw conclusions from correlations (doing X might help) that turn out to be wrong (doing X doesn’t help). It would be unwise not to test our imperfect conclusions. By testing them, we learn. In the first case (racial differences) I have no idea what happened after Buster said what he said. So it’s hard to tell if what Buster said was helpful. (I claim these three sayings aren’t helpful.) Did Buster use the saying to push away data he didn’t like? Possibly. Another problem with these examples is that neither of them involves “scientific discussion” which is what I’m talking about here.

    bennetta, I think it’s unfortunate that your philosophy professor didn’t teach you that absence of evidence for X reduces the plausibility of X, in the sense that the greater the absence, the lower the plausibility. Your belief in X can start off anywhere: high, medium, low. But as absence of evidence increases (as more time passes, for example, without evidence for X showing up), your belief in X should go down. This is one of the points made on Kim’s webpage. I haven’t seen it made elsewhere.
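
    The point about belief falling as absence accumulates can be sketched with repeated Bayesian updating. This is only an illustration: the function names and the detection probability of 0.3 are assumptions made here, not anything taken from Kim's webpage. Each "search" is assumed to have some chance of turning up evidence if X is true and no chance if X is false.

```python
def update_after_no_evidence(prior, p_find_if_true, p_find_if_false=0.0):
    """Posterior P(X) after one search that turned up no evidence (Bayes' rule)."""
    miss_if_true = 1.0 - p_find_if_true    # chance of an empty search when X is true
    miss_if_false = 1.0 - p_find_if_false  # chance of an empty search when X is false
    numerator = miss_if_true * prior
    return numerator / (numerator + miss_if_false * (1.0 - prior))


def belief_trajectory(prior, p_find_if_true, n_searches):
    """Belief in X after 0, 1, ..., n_searches consecutive empty searches."""
    beliefs = [prior]
    for _ in range(n_searches):
        beliefs.append(update_after_no_evidence(beliefs[-1], p_find_if_true))
    return beliefs


if __name__ == "__main__":
    # Starting beliefs can be high, medium, or low; all of them drift downward.
    for start in (0.9, 0.5, 0.1):
        path = belief_trajectory(start, p_find_if_true=0.3, n_searches=5)
        print(start, [round(b, 3) for b in path])
```

    Under these assumptions the belief drops with every empty search but never reaches exactly zero, and a search that could not have found anything (`p_find_if_true=0`) leaves the belief unchanged.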

  42. I think the problem is that these sayings are used in different ways by different people.
    For a working scientist, each of these is a request for more data, or more information about the current data. It often reflects considerable background knowledge about the particular things being measured. In the myopia example above, a discussion about correlation and causation does involve the logical implication of a correlation between night light and myopia, but this is also in the context of what these scientists know about the mechanisms of lens and corneal distortion, and genetic mechanisms in eye development.
    The same goes for “the plural of anecdote is not data” – scientists certainly use anecdotes in reasoning as any other human does, but they use these as inspirations for controlled experiments. Much of psychology begins with a skepticism of common sense (I know Seth knows this, but others might be interested in Stanovich’s How to Think Straight about Psychology, or Lilienfeld et al’s recent 50 Great Myths of Popular Psych).
    So, I think each of these aphorisms can be deployed as requesting more data (or more information and context about the current data), reflecting the skepticism of science.
    But the danger is when one moves from being a skeptic, who is dubious but able to be convinced, to a cynic, who uses these sayings to dismiss all correlations, anecdotes, or absences of evidence as equally meaningless.
    When these sayings are wielded by a cynical layman (or sometimes even a cynical scientist) they can lead to preventive stupidity.
    The way I try to handle this in the psychology classes that I teach is to first try to get students to be skeptical (using at least the second two of these sayings). But then (and this usually has to wait until senior year, both for knowledge, and for maturity) I try to get them to see that all correlations are not equally lacking in causative indication. For example, comparing the correlations between the IQs of identical and fraternal twins raised together and apart does let you know a bit about how complex the nature and nurture question is when it comes to (at least this particular kind of) intelligence. But evaluating these correlations is not a purely logical exercise; it needs a fair amount of background knowledge.

  43. Regarding 2: If A and B are correlated, then there are several possibilities:
    a. A causes B.
    b. B causes A.
    c. There is some C that causes both A and B.
    d. There is no causation. It is just a coincidence.

    If, for the sake of argument, the correlation looks good enough to exclude d, then, yes, you can say that you have reason to think there is causation. But the mischief enters when you decide what causes what.

    Some cases might not be too hard. Does the rain cause the umbrellas to appear? Or do the umbrellas cause the rain? Or do weathermen cause both the rain and the umbrellas?

    Even experts can have trouble in harder cases. Gary Taubes discusses one in Good Calories, Bad Calories in the chapter on the Conservation of Energy:
    Energy stored = EnergyIn – EnergyOut.
    The left side of the equation correlates with the right side (and correlates _very_ well, since they are always equal). But everyone jumps to the conclusion that the right side _causes_ the left side. And thereby we get a whole bunch of diet-and-exercise advice that doesn’t seem to work very well.

    We could think of a bunch more examples involving global warming and economics and so forth – some silly, some serious.

    Just knowing that there is _some_ causation doesn’t settle much. Thomas Aquinas said almost a millennium ago that every event is caused by another event. The correlation seems very good – events seem to always happen right after other events happen.

  44. Cedar, I agree that teaching students that correlations vary in persuasiveness is a good idea. If everyone who said “correlation does not equal causation” could name 10 important research projects that began with the observation of a correlation . . . the world would be a better place.

  45. Seth,

    If your slogan is data, not slogans, why are you proclaiming the value of an absence of data?

    Burdens of proof matter. The problem is that there is a tremendously large set of things for which we lack data. It is tempting to convert that into an information source (well, we don’t know that the core of the earth isn’t, in fact, made of giant radioactive turtles) but down that path lies insufficient predictive power.

    Your problem is you see the world through too clear a lens. Bear with me for a moment. You’re a really smart guy (and I enjoyed your diet, btw). Your experience of absent evidence comes only after a presumption that, were certain things to be true, there would be specific things present. So you’re seeing violated predictions, which is not actually the same thing as absent evidence.

    It’s similar for your experience of correlations and anecdotes. These things are preconditions for your entire processing framework. You’ve _already filtered out_ the useless correlations and anecdotes, a massive number of which could simply be dumped into this comment thread.

    And that’s OK. Without coarse-grained, loose data to guide us, we’d never be able to form a useful hypothesis to test. Exceedingly noisy data has its uses.

    What you’re missing is the alternate universe, where the burden of proof is shifted, where correlations guide policy, where one person’s story is enough to guide the reality we operate in. That world sucks, Seth. Consider the situation of alternative medicine. Some of it works. Damned if we have any idea what — there is so much noise pretending to be signal.

    What you’re missing about the alternate universe, specifically, is that not everybody is you, and there’s an infinite set of unevidenced correlated anecdotes that *you’d* filter and *they* wouldn’t that *the truth* would have to compete with.

  46. Seth, I am glad that I pursued this because I think I have found at least one core point of disagreement between us. Let me see if I state your views clearly:

    You think that correlation says something about causation because causation requires correlation. You think that finding a correlation between X and Y can make us more confident that X causes Y (than we were before finding the correlation).

    On this particular issue, what bothers me is that finding a correlation between X and Y might make us more confident that X and Y are related, but it says absolutely nothing about causal order (or spuriousness). Moyo uses a correlation (poverty/low savings & foreign aid) to argue for causation. You could just as easily use the same regressions to argue that poverty causes aid to increase, because aid is intentionally directed at poor countries.

    So, while finding a correlation should increase our confidence that something is linking X and Y together, it should not increase our confidence that X causes Y, because it adds an exactly equal amount of evidence to the argument that Y causes X. This is why we can’t move from correlation to causation.

    Obviously, pointing this out is useful as a starting point instead of an ending.

  47. Ryan, I’m afraid this is beside the point: I was writing about whether sayings such as “correlation doesn’t equal causation” are helpful, not about what can be inferred from correlations. But you’re right, we do disagree here. Let’s say X causes Y. Then usually we would expect them to be correlated. We start with four possibilities: X causes Y, Y causes X, both are caused by something else, and X and Y are unlinked. When we observe a correlation it reduces the plausibility that X and Y are unlinked. The plausibilities of all 4 possibilities must add to 1. Reducing one of the four must increase the sum of the other three. Thus unless we’re sure of an alternative explanation for the correlation, observation of the correlation should increase the plausibility that X causes Y.
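
    The four-possibility argument can be sketched as a single Bayesian update. The priors and likelihoods below are hypothetical numbers chosen only for illustration (equal priors, and a correlation assumed to be much more likely under any of the three linked hypotheses than under "unlinked"); nothing in the thread specifies them.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a set of mutually exclusive, exhaustive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


# Hypothetical starting plausibilities; they must add to 1.
PRIORS = {
    "X causes Y": 0.25,
    "Y causes X": 0.25,
    "common cause": 0.25,
    "unlinked": 0.25,
}

# Assumed P(we observe a correlation | hypothesis).
LIKELIHOODS = {
    "X causes Y": 0.8,
    "Y causes X": 0.8,
    "common cause": 0.8,
    "unlinked": 0.05,
}

if __name__ == "__main__":
    for hypothesis, p in posterior(PRIORS, LIKELIHOODS).items():
        print(f"{hypothesis}: {PRIORS[hypothesis]:.3f} -> {p:.3f}")
```

    With these assumed numbers, observing the correlation lowers the plausibility of "unlinked" and raises each of the other three. It raises "X causes Y" and "Y causes X" by the same amount, so the correlation alone does not say which direction the causation runs.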

  48. seth–

    If I may distill your argument, “Since lack of correlation makes causation less likely, correlation must make causation more likely.”

    Here is another logical statement in this family: “Since I am not drowning, it is more likely that I am flying.”

    Yep. That is entirely true. It is more likely. Very, very, microscopically more likely. And there’s your problem right there:

    You are allowing any increase in confidence to imply plausibility, no matter how small. And that, unfortunately, is untrue. Consider, for a moment, the number of metrics in the universe right now that have increased in the last year. They are all correlated, Seth.

    They are not all causative. In fact, the odds that any two randomly selected correlative metrics have even a remote chance of being reflective of a causative relationship are infinitesimal.

    Now, you can filter your correlative sets, looking for those where a relationship is plausible. But — and here’s the key — the plausibility doesn’t come from the correlation. The plausibility comes from the fact that you’re a really smart guy, with a theory of the world, against which you can compare sets of relationships. It is your theory that allows you to know where to look, to design your experiments.

    Mere correlations are rough filters. It is in fact the case that there are more metrics overall than there are correlative metrics. But it is a very, very, *very* low quality information source — good for exclusion, but wayyyyyyyyy too conflated for inclusion.
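
    The claim that everything which rose over the same period is correlated can be checked with a small simulation. This is a sketch under stated assumptions: the series are generated independently, each with its own randomly chosen upward drift plus noise, and the helper names, drift range, and noise level are all made up for the illustration.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def make_trending_series(rng, length=100):
    """One metric that drifts upward over time, generated independently of all others."""
    drift = rng.uniform(0.5, 2.0)
    return [drift * t + rng.gauss(0.0, 3.0) for t in range(length)]


def pairwise_correlations(n_series=10, seed=0):
    """Correlations between every pair of independently generated trending metrics."""
    rng = random.Random(seed)
    series = [make_trending_series(rng) for _ in range(n_series)]
    return [pearson(a, b) for a, b in combinations(series, 2)]


if __name__ == "__main__":
    correlations = pairwise_correlations()
    print("pairs:", len(correlations), "lowest correlation:", round(min(correlations), 3))
```

    None of these series influences any other, yet every pair comes out strongly correlated because all of them trend upward over the same time axis. Time acts as the common driver here, which is one reason a raw correlation between two rising metrics carries so little information on its own.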

  49. Seth: The problem seems to be that you think that “Correlation does not imply causation” is a useless slogan because scientists already know that correlation does not imply causation.

    However, the fact is that many non-scientists also read studies, see correlations and then leap to “causations.” Such non-scientists may be people in decision-making positions, such as legislators and educators. When faced with arguments assuming causation when there is only evidence for causation, I think that reminding people of the slogan is useful.

    For instance: there are numerous statistics that show a correlation between increased helmet usage and increased bicycle accidents. This has led many people to assume that wearing bicycle helmets causes accidents.

    An example can be seen at The authors cite the statistics showing the correlations, mention in passing that there could be other causes for the increase in bike accidents (such as increases in ridership), yet still go on to state multiple times that “evidence shows helmet use increases the accident rate.”

    No, the evidence cited does not show that helmet use increases the accident rate, and yet this is a serious legislative campaign that may seriously affect bike helmet laws. (Whether one agrees with bike helmet laws is irrelevant with respect to the science involved.)

    In a similar vein (this time from Freakonomics): after studies showed that there was a strong correlation between children’s test scores and the number of books in a child’s home, then-Governor Blagojevich announced a plan to send every child in the state one book a month until they reached kindergarten, apparently under the assumption that the physical presence of books in the home would actually cause the children to become smarter. Again, a time when the mistaken assumption that correlation implies causation almost had large legislative effects. (Again, one might agree with sending books to children’s homes, but one should not conclude from the correlation studies that the books might make the kids smarter.)

  50. Sam, when Person Y says “correlation does not equal causation” to Person X, it’s above all Person Y who is made stupider. Person Y is made stupider because he responds with a slogan rather than something substantial. I believe butter is good for us. If you eat margarine, it’s bad because it prevents you from eating butter. In your two examples, let me suggest that in the first case the writers knew perfectly well that correlation doesn’t prove causation and in the second case the Governor’s idea had a lot to do with using those books as campaign literature. I’m sure you act on imperfect inferences every day — do you KNOW the supermarket will be open before you go there to buy something? So it isn’t clear why others shouldn’t do so.

  51. There is a simple phrase for these sayings: Rules of Thumb. Not always true, but most often worth remembering and considering. These “principles”—I consider “principle” an apt word because they provide a reasonable starting point for analysis—do not cease being exceedingly useful simply because they are occasionally wrong.

    After all, “there’s an exception to every rule,” right?

  52. Seth–

    Now hang on a second. Sam just brought up two perfectly legitimate instances where correlation/causation confusion led directly to bad public policy, in bike helmets, and in early childhood education. Your reply, unless I’m totally misreading it, was that neither action was the result of science, just politics: The bike helmet study authors knew they were being misleading, and the governor knew he was doing something worthless but it looked good for the voters.

    On the one hand, sure, that’s the real world, we don’t necessarily have policy based on science. On the other hand, you’re not exactly being a neutral advocate here. In fact, you seem to be advocating this baldly unscientific behavior. But maybe I’m being presumptuous. So let me put it to you very plainly:

    From your perspective as an apolitical scientist, should raw correlation be used as a guide to public policy, as in the two instances Sam Fent described?

  53. I’m sorry, Seth, but you would fail a logic class.

    2 is correct.

    Your counter claim is wrong: you’re committing the classic logical fallacy of confusing predicate and proposition. Zero is a predicate, not a proposition. That is, you cannot talk of the ‘zero’ of something, because if it’s zero then there is no ‘something’ there.

    So zero correlation does not equal zero causation, because zero correlation means that there is no correlation at all! Zero correlation equals zero. So it does not exist, so it cannot even enter into an argument. If there is zero correlation then the right hand side of this equation is simply blank! If there is no correlation then there is nothing to be said about correlation. If two things aren’t correlated, then the question does not even arise as to whether or not they are caused.

    Consider another example: if I have two non-existent apples, and I polish them, what do I have? Two non-existent shiny apples? No, you never had any apples to begin with, because they don’t exist! Thus if there is ‘zero correlation’, that is another way of saying there is a ‘non-existent correlation’. Then a ‘non-existent correlation’ is the same as ‘non-existent causation’. Well, duh! A ‘non-existent correlation’ is also exactly the same as ‘non-existent magic elves’, and the same as a ‘non-existent Santa Claus’. Of course two things that don’t exist are going to be equal. All you’ve proved is that zero equals zero.

    Look, let me spell it out as an example in argument form:

    Argument A:
    Correlation is not the same as causation
    But no correlation is the same as no causation
    Therefore, correlation is exactly the same as causation

    is the same as:

    Argument B:
    Marriage is not the same as Happy Marriage
    But not being Married is the same as not being Happily Married
    Therefore, being married is exactly the same as being happily married.

    Argument B is clearly wrong and loopy, and your Argument A is loopy because it commits the same logical fallacy.

    To use a more common analogy: it’s like you divided by zero.

    As for 3, “The plural of anecdote is not data.”

    It depends: the word data could have two different meanings here, both of which you confused. By data you might mean information, in which case 3 is right. But really data is different from information. But when you recognise the distinction, then your criticism is true but irrelevant.

    The word ‘Data’ comes from Latin, and means ‘to give’. It is what is ‘given’ for free. Data, or what is given, has to be worked on and refined into ‘Facts’. ‘Fact’ comes from the Latin ‘Fac’, meaning ‘to make’ (as in factory and manufacture). In computer science, Data is just a collection of meaningless points; it then has to be refined and processed into Information (or facts). BUT only information (or facts) can form the evidence for an argument. Whereas data is meaningless and valueless, and should never be used as evidence for an argument.

    So when you claim that people can learn things by reasoning from the data, this is flatly wrong…

    The data, ‘the given’ stuff, first needs to be examined to discover if it is of any worth or value, and to say what that worth or value is. Data is value neutral, it can be either good or bad, useful or worthless; it is all still data. When you have discounted the data that is rubbish, and analysed the remaining data according to its worth, then it has become Information. INFORMATION CAN then be used as evidence in an argument to reach a conclusion. DATA should NEVER be used as evidence for anything.

  54. Dan, you ask, “should raw correlation be used as a guide to public policy, as in the two instances Sam Fent described?” Sure, in some cases. For example, the idea that smoking causes lung cancer is/was mostly based on “raw correlations” (if I understand what you mean by that). It would have been a bad idea to wait for better evidence of causation. When making public policy, I think it’s a poor idea to ignore data — which means not ignoring correlations. Should a single correlation of 0.10 be used to guide the spending of a billion dollars? Of course not.

  55. Privileging zero correlation as a reliable indicator of the absence of a causal connection is dangerous because we are often interested in systems with feedback.

    Think about a room with a thermostat. Perhaps there is broken cloud and the room is intermittently warmed by direct sunshine. If the heating is on, the thermostat turns it down when the sun comes out. It zeros out the correlation between room temperature and direct sunshine by creating a cancelling inverse correlation between direct sunshine and heat from the heating system.

    One might argue that the thermostat can only weaken the correlation, there must be some change there to drive the operation of the thermostat. Not so fast. If the sunshine is hitting the thermostat we may well find the room cools down when the sun comes out and shuts down the heating. Feedback can drive the correlation down, down, down, right through zero and into confusingly negative.

    Now think about the kinds of things we argue about. Consider, for example, a mature social democracy with taxes that have, for many years, wobbled about the Laffer Curve tipping point. The political system provides feedback that tends to null out the correlation between tax rates and tax revenue. Does zero correlation provide evidence that they are not causally connected?
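
    The thermostat example can be turned into a toy simulation. Everything here is an assumption made for illustration: the temperatures, the sunshine gain of 3 degrees, the noise level, and the `compensation` parameter, which stands for how strongly the heating backs off when the sun is out (0 means no thermostat, 1 means it cancels the sunshine exactly, values above 1 mean the sun is hitting the thermostat and it overreacts).

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_room(compensation, n_steps=2000, seed=1):
    """Correlation between direct sunshine and room temperature in a toy heated room."""
    rng = random.Random(seed)
    sunshine, temperature = [], []
    for _ in range(n_steps):
        sun = rng.choice((0, 1))                  # broken cloud: sun is in or out
        heater = 5.0 - compensation * 3.0 * sun   # heating backs off when the sun is out
        room = 15.0 + 3.0 * sun + heater + rng.gauss(0.0, 0.5)  # sun always adds 3 degrees
        sunshine.append(sun)
        temperature.append(room)
    return pearson(sunshine, temperature)


if __name__ == "__main__":
    for label, compensation in (("no thermostat", 0.0),
                                ("ideal thermostat", 1.0),
                                ("sun on the thermostat", 1.5)):
        print(label, round(simulate_room(compensation), 3))
```

    In all three runs the sunshine physically warms the room by the same amount. Only the feedback changes, and with it the correlation goes from strongly positive, to roughly zero, to negative.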

  56. Seth, I find it a little bit bizarre that you regard “correlation is not causation” as a slogan intended to make you go away, rather than a core lesson from epistemology encouraging you to think more carefully.

    If I may restate your views, you correctly state that there are four possibilities for the relationship between variables A & B: (1) A causes B, (2) B causes A, (3) A & B are caused by C, and (4) A & B are not related.

    If we discover that A & B are correlated, you view this as reducing the likelihood of (4), and therefore necessarily increasing the likelihood of (1)-(3). This is false. A & B are just as likely to be unlinked as they were before and you’ve improperly updated your priors.

    There are plenty of robust correlations (think super bowl winners and election results, cyclical & counter-cyclical stocks, etc.) between variables that are, in fact, unrelated. Modern social science tools like GSS make it almost trivially easy to hunt for such correlations.

    Suppose someone told you there is a very robust correlation between holding a college degree and adult income level (this is a true statement). Do you therefore conclude that awarding twice the number of college degrees next year will substantially increase the lifetime wealth of the recipients? I would hope not, and anyone who so argued would indeed need to be reminded that “correlation is not causation”.

    The bottom line is that if you want to show causation, you need to show it (cf. Mill’s Methods). The fact that no correlation is evidence for no causation does not imply the converse.

  57. SkepticalCynic, you write: “If we discover that A & B are correlated. . . A & B are just as likely to be unlinked as they were before.” I disagree. I can think of hundreds of cases where the correlation between A and B reflected a linkage between them (linkage = A causes B, B causes A, or both have a common cause). That you can cite a few examples where you claim that A and B are correlated but none of these three cases is true doesn’t make my hundreds of examples go away. Throughout scientific history, people have observed correlations and used them to suggest ideas that were then tested in various ways, such as experiments. Many of these ideas turned out to be true. One example is the idea that smoking causes lung cancer. It began with a correlation. Another example is the idea that maternal lack of folate causes neural tube defects in the fetus. It began with a correlation. A third example is the theory of continental drift. It began with a correlation.

  58. In regards to the Absence of Evidence is Evidence of Absence.
    I ran into this earlier and nearly fell out of my chair.
    While the logic is right, the interpretation is false.
    The logical conclusion is:
    Probability of existence given evidence is greater than the probability of existence given absence of evidence.
    Which of course is TRUE, but this doesn’t suggest that
    Probability of existence given absence of evidence is equal to ZERO.
    So ultimately this proof just says that absence of evidence is evidence of probable absence. Which is like saying “I haven’t seen one, so maybe it doesn’t exist,” which is an unremarkable conclusion.

Comments are closed.