Seth Roberts on Orwell's "preventive stupidity"

Seth Roberts, a professor emeritus of psychology from UC Berkeley and a self-experimenter, discusses three popular sayings about data.
"Absence of evidence is not evidence of absence." Øyhus explains why this is wrong. That such an Orwellian saying is popular in discussions of data suggests there are many ways we push away inconvenient data.

"Correlation does not equal causation." In practice, this is used to mean that correlation is not evidence for causation. At UC Berkeley, a job candidate for a faculty position in psychology said this to me. I said, "Isn't zero correlation evidence against causation?" She looked puzzled.

"The plural of anecdote is not data." How dare you try to learn from stories you are told or what you yourself observe!

Orwell was right. People use these sayings -- especially #1 and #3 -- to push away data that contradicts this or that approved view of the world. Without any data at all, the world would be simpler: We would simply believe what authorities tell us. Data complicates things. These sayings help those who say them ignore data, thus restoring comforting certainty.

Preventive Stupidity Exists

  1. But the plural of anecdote isn’t data. It’s “anecdotes”. How can we trust someone who won’t even check a dictionary?

  2. I’ve always found the correlation one an especially frustrating saying, because it’s almost always used as an excuse to deny uncomfortable conclusions.

    I try to follow such arguments by saying that while correlation does not equal causation, it does jab you in the ribs while shouting “wink wink nudge nudge there’s something interesting going on here.” It never works, though.

  3. Come on. No one actually ever says any of these things.

    Re #1. This “saying” is obviously true in the general case, but just as obviously in many specific cases absence of evidence is an important finding that can be considered evidence in itself. Since it all comes down to cases, what’s the discussion about?

    Re #2. A stupid faculty candidate does not make for a good argument. Obviously the truth of this “saying” depends on the context, how well one trusts the data and its dimensions, whether one believes there may be external factors not modeled in the data, and so on. If the data is sound and reasonably complete, then obviously correlation is necessary to establish causation, but is not sufficient to imply it, and hence zero correlation does imply zero causation. In other words, if P->Q and -Q, then it must be the case that -P.

    Re #3: It’s still true, though. We all learn inductively through accumulation of anecdotes, but anyone who makes a statistical or scientific claim based on anecdotal evidence deserves not to be published.

    1. You hit the nail on the head. Correlation is a necessary condition to show causation, but it is not sufficient.

      If I come upon two vehicles wrecked in the middle of the intersection, that’s a correlation between them — they were both involved in an accident. But it doesn’t necessarily mean that they hit each other. A third vehicle could have hit both of them and driven away in a hit and run (common cause outside the scope of the evidence). Or one of the vehicles may have been damaged at some other time and coincidentally broken down in the intersection either before or after a hit and run accident (distinct causes resulting in similar effects).

      And even if the two vehicles did hit each other, you still don’t know which driver was at fault. Did A cause the accident or did B?

      Whenever I read a mainstream news article publicizing the results of a scientific study, I assume that they’ve reversed the cause and effect. Just as a noise filter to block out the implications that the reporters usually make to sensationalize the results. Sometimes it’s just as logical to conclude that “semicolon cancer reduces vitamin X absorption” as “vitamin X prevents semicolon cancer”

  4. The linked article by Kim Øyhus lists “CO2 from machines causing global warming” alongside “fairies, trolls, and ghosts” in the absence of evidence category. Interesting.

    1. Yes, botono9, it’s easy to get on Seth’s good side by expressing Global Warming Denialist tropes. Nothing is too spurious to be repeated.

      It’s too bad, because Seth does very interesting personal experiments, with very interesting results. On anything that doesn’t involve a direct personal experiment, he embarrasses himself badly.

  5. It should be noted that Seth routinely censors (hand-edits, or simply eliminates) comments that disagree a little too directly with his more extreme opinions.

    Those in this case are so absurd, he must be hard pressed to keep up with the load.

  6. How about strawmen? Does he have any examples of those? Besides all of his own arguments in the article, I mean.

  7. I agree totally that the first one is bunk, and am delighted to see a proof of it.

    As for the 2nd and 3rd one… I’ve seen them used to push away fallacious arguments more often than I’ve seen them used to ignore valid data. I’m not sure I understand what his problem is with them. Particularly since people tend to put too MUCH stock in correlation when causation has not been established, and anecdotal data, not too little.

    1. No, Matt, all three are of a piece. To say “there’s no evidence for X” when in fact no work has been done on X demonstrates nothing. I have no evidence for your biological existence, but that doesn’t mean you don’t exist.

      It’s normal for scientists to hate #1, psychologically, because they hate not knowing. That’s why they became scientists in the first place, to find things out. In the absence of evidence one way or another, you just have to admit you don’t know.

      1. rootboy summed it up pretty well. The statement gets abused a lot by people confusing the terms “evidence” and proof. The statement “absence of proof is not proof of absence” is also true. But absence of evidence is evidence of absence. It may not always be good evidence, but it’s evidence nonetheless.

      2. I disagree. As a scientist, I love (well, I’m not “in love with”) the first rule (which I’ve always heard phrased as “Absence of proof is not proof of absence.”) To me this rule (along with the second) says that an experiment should have a negative and positive control whenever possible. The third rule implies that experiments should be repeatable.

        None of these statements should be used to win an argument, I agree. But a logical argument with a basis in one of these statements, one grounded in facts and data, should certainly be taken into account.

  8. His argument is nothing but incensed hyperbole. Any adage, taken to extremes, breaks down into absurdity.

    I wouldn’t be surprised if he also dislikes “a bird in the hand is worth two in the bush,” because you can have a bush with ten birds in it, but you only have two hands!

  9. All of these statements are correct, when used appropriately. In fact, they are vital to scientific and statistical methodology. The problem seems to be people misrepresenting what they mean, not with the accuracy of the statements themselves.

    “I said, ‘Isn’t zero correlation evidence against causation?’ She looked puzzled.”

    I’m glad I’m not facing these silly “gotcha!” questions in an interview… Correlations only detect linear relationships. Many relationships are non-linear, ie, at low levels one thing happens, at high levels, the opposite happens. These relationships may be causal, but because of their complex nature, they show no linear correlation. A brief, somewhat silly example: can oxygen allow one to continue living? At low levels, no. At moderate levels, yes. At high levels, no again. There is no linear correlation there. Therefore, according to this logic, oxygen does not have a causal relationship to living and breathing. In reality, zero correlation is not evidence against causation at all.

    1. Christovir nailed it: the Pearson Correlation can only summarize LINEAR statistical relationships between two variables. It’s possible there could be a very strong nonlinear relationship (e.g., logarithmic, exponential, S-shaped, U-shaped, sinusoidal) but a correlation coefficient of zero.

      At any rate, whether such a relationship (if any) is CAUSAL is an entirely different question, and one that correlation alone cannot illuminate.
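The point about Pearson correlation and U-shaped relationships can be checked with a few lines of arithmetic. What follows is only a sketch with made-up numbers: `dose` and `outcome` are hypothetical variables, and the outcome is constructed to depend entirely on the dose, so the dependence is total by design even though the linear correlation over the full range comes out at essentially zero.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical "dose" values, symmetric around zero (-5.0 to 5.0).
dose = [x / 10 for x in range(-50, 51)]
# The outcome is fully determined by the dose, but in a U shape.
outcome = [d ** 2 for d in dose]

# Over the full symmetric range the rising and falling halves cancel out.
r_full = pearson(dose, outcome)
# Restricted to dose >= 0 the relationship is monotonic and r is large.
r_half = pearson(dose[50:], outcome[50:])

print(f"r over full range: {r_full:.3f}")
print(f"r over dose >= 0:  {r_half:.3f}")
```

The same data give a correlation near zero or near one depending only on which range of doses is sampled, which is why a zero coefficient by itself says little about whether the two variables are related.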

  10. The proof in #1 is an abuse of logic. Not in that the proof is wrong – it’s an abuse of what logical proof means. It shows that the argument is valid, not true. Absence of evidence *can* be evidence of absence, so the proof is valid. He is using the proof to say that absence of evidence always means an evidence of absence in a snarky way, when you have no real way of knowing because you can’t prove a negative.

    You can’t make a logic proof of what the linked to page says. Even if logic were a complete science you still could not do it.

    1. kcObbq I couldn’t agree more with you.

      Here’s a little illustration:

      Bob walks across a minefield and he gets lucky and nothing happens. Does the absence of explosions mean that there were no land mines? No it only means there were no explosions.

  11. So, we can cure diabetes by outlawing insulin, since they’re correlated, and UFOs exist because there’s lots of anecdotes?

    No.

    Correlation does not prove causation, not on its own. That the opposite may be true, that a lack of correlation proves a lack of causation, does not change that fact. The author may need to review the logical concept of the “contrapositive”.

    As to “plural of anecdote. . ” the moral isn’t that you shouldn’t “learn from stories you were told or what you yourself observed”. It’s that you must check the facts first. This is why I believe in elephants but not the Loch Ness Monster.

  12. Wow, what an anti-science screed. While lack of correlation can indeed show that there is no causation, correlation only shows that two things are happening at the same time. I suffer from ill health, so I have a bit of extra time to read books. So reading books is correlated strongly to ill health for me, but reading books didn’t cause the ill health.

    As for anecdotes, that’s the reason we have science: individual experiences are often inaccurate or downright wrong. You have to hypothesise, experiment, and revise basic assumptions to find out what’s really going on. That’s where real data comes from, not common sense or ideology.

    If you really want to find out what’s going on, inconvenient data is wonderful as it demonstrates flaws in hypotheses. But neither “Correlation does not equal causation” nor “The plural of anecdote is not data” are ways of hiding inconvenient data; in fact they are both ways of revealing faulty assumptions in everyday thinking.

  13. The first two were reiterated to me in my college statistics class… and by a professor who was passionate about the field, too.

    There’s nothing wrong with them if they’re not treated as truisms, right? They’re just a check against people’s desire to see positive evidence in scraps of negative evidence, when they want there to be positive evidence.

  14. While it’s technically true to say that absence of evidence is evidence of absence, what people mean when they say “absence of evidence is not evidence of absence” is actually: “the fact we didn’t find evidence doesn’t mean there is no evidence”. The former statement is just a cooler sounding way to state the obvious.

  15. Seth is technically correct but misses the point behind these.

    1) It would be better stated as “absence of evidence is not proof of absence”. I look for a bug in my program and fail to find it. I might not have looked hard enough, or made a mistake, so I haven’t *proven* that there’s no bug. But yes, it does decrease the likelihood that there is a bug, and after enough searching without finding anything the reasonable conclusion is that there isn’t one. But if all I’ve done is look once and not come up with anything it’s not valid for me to conclude with a lot of certainty that there isn’t one.

    2) I’ve always liked Edward Tufte’s “Correlation is not causation but it sure is a hint.” It’s a necessary but not sufficient condition. If you have strong correlation and a plausible causal mechanism, yes, causation is a reasonable conclusion. But we’re really good at finding correlations that are just random coincidences as well, and we need to be on guard against that.

    3) An anecdote is something that happened once – usually of the flavor “I took substance X and it cured disease Y” (or “gave me disease Z”). This is one data point. It tells you something; its value is not zero. Seth is right that throwing it out would be incorrect. But it just doesn’t tell you very much. Continuing as if you’ve demonstrated a general principle based on one measly little experience is incorrect.

    1. @ rootboy #19

      Agreed generally, except in your definition of anecdote: you say it’s ‘something that has happened once’.

      That is absolutely untrue. An anecdote is a story and in most cases nothing more – it should not be considered a data point unless it can be repeated or was actually observed.

      1. It would also be useful if the anecdote involved some objectively measurable facts about whatever is being reported, ie mass, volume, temperature, time, velocity, etc., etc.
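The bug-hunting example in point 1) of the comment above can be put into numbers with Bayes' rule. This is a rough sketch under stated assumptions, not anyone's actual method: the prior of 0.5, the 30% chance that a single search finds an existing bug, the independence of searches, and the absence of false alarms are all invented for illustration, as is the function name.

```python
def posterior_bug_after_failed_searches(prior, detect_prob, n_searches):
    """P(bug exists | n independent searches all found nothing).

    Assumes a search never reports a bug that is not there.
    """
    # Chance that every search misses the bug, given that the bug exists.
    miss_all = (1.0 - detect_prob) ** n_searches
    # If no bug exists, finding nothing is certain, so its likelihood is 1.
    return prior * miss_all / (prior * miss_all + (1.0 - prior))


prior = 0.5    # assumed belief in a bug before looking
detect = 0.3   # assumed chance one search finds a bug that exists

p0 = posterior_bug_after_failed_searches(prior, detect, 0)
p1 = posterior_bug_after_failed_searches(prior, detect, 1)
p5 = posterior_bug_after_failed_searches(prior, detect, 5)
p20 = posterior_bug_after_failed_searches(prior, detect, 20)

for n, p in ((0, p0), (1, p1), (5, p5), (20, p20)):
    print(f"after {n:2d} empty searches: P(bug) = {p:.4f}")
```

Under these assumptions one empty search lowers the probability of a bug only a little, while twenty of them push it close to zero without ever reaching it. That is the gap between "evidence" and "proof" that several commenters are pointing at.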

  16. I am an agnostic science-y type, so do not take this as doubting the spirit of these arguments, but Kim Øyhus’s explanation of 1. above doesn’t seem quite right. The conditional probability equation assumes that the probability that B is true due to the known evidence A is greater than the probability for the unknown evidence contained in -A, as stated by definition 1. This is not necessarily true. For example in the fields where we’re still working out the details (e.g. quantum theory, etc.), there can be a mountain of data that we don’t yet know about supporting a theory. If this is true for a theory, that would flip the greater than in Definition 1 to a less than, and results in an end statement “the probability of the absence of B such that the absence of A is less than the probability of the absence of B such that A.”

    Also, the men and women and skirts and trousers example seems off too. If you see someone not wearing a skirt, is it more likely a man? Not necessarily, if there are way more women than men, then there can be more women wearing skirts AND more women wearing trousers.
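Two different questions may be getting mixed in the skirt example: whether "no skirt" makes "man" more probable than it was before looking, and whether it makes "man" more probable than "woman". A small worked case separates them. The population below (900 women, 100 men, 30% of women and no men in skirts) is entirely hypothetical and chosen so that women heavily outnumber men, as the comment suggests.

```python
from fractions import Fraction

# Hypothetical population in which women heavily outnumber men.
women, men = 900, 100
skirt_rate_women = Fraction(3, 10)  # assumed share of women wearing skirts
skirt_rate_men = Fraction(0)        # assumed share of men wearing skirts

# Head counts of people NOT wearing a skirt.
no_skirt_women = women * (1 - skirt_rate_women)
no_skirt_men = men * (1 - skirt_rate_men)

# Probability of "man" before and after observing "no skirt".
prior_man = Fraction(men, women + men)
posterior_man = no_skirt_men / (no_skirt_men + no_skirt_women)

print("P(man)            =", prior_man)
print("P(man | no skirt) =", posterior_man)
print("raised by the observation:", posterior_man > prior_man)
print("more likely than a woman: ", posterior_man > Fraction(1, 2))
```

With these numbers the observation raises the probability of "man" from 1/10 to 10/73, so it counts as evidence, yet the person is still far more likely to be a woman. Both the objection about base rates and the claim that the observation shifts the probability hold at the same time.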

  17. Come on, this comment thread is ridiculous. For example, anyone can understand that 1) correlation is necessary to show causation, but not sufficient to show causation, or that 2) conclusions arrived at by sound inductive reasoning can turn out to be wrong, or 3) that correlation is generally meaningful. Sayings like ‘correlation does not equal causation’ don’t convey these nuances because being pithy and short makes them vague.

    Some people really need to take a philosophy class, instead of just citing authority and/or insisting that their interlocutors are wrong.

  18. After reading up a bit more about Seth Roberts’ research, I am more confident that this is in fact just an ornery old crank being ornery. I am really not sure what he is having a rant about — these statements are open to abuse, yes (what isn’t?) — but disregarding them leaves one open to much, much broader intellectual abuses, like the “proof” using this “logic” at the link to show there is no global warming. He’s basically bitching about the scientific method because some people don’t understand it. The method is still sound, and we would be fools to disregard it.

    I teach statistics and psychology at the university level, and use statistics and the scientific method daily to produce peer-reviewed research. If one of my students used the arguments Roberts advocates, that student would not be pleased with their grade.

    1. Christovir, it is precisely Seth’s cranky orneriness that makes him interesting. While usually laughably wrong on anything outside his immediate field, he nonetheless comes up with apparently sound personal experiments that yield original and surprising results. How is cranky orneriness compatible with producing interesting results? The conventional answer is “dishonesty”, but I doubt that in this case.

  19. I think he’s misunderstood almost all the sayings.

    1 – the linked proof does a much better job of getting to the way it’s used in practise

    2 – “Isn’t zero correlation evidence against causation?” Yes, but the saying is used in the opposite direction, that there are many things which are correlated without a direct causal relationship.

    3 – This is generally used as a call for a more structured understanding of the subject. Stories and anecdotes are the beginnings of a path of research (along with gut feelings and 4am epiphanies) but can’t be taken as solid proof of general phenomena

  20. oops @ 22. 3.) Should have indicated that you need to have some idea of a mechanism that causes the correlation.

  21. Posting “correlation =|= causation” in response to everything you read online may not be strictly incorrect, but it does indicate that you haven’t bothered to think about the issue at hand and won’t be adding anything useful to the discussion.

    Antinous: We should publish a list of memes too tired to be used: quotes from Orwell, correlation =/= causation, the plural of anecdote is not data. They’re the geek versions of “God made Adam and Eve not Adam and Steve”

  22. Ironically, suggesting that correlation might be evidence of causation always causes people to pipe up and say “Correlation is not causation.”

  23. “Absence of evidence is not evidence of absence.”

    Well, this is true. Unless, of course, you are willing to entertain the thought that not only does a falling tree not make a sound when there’s no one to hear, but that trees do not fall unless there is someone to see it. Or that there *literally* weren’t any black swans until one popped up due to quantum uncertainty just in time for someone to find them.

    Of course, applying the saying to any “full knowledge” or “full spectrum” situation, which some forms of statistics are up to within reasonable probability and sampling, is simply stupid. It is about logic, not statistics.

    “there are many things which are correlated without a direct causal relationship”

    Like the well-known correlation between consumption of ice cream and people drowning.

    “The plural of anecdote is not data.”

    I suppose this depends on one’s point of view. Technically all is data, but not relevant or reliable. Random noise is data too. Often just not very useful. Unless you are studying random noise. And even then you are trying to find some sort of internal structure, so you filter out the randomness.
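The ice cream and drowning example in the comment above is usually explained by a common cause such as hot weather. The simulation below is a sketch of that explanation only: every coefficient, the temperature range, and the noise levels are invented, and the helper names `pearson` and `residuals` are illustrative. It generates two series that both depend on temperature but not on each other, then compares their raw correlation with the correlation left after the temperature trend is removed from each.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def residuals(ys, xs):
    """Residuals of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
days = 2000

# Made-up model: temperature drives both series; neither affects the other.
temperature = [rng.uniform(10, 35) for _ in range(days)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 2) for t in temperature]

r_raw = pearson(ice_cream, drownings)
r_controlled = pearson(residuals(ice_cream, temperature),
                       residuals(drownings, temperature))

print(f"raw correlation:              {r_raw:.2f}")
print(f"after removing temperature:   {r_controlled:.2f}")
```

In this toy setup the raw correlation is strong while the temperature-adjusted one is close to zero, which is one way a correlation can be real and still point at a third variable rather than a direct causal link.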

  24. “it does indicate that you haven’t bothered to think about the issue at hand and won’t be adding anything useful to the discussion.”

    They’re semantic placeholders for actual thought or argument.

  25. Having read the link to Kim Øyhus, I am now convinced that since all herring are fish, and all fish live in the sea, if I buy kippers it will not rain on monday.

  26. I like Seth and his writing but this article is poor.

    Equating anecdotes with personal research is wrong. We shouldn’t trust anecdotes because they are often biased towards what’s memorable. Good personal research would eliminate bias and so be useful for an individual.

  27. This kinda reminds me of that old National Lampoon headline:
    “Stupidity In Crisis: What’s Going On Here? A Full Report”

  28. He’s correct about absence of evidence. The second one is sort of correct. Very often things have a correlation and you can’t just use the correlation != causation mantra unless you have another hypothesis to explain the correlation. The last one is the one I disagree with the most. It is false in the sense that anecdotes are evidence. But they are incredibly weak evidence. First, they are not randomly selected. Second, they suffer from massive observer bias and observer recall problems. The reason we rely on statistics rather than anecdotes is to deal with these problems. Thus for example, it is fine for a doctor to hear about an anecdotal cure for some illness. But they better then do a real statistical test to verify the claim. Simply hearing more anecdotes doesn’t cut it.

  29. I was going to chime in here but reading the comments over it’s clear this cacophony doesn’t need one more ding dong. Well done, all.

  30. “The plural of anecdote is not data” is a way of saying you might not be getting a good cross-section of experience just by wandering around and living your life.

    You see it in comments here all the time: “Well, I don’t do it, and none of my friends do it either, so you’re crazy when you say a lot of people do it!” The trouble is that “you and your friends” are, sort of by definition, not a representative sample. Y’all probably share certain interests, your ages may be close, you’re probably geographically clustered, you may be correlated in income or occupation, etc. A good data set will either be large enough to get a truly representative sample or analytically compensate for the bias.

    So, yes, anecdotes and data are fundamentally different. You may be able to extract data from anecdotes — linguists do so all the time — but you have to do it carefully. The reverse is also true.

  31. Seems to me that this untypably pseudonamed person is failing to understand how “absence of evidence” is even different from “evidence of absence”.

    If I never look up and declare there are no stars I am a fool because I have no evidence of their absence, I have only an absence of evidence.

    If I look up from time to time and never see any stars I have evidence of their absence. Repeated observations that failed to find any stars, but could be expected to. This is evidence of absence. It might be weak due to my ignorance (maybe I only looked up during the day because I knew no better and only half a dozen times), or strong (I looked up from various mountaintops on clear nights every night for 50 years) but either way it is evidence.

    Each observation is evidence. Each observation is evidence of absence, evidence of presence, or inconclusive. The only way to have an absence of evidence is to not observe at all or get only inconclusive observations.

    If you redefine “absence of evidence” to mean “failure to find evidence even when you do look for it”, yes, the old saw becomes false. It’s pretty easy to turn sense into nonsense that way.

    Similarly the anecdotes vs data statement doesn’t say anecdotes are useless it says carefully collected data is better. If I say the median family income in the US is 50k based on professionally collected government data and you say that can’t be true because both you and I and some guy you met Tuesday make more I won the argument. If you went out and did a careful survey and came to a different result we could discuss which data was better.

    1. Most people seem to agree he got at least #1 right, but I’m afraid not.

      “Absence of evidence is not evidence of absence” means that if you have never collected data about something, then you can’t use that as evidence it doesn’t exist. Absence of evidence for life existing in distant parts of the universe isn’t evidence that it doesn’t exist. We just don’t know.

      It doesn’t mean that if you *have* collected data and don’t find anything, you can’t use that as evidence against it. You can. Every example Øyhus gives is of this kind, WMD in Iraq in particular – not global warming though, obviously.

  32. This is my first comment; I’ve been reading a while, but never felt like saying anything.

    1st this article is beautiful. Mainly because a lot of intelligent people let the fact that they do know something persuade them that they can’t be wrong. Thus when someone says something they do not have any knowledge of or happens to threaten to disrupt their view, they fall back on these and similar phrases.

    2nd I see a lot of comments that only prove the point, and most of them are arguing against it. If you can not boil down your point into a simple cohesive statement that even a mediocre 6th grader could understand you are wasting bandwidth and only trying to make people think you are smart. It shows you lack a workable grasp of a subject, and are covering up for it with regurgitated book knowledge.

    I applaud Jeff for boiling down the issue and speaking clearly and showing a working grasp of the subjects.

  33. “1st this article is beautiful. Mainly because a lot of intelligent people let the fact that they do know something persuade them that they can’t be wrong.”

    The second part of this sentence is certainly true – and I believe that it is exactly why Roberts is so confident yet so incorrect.

    “If you can not boil down your point into a simple cohesive statement that even a mediocre 6th grader could understand you are wasting bandwidth and only trying to make people think you are smart.”

    Unfortunately, some things are more complicated than a sixth grade education can provide. That is why we go to school beyond the sixth grade. I agree concepts should be expressed as simply as possible, but after a point, simpler = wrong.

  34. The point of the statement “The plural of anecdote is not data” is that statistically significant trends generally cannot be found in the experience of a single person. Not to mention the problem of confirmation bias. It’s easy to remember the times when X follows Y, but harder to remember the times when X doesn’t follow Y.

  35. Ok- It’s great to try to dispel ignorance, but if you follow the links- this guy tries to group global warming with alien crop circles, and debunk them in the same breath as “ignorant” perspectives. This kind of articulate smoke and mirrors is just dangerous ignorance of another kind. The evidence is there for man-made climate change- it’s not open for debate any more than the fact that clouds are made from water vapour… I thought BB would be a little more careful in pushing the agenda of ignorant deniers like this.

  36. That was one of the most vacant articles I’ve read in a long time, he seems to deliberately warp what each of the sayings actually means in common usage, applying a special definition to each which he can then demolish and feel smug about. Everyone’s already done a good enough job of taking the three points apart which is a relief, must say I did laugh my balls off at the inclusion of CO2/global warming in the linked article on point one though.

  37. It is true that zero correlation is evidence against causation. However, it does not follow from this premise that nonzero correlation is evidence for causation. A implies B does not mean that not A implies not B.

    The plural of anecdote is NOT data. Data is rigorous. Anecdote is random. But in between lies observation, semistructured and containing hints about how to obtain data.

    Kim Øyhus is wrong as he assumes that absence of evidence is the same as nonexistence of evidence. This ignores the possibility that evidence may not have been looked for yet, or may not have been looked for in the proper way.

    When one is trying to prevent stupidity, one should avoid being stupid oneself.

  38. “Absence of evidence is not evidence of absence.”

    A proof done with conditional probability, in modal logic, which determines whether something is likely – and invalidates the particular phrasing of the truism (is/is not, versus might/might not in modal logic), while ignoring the phenomenon the phrase was coined for and which it actually addresses: Simply because one has not demonstrated successfully a proposition does not mean the proposition is necessarily false – and that’s simply not a statement of modal logic. Absence of evidence for black swans in a particular locality is not evidence of the absence of black swans in the past, future, or other localities.

    The phrase ought to be “absence of evidence is not proof of absence”, but the coined phrase sticks in the human mind.

    “Correlation does not equal causation.”

    “Zero correlation is evidence against causation”

    No. Zero correlation is an absence of evidence. Simply because one has not demonstrated successfully a proposition does not mean the proposition is necessarily false or even likely to be false. It is solely a failure to demonstrate. The methodology could be at fault. The experiment could be an anomaly. There could be a hidden variable missed by the experimenter. The experimenter is tempted to take a lack of black swans to mean that all swans are white, but it does not demonstrate as true or even likely that all swans are white.

    “The plural of anecdote is not data.”

    In science, this is true. Anecdotes are stories from other people (who have hidden variables called “motivations” as well as the faults of one’s own self) or from one’s self (who have hidden variables called “fallible senses, models, and possibly faulty reasoning processes”).

    An anecdote — or many — might be useful for the formation of an hypothesis, but then one designs an appropriate experiment with controls and gathers data to test that hypothesis. Unless you’re testing an hypothesis about a quality of “anecdotes”, they are not data.

    Last note:

    “Zero correlation is evidence against causation”

    Let’s say that someone is accused of a murder. He cannot account for his whereabouts on the night of the murder. No physical evidence exists to implicate him – the body and surroundings are perfectly clean as far as forensics can determine at the present time. He has zero correlation for being away from the murder, zero correlation with not being present for the murder, zero correlation for not being the murderer.

    This is not evidence against him being not-the-murderer (evidence for him being the murderer). This is not evidence against him being not-present-for-the-murder (evidence for him being present for the murder). It’s not evidence for him being away from the murder (evidence for him being present).

    It is also not evidence for him being absent from the murder scene. Not evidence for him being the murderer.

    It is merely a failure on the part of the police, the district attorney, the forensics, the court to demonstrate /anything/ positive about the man in question at that murder location in that time period.

    He /could/ be the murderer, he /could/ be not-the-murderer.

    Even if evidence of someone /else’s/ presence were to be found in the murder scene, /that/ is not (necessarily) evidence of them being the murderer. Even if they have blood spatters on them, have their fingerprints on the knife, footprints in the carpet with dirt from the garden and footprints in the dirt in the garden. Even if they’re not the resident of the house. Even if they had a grudge. Even if they swore a blood oath. Even if a neighbor points the finger at them. Even if the police and district attorney demonstrate persuasively means, motive and opportunity. Even if the defense lawyer is incredibly inept and/or seems to be relying entirely on theatrics.

    People get framed for crimes — sometimes by the criminal, sometimes by the police/DA who are often running under modal logic (“Most likely suspect”!).

    One has to ask one’s self “Is there no other reasonable explanation?”. Has it been demonstrated not merely upon a preponderance of the evidence, but /beyond a reasonable doubt/ – ?

    1. “Zero correlation is an absence of evidence.”

      Very nice! Now that is what a proof looks like.

    2. I agree with almost everything you said except on the first point: “Absence of evidence for black swans in a particular locality is not evidence of the absence of black swans in the past, future, or other localities.”

      “Absence” means “to be not present” or “to not exist”. So when we use the word “absence” in its true sense, it will always automatically apply to the present space and time, regardless of future or past. Absence of evidence for black swans is evidence of absence of black swans.

  39. Sorry, that should read “It is also not evidence for him being absent from the murder scene. Not evidence for him being /not/ the murderer.”

    I got lost in the flip-flip-flip of mutually exclusive alternate choices.

  40. I can understand what he’s trying to say with #1 and #2, but his objection to #3 has a serious problem. It’s that there are 2 huge things stacked against our ability to know the truth through simple personal experience: Flawed perceptions and selection bias.

    Before you dismiss flawed perception, go research the reliability of eye-witness testimony and the malleability of memory. We absolutely SUCK at understanding and processing what we experience and constructing accurate models based on them. Really. It’s more likely that any memory you have is more wrong than it is right, no matter how accurate you THINK it is.

    Selection bias forces us to think that what happens to us is somehow common, even if we know it isn’t.

    Our brains are in no way required to get things right, only just right enough to not die (poison plant, man-eating beast, location of cliff etc.). That includes an enormous amount of false information. We just aren’t adapted to create correct assumptions.

    These glib (and occasionally inaccurate) sayings are just shorthand ways of pointing out common errors in the ways we judge evidence. They are only useful if the people using them actually understand what they expand to mean. They only begin to sound Orwellian if repeated like a mantra or a magic spell, devoid of meaning and thought.

  41. Regarding #3… I find it interesting that the only version of that one that I’ve ever heard has the last word as “facts”.

  42. It would be nice if he formulated a cogent argument against these sayings, instead of just providing us with negation.

    -RTM

  43. Having read a little around Seth’s blog I feel that the recent shout-out to the Dunning-Kruger effect applies here.

    Also his use of a single anecdote to justify point 2 (and later claim it as ‘data’ in the comments) seems to contradict his point 3. It may be a pithy little list with some sensible points but it’s not the masterpiece of logic he makes it out to be.

  44. “Correlation does not equal causation” can cut both ways: it can be just as bad to automatically assume that because two things are correlated, one causes the other (for example, “frogs with no legs are deaf”). In both cases, critical thinking is, well, critical to avoid coming to a faulty conclusion.

  45. Statement #4: Psychology is not a hard science, even though psychologists think it is.

    1. I’d say it’s part of a wider doctrinal/generational conflict within psychology. I know neuropsychs who practice hard science, and I know social psychs in the same department who wouldn’t know it if it bit them in the ass.

      1. Yeah, I was probably doing cognitive science a disservice by putting them on the same pile as the Freudians.

  46. I’ve always understood “Correlation does not equal causation” to mean “‘correlation’ is not a synonym of ‘causation’”.

    And changing track slightly, ponder, for a moment, the statement “miles do not equal kilometres”. “But doesn’t zero miles equal zero kilometres?”.

    (Yes, I know this is not directly comparable with the question in the original article, but the distinction is left as an exercise for the reader :) )

  47. Seth’s blog makes me want to cry while my head explodes…this is why many scientists giggle a little when psychologists call their field ‘science’ (fMRI studies aside…although even those need better resolution to be convincing IMHO).

    OK, in the interest of time I’ll just go over point #3 (although points 1 and 2 are equally wrong):

    Data does not just refer to a series of points, but rather to the entire group as a whole. Furthermore, data sets are only useful when they are collected in an unbiased and representative manner.

    If you want to talk about single people or events, go ahead and use your anecdotes. However, if you want to talk about generalities, anecdotes are worse than useless. Without a set of representative data, you can’t use statistics, and you have no idea whether your single or multiple points are even remotely reflective of reality as a whole.

  48. So many people here seem to have such a difficult time grasping the idea that just because there is evidence of A does not mean A is true with absolute metaphysical certainty. I think that’s why you’re having such a hard time wrapping your head around Øyhus’s proof.

    Yes, you are correct that when nobody had seen black swans yet, that doesn’t mean they didn’t exist. But it did mean that there was strong evidence they didn’t exist – such as the fact that nobody knew of any evidence they did exist! Just like the fact that there has never been good evidence that unicorns exist does serve as good evidence that they do not.

    It doesn’t mean that unicorns absolutely cannot possibly exist, it just means it’s unlikely given the evidence.

  49. Re #2: Zero correlation might be evidence AGAINST causation, but the converse – “correlation is evidence FOR causation” – doesn’t follow, does it?

    To think so would be a basic logical failure on Mark’s part, because “If not-A then B” DOES NOT IMPLY “If A then B”. Maybe she was confused because you were asking weird questions?

  50. Crap: I meant that “If not-A then not-B” DOES NOT IMPLY “If A then B”. Logic ain’t all that, anyway.

  51. Wow, what a silly, misinformed, anti-science, global-warming-does-not-exist screed.

    The “Absence of evidence is not evidence of absence” argument linked is just being used to try to show that CO2 does not cause global warming, but does not succeed in this. (The Bayesian inference discussion, on the other hand, is justified.)

    The “Correlation does not equal causation” ‘gotcha’ question is like a 12-year-old thinking they have found a fatal flaw in logic. For the reasons outlined many, many, many times above, the fact that absence of correlation may imply absence of causation does not even remotely suggest that correlation implies causation. Someone failed Logic 101 and forgot how to make the converse of a statement.

    And the suggestion that the plural of anecdote might actually be evidence is the same belief that leads millions to believe that homeopathy is true because they once heard that someone’s aunt’s arthritis got cured by a dilution of snake’s tongue. Just today there was a good NY Times article on allergies, showing very strong scientific evidence that the majority of people who believe they have food allergies don’t actually have them. But if you click on over to the comments, what do you see? “This article is sh*t. I really do have allergies! What do these “scientists” know??”

  52. Øyhus’s stance is especially tiring. While her proof is correct, and in most sciences and law a distinction is made between “evidence” and “proof” (conclusiveness being the difference), “evidence” is also commonly used, in common parlance but also professionally, with the sense “proof”, as is demonstrated by the common phrase “conclusive evidence”.

    She then proceeds to conclude that “faith is bad”. Even if, as she would, she maintains that “bad” is not here intended as a moral judgement but rather is shorthand for “objectively detrimental”, she is overapplying logic to human society. At best, she could conclude that faith is not logical, which is the point of including the obligatory “Data” character in scifi. But “faith is illogical” is tantamount to saying “faith is faith”.

    Among the faulty suppositions resulting from the adage “absence of evidence is not evidence of absence” she then lists “CO2 from machines causing global warming”. Now her English here is somewhat confusing, but unless she is referring to a conspiracy theory claiming there’s machines somewhere with the specific aim of causing global warming, she is here taking issue with anthropogenic global warming. It is, however, no article of faith that “machines” burning fossil fuels increase the overall CO2 in the atmosphere, nor that an increase in atmospheric CO2 leads to warming of the planet. This is basic physics, while the actual process has been documented with overwhelming “evidence”, conclusive by induction (would have been deduction if climate hadn’t been such a complex system). It seems the author falls prey to the faith she so despises.

  53. Wait- I’m confused. Does this mean the correlation between the decline of pirates and the increase of global warming does or doesn’t prove that global warming is caused by the decline of pirates? I have to redoublethink this some more!

  54. These are all warnings against (at least) confirmation bias.

    They also have nothing to do with some global power conspiracy.

    In fact, these rules can help dispel crappy Fox News-style “statistics”.

    Orwellian? Really? Why? Because it uses words?

  55. Mistaking correlation for causation is actually a cause of a great deal of human misery. Two examples in the blog comments were racism (“people of ethnicity X commit more crimes than people of ethnicity Y so having a certain colour of skin must make you bad by nature”) and Cargo Cults.

    Suppose I said that “You should wear a hat when you go out if it’s very cold” is a bad thing to say because almost any time I said that to another reasonable adult it would be condescending and a waste of breath. That would be true, but the statement is still correct. Just because “Correlation does not equal causation” might not often be the most useful thing to actually say, the sentiment is extremely correct and the world would be a better place if everyone understood it.

    And frankly, arguing that absence of evidence *is* evidence of absence is offering another weapon to the Birther movement.

    People who are saying these statements are not useful have forgotten how ludicrously stupid the majority of arguments are.

  56. “The plural of anecdote is not data” is a toxic meme, and a beautiful example of unthinking pseudo-scientific dogmatism.

    I regard it as the triumphant cry of the know-nothing – “cite needed!” – when faced with a report from the field that contradicts a cherished shibboleth. We need not confirm or deny this report, it’s just an anecdote! No work (or thought) required!

    Logically it’s nothing more than sly ad-hominem – what Stephen Hawking says he observed or calculated is purest data, but what you say is crap, because your observations are crap, because you are crap. The value of what is said is being determined by who says it, or how it was said, without resort to actual reasoning or any search for evidence.

    In Real Science [tm] we can’t simply throw away reports or observations without investigation because we don’t like them, or we don’t like the person who said them. We gather reports and observations to guide planning of experiments and inform allocation of resources for further investigation, and we call these collected reports “data” — regardless of how good or bad the quality of the data is.

    Interestingly enough, the meme is apparently an inversion of a famous (and insightful) quotation! Here’s some data to support my contention:

    “I said ‘The plural of anecdote is data’ some time in the 1969-70 academic year while teaching a graduate seminar at Stanford. The occasion was a student’s dismissal of a simple factual statement–by another student or me–as a mere anecdote. The quotation was my rejoinder. Since then I have missed few opportunities to quote myself. The only appearance in print that I can remember is Nelson Polsby’s accurate quotation and attribution in an article in PS: Political Science and Politics in 1993; I believe it was in the first issue of the year.” (Raymond Wolfinger, 2003)

    “What is interesting about this saying is that it seems to have morphed into its opposite — Data is not the plural of anecdote — in some people’s minds.” (Fred Shapiro, 2004)

    Raymond Wolfinger’s brilliant aphorism “the plural of anecdote is data” never inspired a better or more skilled researcher. (Nelson W. Polsby PS, Vol. 17, No. 4. (Autumn, 1984), pp. 778-781. Pg. 779)

  57. Øyhus proves something, but I’m not sure what; his layman’s explanation doesn’t fit anything in the proof. In the “less proofy explanation”, he says:

    “A skirt is evidence of a woman, because
    there are more women than men wearing skirts.”

    If we take the proposition W as “the person is a woman” and S as “the person is wearing a skirt”, Øyhus must be saying:
    S is evidence of W when P(W|S) > P(~W|S)
    But upstairs in the definitions, we have
    “A is evidence of B when P(B|A) > P(B|~A)”. Taking the form from the definition back to the example [ P(W|S) > P(W|~S) ], that’s saying that wearing a skirt is evidence of being a woman because there are more women wearing skirts than women not wearing skirts.

    Suppose I’m at work, and the people I might see are (with equal probability) my one female co-worker who wears a skirt all the time, and nine male co-workers who never wear skirts except for two idiopathic Scotsmen who wear skirts all of the time. According to the form of “evidence” from the definition, if I see a skirt it’s evidence of the wearer being a woman because most of the women wear skirts (1 > 0), but although knowing the person is wearing a skirt makes it more likely that it’s the woman (33%) compared to the likelihood of a random person being the woman (10%), it’s a strange kind of evidence that the wearer is a woman, because it’s more likely that the wearer is among the Scots (66%).
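
The workplace example in comment 57 can be checked by counting. The Python sketch below is only an illustration, not anything from Øyhus or the commenter: the `coworkers` list, the `prob` helper and the predicate names are hypothetical, and it assumes (as the comment states) that all ten co-workers are equally likely to be seen. It computes the quantities the comment mentions and evaluates both readings of "evidence": the quoted definition, P(B|A) > P(B|~A), and the "more women than men wearing skirts" reading, P(W|S) > P(~W|S).

```python
from fractions import Fraction

# Hypothetical encoding of the workplace in comment 57:
# one (is_woman, wears_skirt) pair per equally likely co-worker.
coworkers = (
    [(True, True)]           # one woman, always in a skirt
    + [(False, True)] * 2    # two Scotsmen, always in skirts
    + [(False, False)] * 7   # seven other men, never in skirts
)


def is_woman(person):
    return person[0]


def is_man(person):
    return not person[0]


def wears_skirt(person):
    return person[1]


def no_skirt(person):
    return not person[1]


def prob(event, given=None):
    """P(event | given) under a uniform distribution over coworkers.

    Returns None if nobody satisfies the condition.
    """
    pool = [p for p in coworkers if given is None or given(p)]
    if not pool:
        return None
    return Fraction(sum(1 for p in pool if event(p)), len(pool))


p_w = prob(is_woman)                        # P(W)
p_w_given_s = prob(is_woman, wears_skirt)   # P(W|S)
p_w_given_not_s = prob(is_woman, no_skirt)  # P(W|~S)
p_m_given_s = prob(is_man, wears_skirt)     # P(~W|S)

# Quoted definition: A is evidence of B when P(B|A) > P(B|~A).
evidence_by_definition = p_w_given_s > p_w_given_not_s

# "Less proofy" reading: the skirt wearer is more likely a woman than a man.
wearer_probably_woman = p_w_given_s > p_m_given_s

print("P(W)    =", p_w)
print("P(W|S)  =", p_w_given_s)
print("P(W|~S) =", p_w_given_not_s)
print("P(~W|S) =", p_m_given_s)
print("evidence by the quoted definition:", evidence_by_definition)
print("wearer more likely a woman than a man:", wearer_probably_woman)
```

Under these assumptions the two readings come apart: the skirt raises the probability of "woman" from 1/10 to 1/3, so it counts as evidence by the quoted definition, yet the wearer is still twice as likely to be one of the Scotsmen.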

Comments are closed.