Missing Data in Clinical Trials: FDA Officials Refuse to Set Limits

People who believe in “evidence-based medicine” say that double-blind clinical trials are the best form of evidence. Generally this is said by people who know very little about double-blind clinical trials. One reason they are not always the best form of evidence is that data may be missing. Nowadays more data is missing than in the past:

By [missing data] he [Thomas Marciniak] means participants who withdrew their consent to continue participating in the trial or went “missing” from the dataset and were not followed up to see what happened to them. Marciniak says that this has been getting worse in his 13 years as an FDA drug reviewer and is something that he has repeatedly clashed with his bosses about.

“They [his bosses] appear to believe that they can ignore missing and bad data, not mention them in the labels, and interpret the results just as if there was no missing or bad data,” he says, adding: “I have repeatedly asked them how much missing or bad data would lead them to distrust the results and they have consistently refused to answer that question.”

In one FDA presentation, he charted an increase in missing data in trials set up to measure cardiovascular outcomes.

“I actually plotted out what the missing data rates were in the various trials from 2001 on,” he adds. “It’s virtually an exponential curve.”

Another sort of missing data involves what is measured. In one study of whether a certain drug (losartan) increased cancer, lung cancer wasn’t counted as cancer. In another case, involving Avandia, a diabetes drug, “serious heart problems . . . were not counted in the study’s tally of adverse events.”

Here is a presentation by Marciniak. At one point, he asks the audience, Why should you believe me rather than the drug company (GSK)? His answer: “Neither my job nor (for me) $100,000,000’s are riding on the results.” It’s horrible, but true: Our health care system is almost entirely run by people who make more money (or make the same amount of money for less work) if they exaggerate its value — if they ignore missing data and bad side effects, for example. Why the rest of us put up with this in the face of overwhelming evidence of exaggeration (for example, tonsillectomies) is an interesting question.

Thanks to Alex Chernavsky.

Modern Cargo Cult Science: Evidence-Based Medicine, Science Fiction in China

In a graduation speech, Richard Feynman called certain intellectual endeavors “cargo cult science,” meaning they had the trappings of science but not the substance. One thing he criticized was rat psychology. He was wrong about that. Sure, as Feynman complained, lots of rat psychology experiments have led nowhere, just as lots of books aren’t good. But you need to publish lots of bad books to support the infrastructure necessary to publish a few good ones. The same is true of rat psychology experiments. A few are very good. The bad make possible the good. Rat psychology experiments, especially those by Israel Ramirez and Anthony Sclafani, led me to a new theory of weight control, which led me to the Shangri-La Diet.

Cargo cult science does exist. The most important modern example is evidence-based medicine. Notice how ritualistic it is and how little progress medicine has made since it became popular. An evidence-based medicine review of tonsillectomies failed to realize they were worse than voodoo. Voodoo, unlike a tonsillectomy, does not damage your immune system. The evidence-based medicine reviewers appeared not to know that tonsils are part of the immune system. Year after year, the Nobel Prize in Physiology or Medicine tells the world, between the lines of the press release, that once again medical researchers have failed to make progress on any major disease, as the prize is always given for work with little or no practical value. In the 1950s, the polio vaccine was progress; so was figuring out that smoking causes lung cancer (which didn’t get a Nobel Prize). There have been no comparable advances since then. Researchers at top medical schools remain profoundly unaware of what causes heart disease, most cancers, depression, bipolar disorder, obesity, diabetes and so on.

I came across cargo-cult thinking recently in a talk by Neil Gaiman:

I was in China in 2007, at the first party-approved science fiction and fantasy convention in Chinese history. And at one point I took a top official aside and asked him Why? SF had been disapproved of for a long time. What had changed?

It’s simple, he told me. The Chinese were brilliant at making things if other people brought them the plans. But they did not innovate and they did not invent. They did not imagine. So they sent a delegation to the US, to Apple, to Microsoft, to Google, and they asked the people there who were inventing the future about themselves. And they found that all of them had read science fiction when they were boys or girls.

I know about Chinese engineers at Microsoft and Google in Beijing. They want to leave the country. An American friend, who worked at Microsoft, was surprised by the unanimity of their desire to leave. I wasn’t surprised. Why innovate or invent if the government might seize your company? That is the main point of Why Nations Fail. Allowing science fiction in China doesn’t change that.

Thanks to Claire Hsu.

Undisclosed Risks of Common Medical Treatments

Millions of tonsillectomies have been performed, mostly on children. Were any of their parents told that tonsils are part of the immune system (taught in high school biology and known since the 1960s)? A Cochrane Review of tonsillectomies (the “highest standard” in evidence-based medicine) fails to mention that tonsils are part of the immune system. A recent study found tonsillectomies associated with a 50% increase in heart attacks. (I write about tonsillectomies here.)

Are tonsillectomies unusual? Several recent news stories suggest they aren’t: failure to tell patients the full risks of medical treatment may be common.

Assorted Links

  • Unusual fermented foods, such as shio koji (fermented salt, sort of)
  • David Healy talk about problems with evidence-based medicine. Includes an example of Simpson’s paradox in suicide rates (a small numeric sketch of the paradox follows this list).
  • The ten worst mistakes of DSM-5. This is miserably argued. The author has two sorts of criticisms: 1. Narrowing a diagnosis (e.g., autism): People who need treatment won’t get it! 2. Widening a diagnosis (e.g., depression) or adding a new one (many examples): This will cause fads and over-medication! It isn’t clear how to balance the two goals (helping people get treatment, avoiding fads and over-medication), nor why the various changes being criticized will produce more bad than good. Allen Frances, the author, chaired the task force in charge of DSM-IV. He could have written: “When we wrote DSM-IV, we made several mistakes . . . . The committee behind DSM-5 has not learned from our mistakes. . . .” That would have been more convincing. That the chair of the committee behind DSM-IV, in spite of feeling strongly about it, cannot persuasively criticize DSM-5 speaks volumes.
  • The Lying Dutchman. “Very few social psychologists make stuff up, but he was working in a discipline where cavalier use of data was common. This is perhaps the main finding of the three Dutch academic committees which investigated his fraud. The committees found many bad practices: researchers who keep rerunning an experiment until they get the right result, who omit inconvenient data, misunderstand statistics, don’t share their data, and so on.”
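
Here is the promised sketch of Simpson’s paradox in suicide rates, from the Healy item above. It is a minimal illustration with made-up numbers (none of them from Healy’s talk): within each severity stratum the drug group has the lower suicide rate, yet the aggregate comparison reverses, because the sickest patients, who have the highest base rate, are mostly given the drug.

```python
# Made-up numbers illustrating Simpson's paradox in suicide rates.
# (events, n) per group; none of these figures come from Healy's talk.
strata = {
    "mild":   {"drug": (1, 100),  "placebo": (8, 400)},
    "severe": {"drug": (36, 400), "placebo": (10, 100)},
}

for name, g in strata.items():
    d_e, d_n = g["drug"]
    p_e, p_n = g["placebo"]
    print(f"{name:>6}: drug {d_e/d_n:.1%} vs placebo {p_e/p_n:.1%}")

# Aggregated, the direction flips: the severely ill (higher base rate)
# mostly got the drug, so the drug's overall rate looks worse.
d_e = sum(g["drug"][0] for g in strata.values())
d_n = sum(g["drug"][1] for g in strata.values())
p_e = sum(g["placebo"][0] for g in strata.values())
p_n = sum(g["placebo"][1] for g in strata.values())
print(f" total: drug {d_e/d_n:.1%} vs placebo {p_e/p_n:.1%}")
```

Stratified, the drug looks protective (1.0% vs 2.0% in mild patients, 9.0% vs 10.0% in severe); aggregated, it looks harmful (7.4% vs 3.6%). Same data, opposite conclusions.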

Few Doctors Understand Statistics?

A few days ago I wrote about a study that suggested that people who’d had bariatric surgery were at much higher risk of liver poisoning from acetaminophen than everyone else. I learned about the study from an article by Erin Allday in the San Francisco Chronicle. The article included this:

At this time, there is no reason for bariatric surgery patients to be alarmed, and they should continue using acetaminophen if that’s their preferred pain medication or their doctor has prescribed it.

This was nonsense. The evidence for a correlation between bariatric surgery and risk of acetaminophen poisoning was very strong. Liver poisoning is very serious. Anyone who’s had bariatric surgery should reduce their acetaminophen intake.

Who had told Allday this nonsense? The article attributed it to “the researchers” and “weight-loss surgeons”. I wrote Allday to ask.

She replied that everyone she’d spoken to for the article had told her that people with bariatric surgery shouldn’t be alarmed. She did not understand why I considered the statement (“no need for alarm”) puzzling. I replied:

The statement is puzzling because it is absurd. The evidence that acetaminophen is linked to liver damage in people with bariatric surgery is very strong. Perhaps the people you spoke to didn’t understand that. The size of the sample (“small”) is irrelevant. Statisticians have worked hard to be able to measure the strength of the evidence independent of sample size. In this case, their work reveals that the evidence is very strong.

If the experts you spoke to (a) didn’t understand statistics and (b) were being cautious, that would be forgivable. That’s not the case here. They (a) don’t understand statistics and (b) are being reckless. With other people’s health. It’s fascinating, and very disturbing, that all the experts you spoke to were like this.

I have no reason to think that the people Allday talked to were more ignorant than typical doctors. I expect researchers to be better at statistics than average doctors. One possible explanation of what Allday was told is that most doctors, given a test of basic statistical concepts, would flunk. Not only do they fail to understand statistics, they don’t understand that they don’t understand. Another possible explanation is that most doctors have a strong “doctors do everything right” bias, even when it endangers patients. Either way, bad news.
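
To see why a small sample can still give strong evidence, here is a minimal sketch with hypothetical numbers chosen only for illustration (they are not the study’s data): suppose about 1% of the population has had bariatric surgery, but 6 of 50 consecutive acetaminophen liver-failure cases had. An exact binomial test measures how surprising that would be if surgery were unrelated, and the calculation already accounts for the sample size.

```python
from scipy.stats import binomtest

# Hypothetical numbers, for illustration only (not the study's data):
# assume ~1% of the population has had bariatric surgery, while 6 of
# 50 acetaminophen-related liver-failure cases had.
cases_with_surgery = 6
total_cases = 50
population_prevalence = 0.01

result = binomtest(cases_with_surgery, total_cases,
                   population_prevalence, alternative="greater")
print(f"p = {result.pvalue:.2g}")  # tiny despite n = 50
```

With these numbers the p value is on the order of 1 in 100,000. “The sample was small” is not by itself a reason to dismiss such evidence.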

Doctor Logic: “Acne is Caused by Bacteria”

Presumably Dr. Jenny Kim is a good dermatologist because the author of this NPR piece chose to quote her:

UCLA dermatologist Dr. Jenny Kim says many people don’t realize it’s bacteria that cause acne. “Some people say your face is dirty, you need to clean it more, scrub more, don’t eat chocolate, things like that. But really, it’s caused by bacteria and the oil inside the pore allows the bacteria to overpopulate,” Kim says.

If I were to ask Dr. Kim how she knows that acne is “caused by bacteria,” I think she’d say, “because when you kill the bacteria [with antibiotics], the acne goes away.” Suppose I then asked: “Is there evidence that the bacteria of people who get acne differ from the bacteria of people who don’t get acne (before the acne)?” I assume Dr. Kim would answer: “I don’t know.”

There is no such evidence, I’m sure. It is quite plausible that the bacteria of the two groups (with and without acne) are exactly the same, at least before acne. If it turned out, upon investigation, that the bacteria of people who get acne are the same as the bacteria of people who don’t, that would make it much harder to say that acne is caused by bacteria. As far as I can tell, Dr. Kim and apparently all influential dermatologists have not thought even this deeply about it. To do so would be seriously inconvenient, because if acne isn’t caused by bacteria, it would be harder to justify prescribing antibiotics, which dermatologists have been doing for decades.

It isn’t just dermatologists. Many doctors believe that H. pylori causes ulcers — wasn’t a Nobel Prize given for discovering that? The evidence for that assertion consisted of: 1. H. pylori was found at ulcers. 2. A doctor swallowed billions of H. pylori and didn’t get an ulcer. (Not a typo.) It was enough that he got indigestion or something. 3. Antibiotics cause ulcers to heal. That was enough for the two doctors who made the H. pylori case and the Nobel Prize committee they convinced. The doctors and the committee failed to know or understand that H. pylori infection is very common and almost no one who is infected gets an ulcer. Psychiatric causal reasoning has been even simpler and even more self-serving. We know that depression — a huge problem — is due to “a chemical imbalance,” according to many psychiatrists, because (a) antidepressants work (not very well) and (b) antidepressants change brain chemistry.

Dr. Kim’s false certainty matters because I’m sure most people with acne don’t know what causes it. I didn’t. Dr. Kim’s false certainty and similar statements from other dermatologists make it harder for them to find out.  I wrote about a woman who figured out what caused her acne. It wasn’t easy or obvious.

Thanks to Bryan Castañeda.

Two Recent Health Care Experiences

A friend and his pregnant wife, who live in Los Angeles and are not poor, recently went in for an ultrasound. (Probability of the ultrasound machine not operating properly and producing more than the stated amounts of energy: unknown, but a recent Stockholm survey found that one-third of the machines malfunctioned.) Part of the office visit was a post-ultrasound session with a genetic counselor. The genetic counselor walked them through illnesses in their family tree and assessed their coming baby as at very low risk for Trisomy 21 (Down syndrome), Trisomy 13 and Trisomy 18.

At the end of their session, they were offered other services they could buy to learn more about possible fetal problems: chorionic villus sampling (CVS) and amniocentesis, as well as a maternal blood test. None was really necessary.

My friend was irked that the CVS and the amniocentesis were called “low risk”. Maybe you know that a large fraction of doctors claim to practice “evidence-based medicine”. You might think this means they pay attention to all evidence. In fact, evidence-based medicine practitioners subscribe to a method of ranking evidence and ignore evidence that is not highly ranked. Most evidence of harm is not highly ranked, so evidence-based medicine practitioners ignore it. This makes every treatment appear less dangerous — misleadingly so. Because the practice of ignoring evidence of harm is widespread (and drug companies routinely understate risk), when a doctor says “low risk,” the truth is closer to “unknown risk”. The combination of (a) understating risk, (b) selling unnecessary stuff whose risk you have understated, and (c) doing this with pregnant women, whose fetuses are especially vulnerable, is highly unattractive.

Also recently, the friend’s toddler had some sort of infection. The toddler had a bit of a fever, but was generally in good spirits, and played with his toys (i.e., was not bed-ridden or in severe distress). After a few days, his wife took the child to their pediatrician to make sure everything was fine.

“Don’t just accept the antibiotics,” my friend told his wife. “Push back a little. See what happens.”

The pediatrician did prescribe antibiotics. When my friend’s wife said she preferred not to give the child antibiotics if it were not really necessary, the doctor (female) said, “You’re right. I actually don’t know if the infection is bacterial or viral.”

Both stories — which obviously reflect common practice — illustrate how the healthcare system is biased toward treatment, including treatments that are unnecessary and dangerous. The good news is that this bias is clearer than ever before.

Why Self-Track? The Possibility of Hard-to-Explain Change

My personal science introduced me to a research method I have never seen used in research articles or described in discussions of scientific method. It might be called wait and see. You measure something repeatedly, day after day, with the hope that at some point it will change dramatically and you will be able to determine why. In other words: 1. Measure something repeatedly, day after day. 2. When you notice an outlier, test possible explanations. In most science, random (= unplanned) variation is bad. In an experiment, for example, it makes the effects of the treatment harder to see. Here it is good.
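
Here is a minimal sketch of the wait-and-see bookkeeping, assuming a daily log of a single number; the 30-day window and the cutoff of 4 robust standard deviations are arbitrary choices of mine, not part of any standard.

```python
import statistics

def flag_outliers(daily_values, window=30, threshold=4.0):
    """Flag days that sit far outside the recent range.

    Uses the median and MAD of the preceding `window` days,
    which are robust to earlier outliers."""
    outliers = []
    for i in range(window, len(daily_values)):
        recent = daily_values[i - window:i]
        med = statistics.median(recent)
        mad = statistics.median(abs(x - med) for x in recent) or 1e-9
        score = abs(daily_values[i] - med) / (1.4826 * mad)  # ~ robust z-score
        if score > threshold:
            outliers.append((i, daily_values[i], round(score, 1)))
    return outliers

# Example: a daily pimple count that suddenly improves around day 45.
counts = [8, 9, 7, 8, 10, 9, 8, 7, 9, 8] * 4 + [8, 9, 8, 9, 8, 2, 2, 3]
print(flag_outliers(counts))  # days 45 and 46 stand out: ask what changed
```

The flag is only step one; the discovery comes from asking what was different about the flagged day.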

Here are examples where wait and see paid off for me:

1. Acne and benzoyl peroxide. When I was a graduate student, I started counting the number of pimples on my face every morning. One day the count improved. It was two days after I started using benzoyl peroxide more regularly. Until then, I did not think benzoyl peroxide worked well — I started using it more regularly because I had run out of tetracycline (which turned out not to work).

2. Sleep and breakfast. I changed my breakfast from oatmeal to fruit because a student told me he had lost weight eating foods with high water content (such as fruit). I did not lose weight but my sleep suddenly got worse. I started waking up early every morning instead of half the time. From this I figured out that any breakfast, if eaten early, disturbed my sleep.

3. Sleep and standing (twice). I started to stand a lot to see if it would cause weight loss. It didn’t, but I started to sleep better. Later, I discovered by accident that standing on one leg to exhaustion made me sleep better.

4. Brain function and butter. For years I measured how fast I did arithmetic. One day I was a lot faster than usual. It turned out to be due to butter.

5. Brain function and dental amalgam. My brain function, measured by an arithmetic test, improved over several months. I eventually decided that removal of two mercury-containing fillings was the likely cause.

6. Blood sugar and walking. My fasting blood sugar used to be higher than I would like — in the 90s (mg/dL). (Optimal is the low 80s.) Even worse, it seemed to be increasing. (Above 100 is “pre-diabetic.”) One day I discovered it was much lower than expected (in the 80s). The previous day I had walked for an hour, which was unusual. I determined it was indeed cause and effect: if I walked an hour per day, my fasting blood sugar was much better.

This method and examples emphasize the point that different scientific methods are good at different things and we need all of them (in contrast to evidence-based medicine advocates who say some types of evidence are “better” than other types — implying one-dimensional evaluation). One thing we want to do is test cause-effect ideas (X causes Y). This method doesn’t do that at all. Experiments do that well, surveys are better than nothing. Another thing we want to do is assess the generality of our cause-effect ideas. This method doesn’t do that at all. Surveys do that well (it is much easier to survey a wide range of people than do an experiment with a wide range of people), multi-person experiments are better than nothing. A third thing we want to do is come up with cause-effect ideas worth testing. Most experiments are a poor way to do this, surveys are better than nothing. This method is especially good for that.

The possibility of such discoveries is a good reason to self-track. Professional scientists almost never use this method. But you can.

“How Ignorant Doctors Kill Patients”

I have already linked to this 2004 article (“How Ignorant Doctors Kill Patients”) by Russell Blaylock, a neurosurgeon, but after rereading it I think it deserves a second link and an extended quotation.

I recently spoke to a large group concerning the harmful effects of glutamate, explaining it is now known that glutamate, as added to foods, significantly accelerates the growth and spread of cancers. I [rhetorically] asked the crowd when was the last time an oncologist told his or her patient to avoid MSG or foods high in glutamate. The answer, I said, was never.

After the talk, a crowd gathered to ask more questions. Suddenly I was interrupted by a young woman who identified herself as a radiation oncologist. She angrily stated, “I really took offense to your comment about oncologists not telling their patients about glutamate.”

I turned to her and asked, “Well, do you tell your patients to avoid glutamate?” She looked puzzled and said, “No one told us to.” I asked her who this person or persons were whose job it was to provide her with this information. I then reminded her that I obtained this information from her oncology journals. Did she not read her own journals?

Yet, this is the attitude of the modern doctor. An elitist group is in charge of disseminating all the information physicians are to know. If they do not tell them, then, in their way of thinking, the information was of no value.

The incentive structure of modern medicine in action. If you do harm, you are not punished — thus the high error rate. If you do good, you are not rewarded — so why bother to think (“no one told us”)? The similarity to pre-1980 Chinese communism, where it didn’t matter if you were a good farmer or a bad farmer, is obvious. It is a big step forward that the rest of us can now search the medical literature and see the evidence for ourselves.

Overtreatment in US Health Care

In April there was a conference in Cambridge, Massachusetts, about how to reduce overtreatment in American health care. Attendees were told:

The first randomised study of coronary artery bypass surgery was not carried out until 16 years after the procedure was first developed, a conference on overtreatment in US healthcare was told last week. When the results were published, they “provided no comfort for those doing the surgery,” as it showed no mortality benefit from surgery for stable coronary patients.

One participant said that overtreatment accounts for one-third of US health care spending. As far as I can tell, no one said that “evidence-based medicine” underestimates — in the case of tonsillectomies, almost completely ignores — bad effects of treatments. This failure to anticipate and accurately measure bad effects of treatments makes the overall picture worse. Maybe much worse.

Merck’s Vioxx and the American Death Rate

Ron Unz makes a very good point — that just one awful drug (Vioxx) sold by just one awful drug company (Merck) appears to have caused hundreds of thousands of deaths:

The headline of the short article that ran in the April 19, 2005 edition of USA Today was typical: “USA Records Largest Drop in Annual Deaths in at Least 60 Years.” During that one year, American deaths had fallen by 50,000 despite the growth in both the size and the age of the nation’s population. Government health experts were quoted as being greatly “surprised” and “scratching [their] heads” over this strange anomaly, which was led by a sharp drop in fatal heart attacks. . . .

On April 24, 2005, the New York Times ran another of its long stories about the continuing Vioxx controversy, disclosing that Merck officials had knowingly concealed evidence that their drug greatly increased the risk of heart-related fatalities. . . .

A cursory examination of the most recent 15 years worth of national mortality data provided on the Centers for Disease Control and Prevention website offers some intriguing clues to this mystery. We find the largest rise in American mortality rates occurred in 1999, the year Vioxx was introduced, while the largest drop occurred in 2004, the year it was withdrawn. Vioxx was almost entirely marketed to the elderly, and these substantial changes in national death-rate were completely concentrated within the 65-plus population. The FDA studies had proven that use of Vioxx led to deaths from cardiovascular diseases such as heart attacks and strokes, and these were exactly the factors driving the changes in national mortality rates.

The impact of these shifts was not small. After a decade of remaining roughly constant, the overall American death rate began a substantial decline in 2004, soon falling by approximately 5 percent, despite the continued aging of the population. This drop corresponds to roughly 100,000 fewer deaths per year. The age-adjusted decline in death rates was considerably greater.

This illustrates how Merck company executives got away with mass murder on a scale that the Khmer Rouge would be proud of. It also illustrates why I find “evidence-based medicine” as currently practiced so awful. Evidence-based medicine tells doctors to be evidence snobs. As I showed in my Boing Boing article about tonsillectomies, it causes them to ignore evidence of harm — such as heart attacks and strokes caused by Vioxx — because the first evidence of harm does not come from randomized controlled studies, the only evidence they accept. It delays the detection of monumental tragedies like this one.

Tonsillectomy Confidential

I wrote a piece for Boing Boing about tonsillectomies that has just been posted. It stemmed from a comment on this blog by a woman named Rachael. A doctor said her son should have a tonsillectomy. When Rachael did her own research, however, it seemed to her that the risks outweighed the benefits. I looked further into tonsillectomies and found that the risks were routinely greatly understated, even by advocates of evidence-based medicine.

More: Here is a page on a doctor-run website called MedicineNet that grossly understates the risks of tonsillectomies. Compare their list of possible bad effects to mine.

Assorted Links

  • Top ten excuses for climate scientists behaving badly. For example, “the emails are old” and “the timing is suspicious”.
  • Scientific retractions are increasing. My guess is that retractions are increasing because scientific work has become easier to check. Tools are cheaper, for example.
  • More Dutch scientific misconduct. “Professor Poldermans published more than 600 scientific papers in a wide range of journals, including JAMA and the New England Journal of Medicine.”
  • The next time someone praises “evidence-based medicine”, ask them: What about Accutane? It illustrates how evidence-based medicine encourages dangerous drugs. You can’t make lots of money from cheap, time-tested things that we know to be safe (such as dietary changes), so the drug industry revolves around things that are not time-tested and therefore dangerous — far more dangerous than dietary changes. Evidence-based medicine, which says that certain tests (expensive) are much better than other tests (cheap), provides cover for this. Because the required tests are so expensive, they are allowed to be short.

Thanks to Allan Jackson.

Assorted Links

  • Salem Comes to the National Institutes of Health. Dr. Herbert Needleman is harassed by the lead industry, with the help of two psychology professors.
  • Climate scientists “perpetuating rubbish”.
  • A humorous article in the BMJ that describes evidence-based medicine (EBM) as a religion. “Despite repeated denials by the high priests of EBM that they have founded a new religion, our report provides irrefutable proof that EBM is, indeed, a full-blown religious movement.” The article points out one unquestionable benefit of EBM — that some believers “demand that [the drug] industry divulge all of its secret evidence, instead of publishing only the evidence that favours its products.” Of course, you need not believe in EBM to want that. One of the responses to the article makes two of the criticisms of EBM I make: 1. Where is the evidence that EBM helps? 2. EBM stifles innovation.
  • What really happened to Dominique Strauss-Kahn? Great journalism by Edward Jay Epstein. This piece, like much of Epstein’s work, sheds a very harsh light on American mainstream media. They were made fools of by enemies of Strauss-Kahn. Epstein is a freelance journalist. He uncovered something enormously important that all major media outlets — NY Times, Washington Post, The New Yorker, ABC, NBC, CBS (which includes 60 Minutes), the AP, not to mention French news organizations, all with great resources — missed.

Evidence-Based Medicine Versus Innovation

In this interview, Randall Wolcott, a doctor who does research on biofilms, makes the same point I made about Testing Treatments — that evidence-based medicine, as now practiced, suppresses innovation:

I take it you [meaning the interviewer] are familiar with evidence-based medicine? It’s the increasingly accepted approach for making clinical decisions about how to treat a patient. Basically, doctors are trained to make a decision based on the most current evidence derived from research. But what such thinking boils down to [in practice — theory is different] is that I am supposed to do the same thing that has always been done – to treat my patient in the conventional manner – just because it’s become the most popular approach. However, when it comes to chronic wound biofilms, we are in the midst of a crisis – what has been done and is accepted as the standard treatment doesn’t work and doesn’t meet the needs of the patient.

Thus, evidence-based medicine totally regulates against innovation. Essentially doctors suffer if they step away from mainstream thinking. Sure, there are charlatans out there who are trying to sell us treatments that don’t work, but there are many good therapies that are not used because they are unconventional. It is only by considering new treatment options that we can progress.

Right on. He goes on to say that he is unwilling to do a double-blind clinical trial in which some patients do not receive his new therapy because “we know we’ve got the methods to save most of their limbs” from amputation.

Almost all scientific and intellectual history (and much serious journalism) is about how things begin: how ideas begin and spread, how inventions get invented. If you write about Steve Jobs, for example, that’s your real subject. How things fail to begin — how good ideas are killed off — is at least as important, but much harder to write about. This is why Tyler Cowen’s The Great Stagnation is such an important book. It says nothing about the killing-off processes, but at least it describes the stagnation they have caused. Stagnation should scare us. As Jane Jacobs often said, if it lasts long enough, it causes collapse.

Thanks to Heidi.

Testing Treatments: The Authors Respond

In a previous post I criticized the book Testing Treatments. Two of the authors, Paul Glasziou and Iain Chalmers, have responded. I have replied to their response. They did not respond to the main point of my post, which is that the preferences and values of their book — called evidence-based medicine — hinder innovation.

Sure, care about evidence. Of course. But don’t be an evidence snob.

Testing Treatments: Nine Questions For the Authors

From this comment (thanks, Elizabeth Molin) I learned of a British book called Testing Treatments (pdf), whose second edition has just come out. Its goal is to make readers more sophisticated consumers of medical research, better able to distinguish “good” science from “bad” science. Ben Goldacre, the Bad Science columnist, fulsomely praises it (“I genuinely, truly, cannot recommend this awesome book highly enough for its clarity, depth, and humanity”). He wrote a foreword. The main text is by Imogen Evans (medical journalist), Hazel Thornton (writer), Iain Chalmers (medical researcher), and Paul Glasziou (medical researcher, editor of the Journal of Evidence-Based Medicine).

To me, as I’ve said, medical research is almost entirely bad. Almost all medical researchers accept two remarkable rules: (a) first, let them get sick and (b) no cheap remedies. These rules severely limit what is studied. In terms of useful progress, the price of these limits has been enormous: near total enfeeblement. For many years the Nobel Prize in Physiology or Medicine has documented the continuing failure of medical researchers all over the world to make significant progress on all major health problems, including depression, heart disease, obesity, cancer, diabetes, stroke, and so on. It is consistent with their level of understanding that some people associated with medicine would write a book about how to do something (good science) the whole field manifestly can’t do. Testing Treatments isn’t just a fat person writing a book about how to lose weight; it’s the author failing to notice he’s fat.

Monocultures of Evidence

After referring to Jane Jacobs (“successful city neighborhoods need a mixture of old and new buildings”), which I liked, Tim Harford wrote this, which I didn’t like:

Many medical treatments (and a few social policies) have been tested by randomized trials. It is hard to imagine a more clear-cut practice of denying treatment to some and giving it to others. Yet such lotteries — proper lotteries, too — are the foundation of much medical progress.

The notion of evidence-based medicine was a step forward in that it recognized that evidence mattered. It was only a small step forward, however, because its valuation of evidence — on a single dimension, with double-blind randomized trials at the top — was naive. Different sorts of decisions need different sorts of evidence, just as Jacobs said different sorts of businesses need different sorts of buildings. In particular, new ideas need cheap tests, just as new businesses need cheap rent. As an idea becomes more plausible, it makes sense to test it in more expensive ways. That is one reason a monoculture of evidence is a poor idea.

Another is that you should learn from the past. Sometimes a placebo effect is plausible; sometimes it isn’t. To ignore this and insist everything should be placebo-controlled is to fail to learn a lot you could have learned.

A third reason a monoculture of evidence is a poor idea is that it ignores mechanistic understandings — understanding of what causes this or that problem. In some cases, you may think that the disorder you are studying has a single cause (e.g., scurvy). In other cases, you may think the problem probably has several causes (e.g., depression, often divided into endogenous and exogenous). In the latter case, it is plausible that a treatment will help only some of those with the problem. So you should design your study and analyze your data taking into account that possibility. You may want to decide for each subject whether or not the treatment helped rather than lump all subjects together. And the “best” designs will be those that best allow you to do this.
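
Here is a minimal sketch of that per-subject analysis, under assumptions of my own (repeated measurements for each subject, and an arbitrary cutoff of 2 baseline standard deviations): decide for each subject whether the treatment helped, then report the fraction helped rather than a single group mean.

```python
import statistics

def helped(baseline, treatment, threshold=2.0):
    """Did the treatment help this subject? Compare their mean change
    to the day-to-day variability of their own baseline."""
    change = statistics.mean(treatment) - statistics.mean(baseline)
    return change > threshold * statistics.stdev(baseline)

subjects = {  # hypothetical daily scores; higher = better
    "A": ([4, 5, 4, 5, 4], [9, 10, 9, 10, 9]),  # large, consistent gain
    "B": ([5, 4, 6, 5, 5], [5, 6, 5, 4, 6]),    # no real change
}
responders = [name for name, (base, treat) in subjects.items()
              if helped(base, treat)]
print(f"helped: {responders} ({len(responders)}/{len(subjects)})")
```

A design with many measurements per subject supports this analysis; a design that collects one number per subject does not.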

The Problem with Evidence-Based Medicine

In a recent post I said that med school professors cared about process (doing things a “correct” way) rather than result (doing things in a way that produces the best possible outcomes). Feynman called this sort of thing “cargo-cult science”. The problem is that there is little reason to think the med-school profs’ “correct” way (evidence-based medicine) works better than the “wrong” way it replaced (reliance on clinical experience) and considerable reason to think it isn’t obvious which way is better.

After I wrote the previous post, I came across an example of the thinking I criticized. On bloggingheads.tv, during a conversation between Peter Lipson (a practicing doctor) and Isis The Scientist (a “physiologist at a major research university” who blogs at ScienceBlogs), Isis said this:

I had an experience a couple days ago with a clinician that was very valuable. He said to me, “In my experience this is the phenomenon that we see after this happens.” And I said, “Really? I never thought of that as a possibility but that totally fits in the scheme of my model.” On the one hand I’ve accepted his experience as evidence. On the other hand I’ve totally written it off as bullshit because there isn’t a p value attached to it.

Isis doesn’t understand that this “p value” she wants so much comes with a sensitivity filter attached. It is not neutral. To get it you do extensive calculations. The end result (the p value) is more sensitive to some treatment effects than others in the sense that some treatment effects will generate smaller (better) p values than other treatment effects of the same strength, just as our ears are more sensitive to some frequencies than others.

Our ears are most sensitive around the frequency of voices. They do a good job of detecting what we want to detect. What neither Isis nor any other evidence-based-medicine proponent knows is whether the particular filter they endorse is sensitive to the treatment effects that actually exist. It’s entirely possible and even plausible that the filter that they believe in is insensitive to actual treatment effects. They may be listening at the wrong frequency, in other words. The useful information may be at a different frequency.

The usual statistics (mean, etc.) are most sensitive to treatment effects that change each person in the population by the same amount. They are much less sensitive to treatment effects that change only a small fraction of the population. In contrast, the “clinical judgment” that Isis and other evidence-based-medicine advocates deride is highly sensitive to treatments that change only a small fraction of the population — what some call anecdotal evidence. Evidence-based medicine is presented as science replacing nonsense but in fact it is one filter replacing another.
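
A small simulation makes the filter point concrete (all parameters are made up): two treatments raise the group mean by the same 0.5, but one shifts everyone a little while the other shifts a few people a lot. The t test, a mean-based filter, detects the first far more often than the second.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n, reps, alpha = 30, 2000, 0.05

def detection_rate(effect):
    """Fraction of simulated trials in which the t test rejects."""
    hits = 0
    for _ in range(reps):
        control = rng.normal(0, 1, n)
        treated = rng.normal(0, 1, n) + effect(n)
        if ttest_ind(treated, control).pvalue < alpha:
            hits += 1
    return hits / reps

# The same average shift (+0.5), delivered two ways:
uniform      = lambda n: 0.5                                    # everyone +0.5
concentrated = lambda n: np.where(rng.random(n) < 0.05, 10, 0)  # ~5% get +10

print("uniform effect detected:     ", detection_rate(uniform))
print("concentrated effect detected:", detection_rate(concentrated))
```

With the same mean shift, the concentrated effect inflates the variance, so the t statistic shrinks. A filter that attends to individual cases, like clinical judgment, would notice the few dramatic responders that the mean-based filter smooths away.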

I suspect that actual treatment effects have a power-law distribution (a few helped a lot, a large fraction helped little or not at all) and that a filter resembling “clinical judgment” does a better job with such distributions. But that remains to be seen. My point here is just that it is an empirical question which filter works best. An empirical question that hasn’t been answered.

How to Base Medicine on Evidence

The thing to notice about what the New York Times calls “the evidence-based medicine practiced at Intermountain hospital” is how different it is from the movement called evidence-based medicine. The Intermountain approach, above all, is not black-and-white thinking. It is a good example of what the opposite looks like. The rules aren’t simple; they are complex, and not fixed. It is what engineers in other areas have been doing since Deming.

So many scientists — not to mention everyone else — are completely paralyzed, rendered completely useless, by their black-and-white thinking. It feels good to them — they love the certainty of it, and the power it gives them to look down on others — and they never quite realize what it has done to them. The notion of using evidence to improve health care made perfect sense — until black-and-white thinkers got a hold of it.

Any class on scientific method should be at least half about avoiding black-and-white thinking. None are.