Suppose you have a car that can only turn right. Someone says, “Your car turns right too much.” You might wonder why they don’t see the bigger problem (it can’t turn left).
This happens in science today. People complain about how well the car turns right, failing to notice (or at least say) that it can’t turn left. Just as a car should turn both right and left, scientists should be able to (a) test ideas and (b) generate ideas worth testing. Tests are expensive. To be worth the cost of testing, an idea needs a certain plausibility. In my experience, few scientists have clear ideas about how to generate ideas plausible enough to test. The topic is not covered in any statistics text I have seen, even though the same books spend many pages on how to test ideas.
Apparently not noticing the bigger problem, scientists sometimes complain that this or that finding “fails to replicate”. My former colleague Danny Kahneman is an example. He complained that priming effects were not replicating. Implicit in a complaint that Finding X fails to replicate is a complaint about testing. If you complain that X fails to replicate, you are saying that something was wrong with the tests that established X.
There is a connection between replication failure and failure to generate ideas worth testing. If you cannot generate new ideas, you are forced to test old ideas. You cannot test an old idea exactly; that would be boring and repetitive. So you give an old idea a slight tweak and test the variation. For example, someone has shown that X is true in North America. You ask if X is true in South America. You hope you haven’t tweaked X too much. No idea is true everywhere, except maybe in physics, so as this process continues (it goes on for decades), the tested ideas gradually become less true and the experimental effects get weaker.
This is what happened in the priming experiments that Kahneman complained about. At the core of priming (the priming effects studied 30 years ago) is a true phenomenon. After reading “doctor” it becomes easier to decide that “nurse” is a word, for example. This was followed by 30 years of drift away from word recognition. Not knowing how to generate new ideas worth testing, social psychologists have ended up studying weak effects (recent priming effects) that are random walks away from strong effects (old priming effects). The weak effects cannot bear the professional weight they are asked to carry (people’s careers rest on them) and sometimes collapse (“failure to replicate”). Sheena Iyengar, a Columbia Business School professor and social psychologist, won a major award (best dissertation) for a new effect and wrote a book about it. That effect has turned out to be very close to non-existent.
Inability to generate ideas — to understand how to do so — means that what appear to be new ideas (not just variations of old ideas) are more likely to be mistakes. I have no idea whether Iyengar’s original effect was true or not. I am sure, however, that it was weak and made little sense.
Statistics textbooks ignore the problem. They say nothing about how to generate ideas worth testing. I haven’t asked statisticians about this, but they might respond in one of two ways:
1. That’s someone else’s problem. Statistics is about what to do with data after you gather it. (That makes as much sense as teaching someone how to land a plane but not how to take off.)
2. That’s what exploratory data analysis is for. If I said, “Exploratory data analysis can only identify effects of factors that the researcher decided to vary or track, and varying or tracking a factor is expensive. What about other factors?” they’d be baffled, I believe. In my experience, exploratory data analysis = full analysis of your data. (Many people do only a small fraction, such as 10%, of all reasonable analyses of their data.) Full analysis is better than partial analysis, but calling it a way to find new ideas fails to understand that professional scientists study the same factors over and over.
I suppose many scientists feel the gap acutely. I did. I became interested in self-experimentation most of all because it generated new ideas at a much higher rate (per year) than my professional experiments with rats. I had no idea why at first, but as it kept happening (my self-experimentation generated one new idea after another), I came to believe that by accident I was doing something “right”. I was doing something that fit a general rule of how to generate ideas, even though I didn’t know what the general rule was.
The sciences I know about (psychology and nutrition) have great trouble coming up with new ideas. The paleo movement is a response to stagnation in the field of nutrition. The Shangri-La Diet shows what a new idea looks like in the area of weight control. The failure of nutritionists to study fermented foods is ongoing. Stagnation in psychology can be seen in the fact that antidepressants remain heavily prescribed many years after the introduction of Prozac (my work on morning faces and mood suggests a much different approach), in the lack of change in treatments for bipolar disorder over the last 50 years (again, my morning-faces work suggests another approach), and in the failure of social psychologists to discover any big new effects in the last ten years.
Here is the secret to idea generation: Cheaper tests. To find ideas plausible enough to be worth testing with Test X, you need a way of testing ideas that is cheaper than Test X. The cheaper your test, the larger the region of cause-effect space you can explore. Let’s say Test Y is cheaper than Test X. With Test Y, you can explore more of cause-effect space than you can explore with Test X. In the region unexplored by Test X, you can find points (cause-effect relationships) that pass Test Y. They are worth testing with Test X. My self-experimentation generated new ideas worth testing with more expensive tests because it was much cheaper than existing tests. Via self-experimentation, I could test many ideas too implausible or too expensive to be tested conventionally. Even cheaper than a self-experiment was simply monitoring myself — tracking my sleep, for example. Again and again, this generated ideas worth testing via self-experimentation. I did what all scientists should do: use cheaper tests to generate ideas worth testing with more expensive tests.
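The argument above can be put in rough numbers. What follows is only an illustrative sketch in Python, not a model of any real experiment: the prior plausibility, the costs of Test X and Test Y, the payoff of a true idea, and the accuracy of the cheap test are all made-up assumptions, and the function names are hypothetical. It uses Bayes' rule to show how passing a cheap test can lift an idea from "too implausible to test" to "worth the expensive test".

```python
def ideas_affordable(budget: float, cost_per_test: float) -> int:
    """How many ideas a fixed budget can test at a given cost per test."""
    return int(budget // cost_per_test)


def plausibility_after_pass(prior: float, sensitivity: float,
                            false_positive_rate: float) -> float:
    """Bayes' rule: probability an idea is true, given that it passed a test."""
    true_pass = prior * sensitivity
    false_pass = (1 - prior) * false_positive_rate
    return true_pass / (true_pass + false_pass)


def worth_testing(plausibility: float, payoff_if_true: float,
                  test_cost: float) -> bool:
    """An idea is worth a test when its expected payoff exceeds the test's cost."""
    return plausibility * payoff_if_true > test_cost


# All values below are hypothetical, chosen only for illustration.
BUDGET = 1000.0
COST_X = 100.0            # expensive test (e.g. a conventional experiment)
COST_Y = 1.0              # cheap test (e.g. a self-experiment)
PAYOFF_IF_TRUE = 1000.0   # value of confirming a true idea
PRIOR = 0.01              # plausibility of an untested, unconventional idea
Y_SENSITIVITY = 0.8       # chance a true idea passes Test Y
Y_FALSE_POSITIVE = 0.05   # chance a false idea passes Test Y

if __name__ == "__main__":
    print("Ideas Test X can cover:", ideas_affordable(BUDGET, COST_X))
    print("Ideas Test Y can cover:", ideas_affordable(BUDGET, COST_Y))

    print("Worth Test X before Test Y?",
          worth_testing(PRIOR, PAYOFF_IF_TRUE, COST_X))

    screened = plausibility_after_pass(PRIOR, Y_SENSITIVITY, Y_FALSE_POSITIVE)
    print("Plausibility after passing Test Y:", round(screened, 3))
    print("Worth Test X after passing Test Y?",
          worth_testing(screened, PAYOFF_IF_TRUE, COST_X))
```

Under these assumed numbers, the same budget covers a hundred times as many ideas with Test Y as with Test X, and an idea that starts out far too implausible to justify Test X becomes worth testing once it has passed Test Y. Different assumptions would give different thresholds; the point is only the direction of the effect.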