Posit Science is a San Francisco company, started by Michael Merzenich (UCSF) and others, that sells access to brain-training exercises aimed at older adults. Their training program, they say, will make you “remember more”, “focus better”, and “think faster”. A friend recently sent me a 2011 paper (“Improvement in memory with plasticity-based adaptive cognitive training: results of the 3-month follow-up” by Elizabeth Zelinski and others, published in the Journal of the American Geriatrics Society) that describes a study of Posit Science training. The study asked whether the improvements due to training are still detectable three months after training stops. That question matters because the training takes long enough (1 hour/day in the study) that you wouldn’t want to do it forever. The study appears to have been entirely funded by Posit Science.
I found the paper puzzling in several ways. I sent the corresponding author and the head of Posit Science a list of questions:
1. Isn’t it correct that after three months there was no longer reliable improvement due to training according to the main measure that was chosen by you (the investigators) in advance? If so, shouldn’t that have been the main conclusion (e.g., in the abstract and final paragraph)?
2. The training is barely described. The entire description is this: “a brain plasticity-based computer program designed to improve the speed and accuracy of auditory information processing and to engage neuromodulatory systems.” To learn more, readers are referred to a paper that is not easily available — in particular, I could not find it on the Posit Science website. Because the training is so briefly described, I was unable to judge how much the outcome tests differ from the training tasks. This made it impossible for me to judge how much the training generalizes to other tasks — which is the whole point. Why wasn’t the training better described?
3. What was the “ET [experimental treatment] processing speed exercise”? It sounds like a reaction-time task. People will get faster at any reaction-time task if given extensive practice on that task. How is such improvement relevant to daily life? If it is irrelevant, why is it given considerable attention (one of the paper’s four graphs)?
4. According to Table 2, the CSRQ (Cognitive Self-Report Questionnaire) questions showed no significant improvement in trainees’ perceptions of their own daily cognitive functioning, although the p value was close to 0.05. Given the large sample size (~500), this failure to find significant improvement suggests the self-report improvements were small or zero. Why wasn’t this discussed? Is the amount of improvement suggested by Posit Science’s marketing consistent with these results?
5. Is it possible that the improvement subjects experienced was due to the acquisition of strategies for dealing with rapidly presented auditory material, and especially for focusing on the literal words (rather than on their meaning, as may be the usual approach taken in daily life)? If so, is it possible that the skills being improved have little value in daily life, explaining the lack of effect on the CSRQ?
6. In the Methods section, you write “In the a priori data analysis plan for the IMPACT Study, it was hypothesized that the tests constituting the secondary outcome measure would be more sensitive than the RBANS given their larger raw score ranges and sensitivity to cognitive aging effects.” Do the initial post-training tests (measurements of the training effect soon after training ended) support this hypothesis? Why aren’t the initial post-training results described so that readers can see for themselves if this hypothesis is plausible? If you thought the “secondary outcome measure would be more sensitive than the RBANS” why wasn’t the secondary outcome measure the primary measure?
7. The primary outcome measure was based on part of the RBANS (Repeatable Battery for the Assessment of Neuropsychological Status). Did subjects take the whole RBANS or only part of it? If they took the whole RBANS, what were the results with the rest of the RBANS (the subtests not included in the primary outcome measure)?
8. The data analysis refers to a “secondary composite measure”. Why that particular composite and not any of the many other possible composite measures? Were other secondary composite measures considered? If so, were p values corrected for this?
9. If Test A resembles training more closely than Test B, Test A should show more effect of training (at any retention interval) than Test B. In this case Test A = the RBANS auditory subtests and Test B = the secondary composite measure. In contrast to this prediction, you found that Test B showed a clearer training effect (in terms of p value) than Test A. Why wasn’t this anomaly discussed (beyond what was said in the Methods section)?
10. Were any tests given to the subjects that are not described in this report? If there were other tests, why were their results not described?
11. The secondary composite measure is composed of several memory tests and called “Overall Memory”. The Posit Science website says their training will not only help you “remember more” but also “think faster” and “focus better”. Why weren’t tests of thinking speed (different from the training tasks) and focus included in the assessment?
12. Do the results support the idea that the training causes trainees to “focus better”?
13. The Posit Science homepage suggests that their training increases “intelligence”. Was intelligence measured in this study? If not, why not?
14. Do the results support the idea that the training causes trainees to become more intelligent?
15. The only test of thinking speed included in the assessment appears to be a reaction-time task that was part of the training. Are you saying that getting faster on one reaction-time task after lots of practice with that task shows that your training causes trainees to “think faster”?
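A rough sketch of the arithmetic behind question 4, for readers who want to check it. It estimates the smallest standardized difference between groups (Cohen's d) that would just reach two-sided significance at p = 0.05. The even split into two groups of 250 is my assumption (the paper's sample is about 500 in total), the function name is made up for illustration, and the calculation uses a normal approximation rather than the paper's actual analysis.

```python
from math import sqrt
from statistics import NormalDist


def smallest_significant_d(n_per_group: int, alpha: float = 0.05) -> float:
    """Smallest standardized mean difference (Cohen's d) that would reach
    two-sided significance at `alpha` with two equal-sized groups.

    Uses a normal approximation to the t distribution, which is
    reasonable for groups of a few hundred subjects.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    standard_error = sqrt(2 / n_per_group)        # approximate SE of d for small d
    return z_crit * standard_error


# Assumption: roughly 500 subjects, split evenly into two groups of 250.
d = smallest_significant_d(250)
print(f"Smallest significant effect size: d = {d:.3f}")
```

Under these assumptions the threshold comes out a little under d = 0.2, the value conventionally labeled a "small" effect. A self-report difference that fails to reach significance with this many subjects is therefore unlikely to be large.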
Update: Henry Mahncke, the head of Posit Science, said that he would be happy to answer these questions by phone. I replied that I was sure many people were curious about the answers and written answers would be much easier to share.
Further update: Mahncke replied that he would prefer a phone call and that some of the questions seemed to him hard to answer in writing. He said nothing about the sharing problem. I repeated my belief that many people are interested in the answers and that a phone call would be hard to share. I offered to rewrite any questions that seemed hard to answer in writing.