Systematic review shows no improvement in quality of mindfulness research in 16 years

Should we still take claims about mental health benefits of mindfulness with a grain of salt? A systematic review by one of mindfulness training’s key promoters suggests maybe so.

Critics have been identifying the same weaknesses in mindfulness research for almost two decades. This review suggests little improvement over 16 years in the quality of randomized trials for mental health problems.

This study examined 171 articles reporting RCTs, coding six methodological features:

(a) active control conditions, (b) larger sample sizes, (c) longer follow-up assessment, (d) treatment fidelity assessment, (e) reporting of instructor training, (f) reporting of ITT samples.

What was missed

Whether articles reporting RCTs had appropriate disclosure of financial or other conflicts of interest. Conflicts of interest (COI) pose a significant risk of bias, especially when they are not reported.

This article discloses the authors’ interests. One of the authors, Richard Davidson, is a prominent promoter of mindfulness training. A Web of Science search of Davidson RJ and mindfulness yielded 26 articles from 2002 to 2016. It would be interesting to check to see if these consistent weaknesses in mindfulness research are mentioned in those articles. To what extent did RCTs with Davidson as an author have these weaknesses, such as being underpowered?

Critic: You say financial interests or other investments in a treatment are a risk of bias. Yet, this article is critical of mindfulness research. Wouldn’t you expect a more positive appraisal of the literature because of the authors having a confirmation bias?

Not necessarily. Conflicts of interest are a risk of bias, but they don’t discredit an author; they only alert readers to be skeptical. Furthermore, the weaknesses in this literature are so pervasive that it would be difficult to put a positive spin on them. Besides, calling attention to specific weaknesses that need to be addressed in future research can become part of a pitch for more research.

The article

Goldberg SB, Tucker RP, Greene PA, Simpson TL, Kearney DJ, Davidson RJ. Is mindfulness research methodology improving over time? A systematic review. PLOS One. 2017 Oct 31;12(10):e0187298.

End of paper conclusion:

In conclusion, the 16 years of mindfulness research reviewed here provided modest evidence that the quality of research is improving over time. There may be various explanations for this (e.g., an increasing number of novel mindfulness-based interventions being first tested in less rigorous designs; the undue influence of early, high-quality studies). However, it is our hope that demonstrating this fact empirically will encourage future researchers to work towards the recommendations here and ultimately towards a clearer and scientifically-informed understanding of the potential and limitations of these treatments.

From the abstract


The current systematic review examined the extent to which mindfulness research demonstrated increased rigor over the past 16 years regarding six methodological features that have been highlighted as areas for improvement. These features included using active control conditions, larger sample sizes, longer follow-up assessment, treatment fidelity assessment, and reporting of instructor training and intent-to-treat (ITT) analyses.

Data sources

We searched PubMed, PsychInfo, Scopus, and Web of Science in addition to a publically available repository of mindfulness studies.

Study eligibility criteria

Randomized clinical trials of mindfulness-based interventions for samples with a clinical disorder or elevated symptoms of a clinical disorder listed on the American Psychological Association’s list of disorders with recognized evidence-based treatment.

Study appraisal and synthesis methods

Independent raters screened 9,067 titles and abstracts, with 303 full text reviews. Of these, 171 were included, representing 142 non-overlapping samples.


Across the 142 studies published between 2000 and 2016, there was no evidence for increases in any study quality indicator, although changes were generally in the direction of improved quality. When restricting the sample to those conducted in Europe and North America (continents with the longest history of scientific research in this area), an increase in reporting of ITT analyses was found. When excluding an early, high-quality study, improvements were seen in sample size, treatment fidelity assessment, and reporting of ITT analyses.

Conclusions and implications of key findings

Taken together, the findings suggest modest adoption of the recommendations for methodological improvement voiced repeatedly in the literature. Possible explanations for this and implications for interpreting this body of research and conducting future studies are discussed.

Competing interests

RD is the founder, president, and serves on the board of directors for the non-profit organization, Healthy Minds Innovations, Inc. In addition, RD serves on the board of directors for the Mind and Life Institute. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

The variables examined in the systematic review

Six methodological features that have been recommended in criticisms of mindfulness research [10–12, 14]. These include: (a) active control conditions, (b) larger sample sizes, (c) longer follow-up assessment, (d) treatment fidelity assessment, (e) reporting of instructor training, (f) reporting of ITT samples.

…We graded the strength of the control condition on a five-tier system. We defined specific active control conditions as comparison groups that were intended to be therapeutic [17]. More rigorous control groups are important as they can provide a test of the unique or added benefit a mindfulness intervention may offer, beyond non-specific benefits associated with the placebo effect, researcher attention, or demand characteristics [11,14]. Larger sample sizes are important as they increase the reliability of reported effects and increase statistical power [11]. Longer follow-up is important for assessing the degree to which treatment effects are maintained beyond the completion of the intervention [10]. Treatment fidelity assessment allows an examination of the degree to which the given treatment was delivered as intended [12]. Treatment fidelity is commonly assessed through video or audio recordings of sessions that are coded and/or reviewed by treatment experts [18]. We coded all references to treatment fidelity assessment (e.g., sessions were recorded and reviewed, a checklist measuring adherence to specific treatment elements was completed). Relatedly, reporting of instructor training increases the likelihood that the treatment was delivered by qualified individuals [12], which should, in theory, influence the quality of the treatment provided. Lastly, the reporting of ITT analyses involves including individuals who may have dropped out of the study and/or did not complete their assigned intervention [12]. Generally speaking, ITT analyses are viewed to be more conservative estimates of treatment effects [19,20], and are preferred for this reason.


Unfunny 2017 BMJ Christmas articles not in the same league as my all-time favorite

Which is: A systematic review of parachute use to prevent death and major trauma

I agree with Sharon Begley’s assessment in Stat that this year’s BMJ Christmas issue is a loser.

A BMJ Christmas issue filled with wine glasses, sex, and back pain brings out the Grinch in us

Bah! … humbug. Is it just us, or is the highly anticipated Christmas issue of the BMJ (formerly the British Medical Journal) delivering more lumps of coal and fewer tinselly baubles lately?

Maybe it’s Noel nostalgia, but we find ourselves reminiscing about BMJ offerings from Yuletides past, which brought us studies reporting that 0.5 percent of U.S. births are to (self-reported) virgins, determining how long a box of chocolates lasts on a hospital ward, or investigating Nintendo injuries.

I agree with her stern note:

Note to BMJ editors: Fatal motorcycle crashes, old people falling, and joint pain — three of this year’s Christmas issue studies — do not qualify as “lighthearted.”

My all-time favorite BMJ Christmas article

Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials


With its conclusions:

As with many interventions intended to prevent ill health, the effectiveness of parachutes has not been subjected to rigorous evaluation by using randomised controlled trials. Advocates of evidence based medicine have criticised the adoption of interventions evaluated by using only observational data. We think that everyone might benefit if the most radical protagonists of evidence based medicine organised and participated in a double blind, randomised, placebo controlled, crossover trial of the parachute.

The brief article is actually a great way to start a serious discussion of randomized trials with a laugh.

We can all agree that we wouldn’t participate in a randomized trial of parachutes. Any effort to conduct a systematic review and meta-analysis of such studies would end up, formally speaking, as a failed meta-analysis. We could start with a rigorous systematic search, but still end up with no studies to provide effect sizes. That’s not a bad thing, especially as an alternative to making recommendations on weak or nonexistent data.

I would like to see a lot more formally declared failed meta-analyses from the Cochrane Collaboration. Clearly labeled failed meta-analyses are much preferable to recommendations to consumers and policymakers for treatments based on a small collection of methodologically weak and underpowered trials. That just happens too much.

If the discussion group were ripe for it, we could delve into when randomized trials are not needed, or what to do when there aren’t randomized trials. I think one or more N = 1 trials not using a parachute or similar device would be compelling, even without a nonspecific control group. On the other hand, many interventions that have been justified only by observational studies turn out not to be effective when an RCT is finally done.

Keep a discussion going long enough about when RCTs can’t provide suitable evidence, and you end up in a predictable place. Someone will offer a critique of RCTs as the gold standard for evaluating interventions, or maybe of systematic reviews and meta-analyses of RCTs as a platinum standard. That can be very fruitful too, but sooner or later someone proposes alternatives to the RCT because their pet interventions don’t measure up in RCTs. Ah, yes, RCTs just can’t capture the magic going on in long-term psychodynamic psychotherapy.

[For a review of alternatives to RCTs that I co-authored, see Research to improve the quality of care for depression: alternatives to the simple randomized clinical trial]

Ghosts of Christmases Past

Searching for past BMJ Christmas articles can be tedious. If someone can suggest an efficient search term, let me know. Fortunately, the BMJ last year offered a review of all-time highlights.

Christmas crackers: highlights from past years of The BMJ’s seasonal issue

BMJ 2016; 355 doi: (Published 15 December 2016)

Cite this as: BMJ 2016;355:i6679

For more than 30 years the festive issue of the journal has answered quirky research questions, waxed philosophical, and given us a good dose of humour and entertainment along the way.

A recent count found more than 1000 articles in The BMJ’s Christmas back catalogue. A look through these shows some common themes returning year after year. Professional concerns crop up often, and we seem to be endlessly fascinated by the differences between medical specialties. Past studies have looked at how specialties vary by the cars they drive,7 their ability to predict the future,8 and their coffee buying habits.9 Sometimes the research findings can challenge popular stereotypes. How many people, “orthopods” included, could have predicted that anaesthetists, with their regular diet of Sudoku and crosswords, would fare worse than orthopaedic surgeons in an intelligence test?1

It notes some of the recurring themes.

Beyond medical and academic matters, enduring Christmas themes also reflect the universal big issues that preoccupy us all: food, drink, religion, death, love, and sex.

This broad theme encompasses one of the most widely accessed BMJ Christmas articles of all time.

In 2014 Ben Lendrem and colleagues explored differences between the sexes in idiotic risk taking behaviour, by studying past winners of the Darwin Awards.4 As the paper describes: winners of these awards must die in such an idiotic manner that “their action ensures the long-term survival of the species, by selectively allowing one less idiot to survive.”

There is also an interesting table of the four BMJ Christmas papers that won Ig Nobel prizes:

Christmas BMJ papers awarded the Ig Nobel prize

  • Effect of ale, garlic, and soured cream on the appetite of leeches (winner 1994)15

  • Magnetic resonance imaging of male and female genitals during coitus and female sexual arousal (1999)13

  • Sword swallowing and its side effects (2006)16

  • Pain over speed bumps in diagnosis of acute appendicitis (2012)17




Psychological interventions do not reduce pain, despite claims of proponents

A provocative review finds a “lack of strong supporting empirical evidence for the effectiveness of psychological treatments for pain management.”

The open access paper

Georgios Markozannes, Eleni Aretouli, Evangelia Rintou, Elena Dragioti, Dimitrios Damigos, Evangelia Ntzani, Evangelos Evangelou and Konstantinos K. Tsilidis. An umbrella review of the literature on the effectiveness of psychological interventions for pain reduction, BMC Psychology

This article received open or public peer review. The three versions, reviewers’ comments, and author responses are available here.

Why this review was needed

According to the review:

Psychological interventions were introduced over 40 years ago and are now well established in clinical practice [5]

…the effect sizes across all meta-analyses are modest, only rising above a medium-size effect (i.e., standardised mean difference larger than 0.5) in lower quality studies [4].

… Because of the wide implementation of psychological interventions in pain management and the elevated likelihood for biases in this field as shown in prior relevant empirical research [19, 20], we used an umbrella review approach [21, 22] that systematically appraises the evidence on an entire field across many meta-analyses. In the present study we aimed to broaden the scope of a typical umbrella review by further evaluating the strength of the evidence and the extent of potential biases [23, 24, 25, 26, 27] on this body of literature.

What is an umbrella review?

 A key source defines an umbrella review:

Ioannidis JP. Integration of evidence from multiple meta-analyses: a primer on umbrella reviews, treatment networks and multiple treatments meta-analyses. CMAJ. 2009;181(8):488–93.

Umbrella reviews (Figure 1) are systematic reviews that consider many treatment comparisons for the management of the same disease or condition. Each comparison is considered separately, and meta-analyses are performed as deemed appropriate. Umbrella reviews are clusters that encompass many reviews. For example, an umbrella review presented data from 6 reviews that were considered to be of sufficiently high quality about nonpharmacological and nonsurgical interventions for hip osteoarthritis. 9 Ideally, both benefits and harms should be juxtaposed to determine trade-offs between the risks and benefits. 10

ioannidis umbrella review

Ioannidis provides the following caveat about umbrella reviews and data syntheses more generally:

Integrating data from multiple meta-analyses may provide a wide view of the evidence landscape. Transition from a single patient to a study of many patients is a leap of faith in generalizability. A further leap is needed for the transition from a single study to meta-analysis and from a traditional meta-analysis to a treatment network and multiple treatments meta-analysis, let alone wider domains. With this caveat, zooming out toward larger scales of evidence may help us to understand the strengths and limitations of the data guiding the medical care of individual patients.

Discrepancy of this review with past evaluations

Our results come in discordance with the generally strong belief in the literature that psychological therapies are universally effective on a variety of pain conditions [76, 77, 78]. However, this belief is mainly established based on a limited number of small primary studies, and future larger studies are warranted. Notably, the median number of individuals in the intervention and control groups in each individual study included in our systematic evaluation was only 33 and 28 respectively, whereas the median number of studies included in each meta-analysis was only three. Our evaluation revealed that the reported effectiveness is usually overstated in the existing studies. The nominally statistically significant associations between psychological interventions and pain were confirmed in less than half of the examined meta-analyses. In addition, the random effects estimates were statistically significant in only 20% of the meta-analyses, when a P-value threshold of 0.001 was applied. Furthermore, in only nine meta-analyses the prediction interval excluded the null value, thus suggesting that only 6% of future studies are expected to demonstrate substantial “positive” (i.e. not null) associations between psychological interventions and pain treatment.
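The prediction-interval reasoning in that passage can be made concrete with a toy calculation. The numbers below are hypothetical, not taken from the review, and the standard formula uses a t quantile with k − 2 degrees of freedom, which is approximated here by a normal quantile for brevity:

```python
from statistics import NormalDist

def prediction_interval(mu, se_mu, tau2):
    """95% prediction interval for the effect expected in a new study,
    given a random-effects summary estimate mu, its standard error,
    and the between-study variance tau^2.
    (The standard formula uses a t quantile with k-2 df; a normal
    quantile is used here for simplicity.)"""
    z = NormalDist().inv_cdf(0.975)
    half = z * (tau2 + se_mu**2) ** 0.5
    return mu - half, mu + half

# A pooled effect that is nominally "significant"...
lo, hi = prediction_interval(mu=0.40, se_mu=0.10, tau2=0.09)
# ...can still have a prediction interval crossing zero (lo < 0 < hi)
# once between-study heterogeneity is accounted for.
```

This is exactly the pattern the umbrella review reports: nominally significant pooled estimates whose prediction intervals nonetheless include the null, so that many future trials would be expected to find no benefit.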

The punchline and the remedy

In conclusion, the present findings support that the effectiveness of psychological treatments for pain management is overstated and the supporting empirical evidence is weak. The present findings combined with the fact that psychological intervention trials are still at an early research stage and fall short compared to drug trials [87] underline the necessity for larger and better-conducted RCTs [85]. Future research should further focus on building networks involving all stakeholder groups to achieve consensus and develop guidance on best practices for assessing and reporting pain outcomes [88, 89]. The use of standardized definitions and protocols for exposures, outcomes, and statistical analyses may diminish the threat of biases and improve the reliability of this important literature.

I will soon be offering e-books providing skeptical looks at mindfulness and positive psychology, as well as scientific writing courses on the web as I have been doing face-to-face for almost a decade.

Sign up at my new website to get advance notice of the forthcoming e-books and web courses, as well as upcoming blog posts at this and other blog sites. Lots to see at

Interventions to reduce stress in university students: Anatomy of a bad meta-analysis

Regehr C, Glancy D, Pitts A. Interventions to reduce stress in university students: A review and meta-analysis. Journal of Affective Disorders. 2013 May 15;148(1):1-11.

I saw a link to this meta-analysis on Twitter. I decided to take a look.

The experience added to my sense that many people who tweet, retweet, or “like” tweets about studies have either not read the studies, or lack a basic understanding of research, or both.

What I can redeem from the experience is a commentary that is another contribution to screening and quickly dismissing bad meta-analyses.

I have written at length about screening out bad meta-analyses in Hazards of pointing out bad meta-analyses of psychological interventions.

In that blog post, I provide an excellent rationale for complaining about bad meta-analyses from Hilda Bastian. If you click on her name below, you can access an excellent blog post about bad meta-analyses from Hilda as well.

Psychology has a meta-analysis problem. And that’s contributing to its reproducibility problem. Meta-analyses are wallpapering over many research weaknesses, instead of being used to systematically pinpoint them. – Hilda Bastian

Unfortunately, this meta-analysis is behind a paywall. If you have access through a university library, that does not pose a problem, only the inconvenience of having to log into your university library website. If you are motivated to do so, you could request a PDF with an email to one of the authors at

I don’t think you need to go to the trouble of writing the authors to benefit from my brief analysis. Particularly because you can see the start of the problems in accessing the abstract here.

And here’s the abstract:



Recent research has revealed concerning rates of anxiety and depression among university students. Nevertheless, only a small percentage of these students receive treatment from university health services. Universities are thus challenged with instituting preventative programs that address student stress and reduce resultant anxiety and depression.


A systematic review of the literature and meta-analysis was conducted to examine the effectiveness of interventions aimed at reducing stress in university students. Studies were eligible for inclusion if the assignment of study participants to experimental or control groups was by random allocation or parallel cohort design.


Retrieved studies represented a variety of intervention approaches with students in a broad range of programs and disciplines. Twenty-four studies, involving 1431 students were included in the meta-analysis. Cognitive, behavioral and mindfulness interventions were associated with decreased symptoms of anxiety. Secondary outcomes included lower levels of depression and cortisol.


Included studies were limited to those published in peer reviewed journals. These studies over-represent interventions with female students in Western countries. Studies on some types of interventions such as psycho-educational and arts based interventions did not have sufficient data for inclusion in the meta-analysis.


This review provides evidence that cognitive, behavioral, and mindfulness interventions are effective in reducing stress in university students. Universities are encouraged to make such programs widely available to students. In addition however, future work should focus on developing stress reduction programs that attract male students and address their needs.

I immediately saw that this was a bad abstract because it was so uninformative. There are so many abstracts of meta-analyses freely available on the web. We need to be given the information to recognize when we are confronting the abstract of a bad meta-analysis, so that we can move on. I feel strongly that authors have a responsibility to make their abstracts informative. If they don’t in their initial manuscripts, editors and reviewers should insist on improving the abstracts as a condition for publication.

This abstract is faulty because it does not give the effect sizes to back up its claims about the effectiveness of interventions to reduce stress in university students. It also does not comment in any way on the methodological quality of the 24 studies that were included. Yet, to the unwary reader, it makes the policy recommendation of making stress reduction programs available to students and maybe tailoring such programs so they will attract males.

The authors of abstracts making such recommendations have a responsibility to give some minimal details of the quality of the evidence behind the recommendation. These authors do not.

When I accessed the article through my university library, I immediately encountered this in the opening of the introduction:

 On September 5, 2012, a Canadian national news magazine ran a cover story entitled “Mental Health Crisis on Campus: Canadian students feel hopeless, depressed, even suicidal” (1). The story highlighted a 2011 survey at University of Alberta in which over 50% of 1600 students reported feeling hopeless and overwhelming anxiety over the past 12 months. The story continued by recounting incidents of suicide across Canadian campuses. The following month, the CBC reported a survey conducted at another Canadian university indicating that 88.8% of the students identified feeling generally overwhelmed, 50.2% stated that they were overwhelmed with anxiety, 66.1% indicated they were very sad, and 34.2% reported feeling depressed (2).

These are startling claims, and they require evidence. Unfortunately, the only evidence provided consists of citations to secondary news sources.

Authors making such strong claims in a peer-reviewed article have a responsibility to provide appropriate documentation. In this particular case, I don’t believe that such extreme statements even belong in a supposedly scholarly peer-reviewed article.

A section headed Data Analysis seemed to provide encouragement that the authors knew what they were doing.

 A meta-analysis was conducted to pool change in the primary outcome (self-reported anxiety) and secondary outcomes (self-reported depression and salivary cortisol level) from baseline to the post-intervention period using Comprehensive Meta-analysis software, version 2.0. All data were continuous and analyzed by measuring the standard mean difference between the treatment and comparison groups based on the reported means and standard deviations for each group. Standard mean differences (SMD) allowed for comparisons to be made across studies when scales measured the same outcomes using different standardized instruments, such as administering the STAI or the PSS to measure anxiety. Standard mean differences were determined by calculating the Hedges’ g ( ). The Hedges’ g is preferable to Cohen’s d in this instance, as it includes an adjustment for small sample bias. To pool SMDs, inverse variance methods were used to weigh each effect size by the inverse of its variance to obtain an overall estimate of effect size. Standard differences in means (SDMs) point estimates and 95% confidence intervals (CIs) were computed using a random effects model. Heterogeneity between studies was calculated using I² ( ). This statistic provides an estimate of the percentage of variability in results across studies that are likely due to treatment effect rather than chance ( ).
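For readers unfamiliar with the machinery that paragraph describes, here is a minimal sketch of Hedges’ g and inverse-variance pooling. This is not the authors’ code, and it uses a fixed-effect pool rather than the random-effects model they report, purely for brevity:

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with Hedges' small-sample correction."""
    # Pooled standard deviation across the two groups
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp                   # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)     # small-sample correction factor
    g = j * d
    var_g = j**2 * ((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return g, var_g

def pool_fixed(effects):
    """Fixed-effect inverse-variance pooled estimate plus I² heterogeneity.
    `effects` is a list of (g, variance) tuples, one per study."""
    w = [1 / v for _, v in effects]
    pooled = sum(wi * g for wi, (g, _) in zip(w, effects)) / sum(w)
    q = sum(wi * (g - pooled)**2 for wi, (g, _) in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, i2
```

The point of the sketch is how little is involved: each study contributes a single (g, variance) pair, and the weighting is mechanical. Nothing in this arithmetic protects against feeding in effect sizes from tiny, biased trials.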

Unfortunately, anyone can download a free trial of the Comprehensive Meta-Analysis software (now in version 3.0) and get the manual with it. The software is easy to use, perhaps too easy. One can use it to write a paper without really knowing much about conducting and interpreting a meta-analysis. You could put garbage into it, and the software would not register a protest.

The free manual provides text that could be paraphrased without knowing too much about meta-analysis.

When I’m evaluating a meta-analysis, I quickly go to the table of studies that were included. In the case of this meta-analysis, I immediately saw a problem in the description of the first study:


The sample size of 12 students assigned to the intervention and 7 to the control group was much too small to be taken seriously. Any effect size would be unreliable and could change drastically with the addition or subtraction of a single participant. Because the study had been published, it undoubtedly claimed a positive effect, and that would suggest something dodgy had been done. Moreover, if you have seven participants in the control group and you get significant results, the effect size will be quite large, because it takes a large effect to achieve statistical significance with only seven participants in the control group.

Reviewing the rest of the table, I can see that the bulk of the 24 included studies were similarly small, with only a few meeting my usual requirement, for a study to be taken seriously, of at least 35 participants in the smaller of the intervention or control group. Having 35 participants gives a researcher only about a 50% probability of detecting a moderate-sized effect of the intervention if it is present. If a literature is generating significant moderate-sized effects more than 50% of the time with such small studies, it is seriously flawed.
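The 50% figure is easy to verify with a back-of-the-envelope power calculation, here using a normal approximation to the two-sample t-test:

```python
from statistics import NormalDist

def power_two_group(n_per_group, d, alpha=0.05):
    """Approximate power of a two-sided two-sample test for a
    standardized mean difference d, with n per group
    (normal approximation to the t-test)."""
    z = NormalDist()
    se = (2 / n_per_group) ** 0.5        # SE of the standardized difference
    z_crit = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    ncp = d / se                         # noncentrality
    return (1 - z.cdf(z_crit - ncp)) + z.cdf(-z_crit - ncp)

# With 35 per group and a moderate effect (d = 0.5), power is only ~0.55.
# With 7 in a group, as in the first study in the table, it falls below 0.2.
```

In other words, even the better-sized studies in this meta-analysis were essentially coin flips for detecting a moderate effect, and the smallest ones had almost no chance unless the true effect was implausibly large.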

Nowhere do the authors tell us whether the numbers they give in this table represent patients who were randomized or patients for whom data were available at the completion of the study. Most of the studies do not have a 50:50 ratio of intervention to control participants. Was that deliberate, by design, or was it arrived at through loss of participants?

The gold standard for an RCT is an intention-to-treat analysis, in which all patients who were randomized have data available for follow-up, or some acceptable procedure has been used to estimate their missing data.

It is absolutely important that meta-analyses indicate whether or not the results of the RCTs that were entered into them were from intention-to-treat analysis.

It is considered a risk of bias for an RCT not to be able to provide an intention-to-treat analysis.

It is absolutely required that meta-analyses provide ratings of risk of bias by any of a standard set of procedures. I was reassured when I saw that the authors of the meta-analysis stated: “The assessment of the methodological quality of each study was based on criteria established in the Cochrane Collaboration Handbook.” Yet, search as I might, nowhere could I see further mention of these assessments or how they had been used in the meta-analysis, if at all. I was left puzzling. Did the authors not do such risk-of-bias assessments, despite having said they had conducted them? Did they for some reason leave them out of the paper? Why didn’t an editor or reviewer catch this discrepancy?

Okay, case closed. I don’t recommend giving serious consideration to a meta-analysis that depends so heavily on small studies. I don’t recommend giving serious consideration to a meta-analysis that does not take risk of bias into account, particularly when there is some concern that the studies available may not be of the best quality. Readers are welcome to complain to me that I have been too harsh in evaluating the study. However, the authors are offering policy recommendations, claiming the authority of a meta-analysis, and they have not made a convincing case that the literature was appropriate or that their analyses were appropriate.

I’m sorry, but it is my position that people publishing papers making any sort of claims have a responsibility to know what they are doing. If they don’t know, they should be doing something else.

And I don’t know why the editor or reviewers did not catch the serious problems. Journal of Affective Disorders is a peer-reviewed, Elsevier journal. Elsevier is a hugely profitable publishing company, which justifies the high cost of subscriptions because of the assurance that it gives of quality peer review. But this is not the first time that Elsevier has let us down.

A systematic review of mindfulness-based stress reduction for fibromyalgia that I really like

I am preparing a keynote address, Mindfulness Training for Physical Health Problems, to deliver at the World Congress of Behavioural and Cognitive Therapies in Melbourne on Friday, June 24, 2016. I have been despairing about the quality of both the clinical trials and systematic reviews of mindfulness treatments that I have been encountering.

Mindfulness training, mindfulness-based cognitive therapy, and mindfulness-based stress reduction (MBSR) are hot topics. That means that studies get published with obvious methodological problems ignored, and premature and exaggerated claims are rewarded. Articles are prone to spin and confirmation bias, not only in the reporting of the results of a particular study, but also in which past studies get cited or buried, depending on whether they support all the enthusiasm.

It is difficult for clinicians, patients, and policy makers to get a fair, evidence-based appraisal of MBSR. There is a bandwagon rushing far ahead of what the best evidence supports. Aside from all the other problems, the literature is being hijacked by enthusiastic promoters with undisclosed conflicts of interest who hype what they don’t tell us they are offering for sale elsewhere. Clear declarations of conflicts of interest, please.

We know that MBSR is better than no treatment. But there is only weak and inconsistent evidence telling us whether MBSR is better than other active treatments delivered with the same intensity and the same positive expectations.

Most often, MBSR is compared to a waitlist control or treatment as usual. Depending on the context, these control groups may actually be the opposite of placebos: nocebos. Patients agreed to participate in a study that gave them a chance to get MBSR. Left in the (unblinded) control condition, they got nothing except having to be assessed repeatedly. They are going to be disappointed, and this reaction is going to register in the self-report outcome data received from them.

Also, the ill-described routine care or treatment as usual being provided may be so inadequate that we are only witnessing MBSR compensating for poor-quality treatment, rather than making an active contribution of its own. This was particularly true when mindfulness training was used to taper patients from antidepressants. Patients receiving MBSR and tapering were compared to patients remaining in the routine primary care in which they had been placed on antidepressants some time ago. At the time of recruitment, many patients were simply being ignored, with minimal or no monitoring of whether they were taking their medication or whether they were even still depressed. It’s not clear whether reassessing whether the medication was still of any benefit and providing support for tapering would have accomplished as much as MBSR did, without requiring daily practice or a full-day retreat.

Data describing treatment-as-usual or routine care control conditions are readily available, but almost never reported in studies evaluating MBSR.

To take another example, when patients with chronic back pain are recruited from primary care, their long-term routine care often lacks support, positive expectations, or encouragement, and may even become iatrogenic because guidelines require escalating, futile interventions. Here too, putting back some support and realistic expectations may work as well as more complicated interventions.

Some members of the audience in Melbourne surely anticipate a relentlessly critical perspective on mindfulness from me. They will be surprised when I present not only the limitations of the current literature, but also positive recommendations for how future studies can be improved.

We need less research evaluating MBSR, but research of better quality.

There is far too much bad mindfulness research being done and uncritically cited and being put into systematic reviews. A meta-analysis cannot overcome the limitations of individual trials, if the bulk of the studies being integrated share the same problems. Garbage in, garbage out is a bit too harsh, but communicates a valid concern.

I think it is very important that meta-analyses of a hot topic like MBSR not become overly focused on summary effect sizes. Such effect sizes are inevitably inflated because of a dependence on, at best, a few small studies with a high risk of bias, which includes the allegiance of overenthusiastic investigators. These effect sizes are best ignored. It is better instead to identify the gaps and limitations in the existing literature, and how they can be corrected.

Stumbling on a quality review of MBSR for fibromyalgia.

I was quite pleased to stumble upon a review and meta-analysis of MBSR for fibromyalgia. Although it is published in a paywalled journal, a PDF is available at ResearchGate.

 Lauche R, Cramer H, Dobos G, Langhorst J, Schmidt S. A systematic review and meta-analysis of mindfulness-based stress reduction for the fibromyalgia syndrome. Journal of Psychosomatic Research. 2013 Dec 31;75(6):500-10.

Here’s the abstract:

Objectives: This paper presents a systematic review and meta-analysis of the effectiveness of mindfulness-based stress reduction (MBSR) for FMS.

Methods: The PubMed/MEDLINE, Cochrane Library, EMBASE, PsychINFO and CAMBASE databases were screened in September 2013 to identify randomized and non-randomized controlled trials comparing MBSR to control interventions. Major outcome measures were quality of life and pain; secondary outcomes included sleep quality, fatigue, depression and safety. Standardized mean differences and 95% confidence intervals were calculated.

Results: Six trials were located with a total of 674 FMS patients. Analyses revealed low quality evidence for short-term improvement of quality of life (SMD=−0.35; 95% CI −0.57 to −0.12; P=0.002) and pain (SMD=−0.23; 95% CI −0.46 to −0.01; P=0.04) after MBSR, when compared to usual care; and for short-term improvement of quality of life (SMD=−0.32; 95% CI −0.59 to −0.04; P=0.02) and pain (SMD=−0.44; 95% CI −0.73 to −0.16; P=0.002) after MBSR, when compared to active control interventions. Effects were not robust against bias. No evidence was further found for secondary outcomes or long-term effects of MBSR. Safety data were not reported in any trial.

Conclusions: This systematic review found that MBSR might be a useful approach for FMS patients. According to the quality of evidence only a weak recommendation for MBSR can be made at this point. Further high quality RCTs are required for a conclusive judgment of its effects.

I will be blogging about MBSR for fibromyalgia in the future, but now I simply want to show off the systematic review and meta-analysis and point to some of its unusual strengths.

A digression: What is fibromyalgia?

Fibromyalgia syndrome is a common and chronic disorder characterized by widespread pain, diffuse tenderness, and a number of other symptoms. The word “fibromyalgia” comes from the Latin term for fibrous tissue (fibro) and the Greek ones for muscle (myo) and pain (algia).

Although fibromyalgia is often considered an arthritis-related condition, it is not truly a form of arthritis (a disease of the joints) because it does not cause inflammation or damage to the joints, muscles, or other tissues. Like arthritis, however, fibromyalgia can cause significant pain and fatigue, and it can interfere with a person’s ability to carry on daily activities. Also like arthritis, fibromyalgia is considered a rheumatic condition, a medical condition that impairs the joints and/or soft tissues and causes chronic pain.

You can find out more about fibromyalgia from a fact sheet from the US National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMSD) that is only minimally contaminated by outdated notions of fibromyalgia being a psychosomatic condition, i.e., all in the head, or by recommendations for unproven complementary and alternative medicines.

 What I like about this systematic review and meta-analysis.

 The authors convey familiarity with the standards for conducting and reporting systematic reviews and meta-analyses, recommendations for the grading of evidence, and guidelines specific to the particular topic, fibromyalgia. They also admit that they had not registered their protocol. No one is perfect, and it is important for authors to indicate that they are aware of standards, even when they do not meet them. Readers can decide for themselves how to take this into account.

This review was planned and conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [15], the recommendations of the Cochrane Musculoskeletal Group [16,17] and the GRADE recommendations (Grading of Recommendations Assessment, Development and Evaluation) [18]. The protocol was not registered in any database.

The authors also laid out key features of systematic review and meta-analyses where you would expect to find them with explicit headings: eligibility criteria, search strategy, study selection and data collection including risk of bias in individual studies, etc.

Designation of primary and secondary outcomes.

Fibromyalgia causes pain and fatigue, disrupting quality of life. These are the outcomes in which patients and their healthcare providers will be most interested. Improvement in pain should be given priority. However, in clinical trials of MBSR for fibromyalgia, investigators often administer a full battery of measures and select the ones that are positive, even if they are not the outcomes that matter most to patients and providers. For instance, the first report from one trial focused on depressive symptoms. Designating depressive symptoms as the primary outcome ignored that not all patients with fibromyalgia have heightened depressive symptoms, and depression is not their primary concern. Moreover, the paper reporting this clinical trial is inconsistent with its registration, where a full range of other outcomes was designated as primary. Ugh, such papers defeat the purpose of having protocols registered.

In the review under discussion, depressive symptoms were designated as a secondary outcome, along with sleep and fatigue.

 Compared to what?

The review clearly distinguished waitlist/routine care from active comparison treatments and provided separate effect sizes.

The review also indicated whether the patients had been randomized to MBSR versus a comparison treatment, and explicitly noted that any significant effects for MBSR disappeared when only randomized trials were considered.

 Strength of recommendation.

The review took into account the small number of studies (4 randomized and 2 non-randomized trials with a total of 674 patients) and the low quality of evidence in grading its recommendation:

According to GRADE, only a weak recommendation could be made for the use of MBSR for FMS, mainly due to the small number of studies and low quality of evidence.

 Summary of main results.

The article presents a series of forest plots [How to read one] that graphically display the unambiguous results: weak effects of mindfulness in the short term and none in the long term. For instance:

[Forest plot: effects on pain, short- and long-term]

This meta-analysis found low quality evidence for small effects of MBSR on quality of life and pain intensity in patients with fibromyalgia syndrome, when compared to usual care control groups or active control groups. Effects however were not robust against bias. Finally, data on safety were not reported in any study.

 Agreements and disagreements with other systematic reviews.

The few other reviews of MBSR for fibromyalgia are of poor quality, so the authors of this review discuss their results in the context of the larger literature on MBSR for physical health problems.

 Implication for further research.

Too often, reviews of fashionable psychological interventions for health problems end with an obligatory positive assessment and, of course, “further research is needed.”

Enthusiasts assume that MBSR is good for whatever ails you: MBSR training can help you cope, even if it doesn’t actually address your physical health problem. I really liked that this review paused to reflect on why MBSR should be expected to be the treatment of choice, and on whether relevant process and outcome variables are being assessed.

Patients with fibromyalgia seek to relieve their debilitating pain and accompanying fatigue, or at least to resume some semblance of the normal life they have lost to their condition. It is important that results of MBSR research allow patients and providers to make informed decisions about whether it is worth the effort to get involved in MBSR, or whether it would simply be more burden with uncertain results.

One major implication for future research is that researchers should bear in mind that MBSR primarily aims to establish a mindful and accepting pain coping style rather than to reduce the intensity of pain or other complaints. Therefore researchers are encouraged to select custom outcomes such as awareness, acceptance or coping rather than intensity of symptom which might not reflect the intention of the intervention. Only two trials measured coping, however, only one of them actually reported results and the other one [47] did not provide data but stated that besides catastrophizing there were no significant group differences. Results of the trial by Grossmann et al. [49] on the other hand indicated significant improvements on several subscales, which could be worth further investigations.

Further high quality RCTs comparing MBSR to established therapies (e.g. defined drug treatment, cognitive behavioral therapy) are also required for the conclusive judgment.


This systematic review found low quality evidence for a small short term improvement of pain and quality of life after MBSR for fibromyalgia, when compared to usual care or active control interventions. No evidence was found for long-term effects.

Not much spin here, nor much basis yet for recommending MBSR for fibromyalgia as ready for implementation in routine care.


Probing an untrustworthy Cochrane review of exercise for “chronic fatigue syndrome”

Updated April 24, 2016, 9:21 AM US Eastern daylight time: An earlier version of this post had mashed together discussion of the end-of-treatment analyses with the follow-up analyses. That has now been fixed. The implications are even more serious for the credibility of this Cochrane review.

From my work in progress

worse than

My ongoing investigation so far has revealed that a 2016 Cochrane review misrepresents how the review was done and what was found in key meta-analyses. These problems are related to an undeclared conflict of interest.

The first author and spokesperson for the review, Lillebeth Larun, is also the first author on the protocol for a Cochrane review that has not yet been published.

Larun L, Odgaard-Jensen J, Brurberg KG, Chalder T, Dybwad M, Moss-Morris RE, Sharpe M, Wallman K, Wearden A, White PD, Glasziou PP. Exercise therapy for chronic fatigue syndrome (individual patient data) (Protocol). Cochrane Database of Systematic Reviews 2014, Issue 4. Art. No.: CD011040.

At a meeting organized and financed by PACE investigator Peter White, Larun obtained privileged access to data that the PACE investigators have spent tens of thousands of pounds to keep the rest of us from viewing. Larun used this information to legitimize outcome switching or p-hacking favorable to the PACE investigators’ interests. The Cochrane review misled readers about how some analyses crucial to its conclusions were conducted.

One of the crucial functions of Cochrane reviews is to protect policymakers, clinicians, researchers, and patients from the questionable research practices utilized by trial investigators to promote a particular interpretation of their results. This Cochrane review fails miserably in this respect. The Cochrane is complicit in endorsing the PACE investigators’ misinterpretation of their findings.

A number of remedies should be implemented. The first could be for Cochrane Editor in Chief and Deputy Chief Director Dr. David Tovey to call publicly for release, for independent reanalysis, of the PACE trial data behind The Lancet original outcomes paper and the follow-up data reported in Lancet Psychiatry.

Given the breach in trust with the readership of Cochrane that has occurred, Dr. Tovey should announce that the individual patient-level data used in the ongoing review will be released for independent re-analysis.

Larun should be removed from the Cochrane review that is in progress. She should recuse herself from further comment on the 2016 review. Her misrepresentations and comments thus far have tarnished the Cochrane’s reputation for unbiased assessment and correction when mistakes are made.

An expression of concern should be posted for the 2016 review.

The 2016 Cochrane review of exercise for chronic fatigue syndrome:

 Larun L, Brurberg KG, Odgaard-Jensen J, Price JR. Exercise therapy for chronic fatigue syndrome. Cochrane Database Syst Rev. 2016; CD003200.

It added only three studies that were not included in a 2004 Cochrane review of five studies:

Wearden AJ, Dowrick C, Chew-Graham C, Bentall RP, Morriss RK, Peters S, et al. Nurse led, home based self help treatment for patients in primary care with chronic fatigue syndrome: randomised controlled trial. BMJ 2010; 340 (1777):1–12. [DOI: 10.1136/bmj.c1777]

Hlavaty LE, Brown MM, Jason LA. The effect of homework compliance on treatment outcomes for participants with myalgic encephalomyelitis/chronic fatigue syndrome. Rehabilitation Psychology 2011;56(3):212–8.

White PD, Goldsmith KA, Johnson AL, Potts L, Walwyn R, DeCesare JC, et al. Comparison of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care for chronic fatigue syndrome (PACE): a randomised trial. The Lancet 2011; 377:611–90.

This blog post concentrates on subanalyses that are crucial to the conclusions of the 2016 review, reported on pages 68 and 69 as Analyses 1.1 and 1.2.

I welcome others to extend this scrutiny to other analyses in the review, especially those for the SF-36 (parallel Analyses 1.5 and 1.6).

Analysis 1.1. Comparison 1 Exercise therapy versus treatment as usual, relaxation or flexibility, Outcome 1 Fatigue (end of treatment).

The only subanalysis that involves new studies includes the Wearden et al. FINE trial, the White et al. PACE trial, and an earlier study, Powell et al. The meta-analysis gives 27.2% weight to Wearden et al. and 62.9% weight to White et al., or 90.1% to the pair.
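For readers unfamiliar with where such weights come from: in a fixed-effect meta-analysis, each study is weighted by the inverse of its variance, so the larger, more precise trials dominate the pooled estimate. A minimal sketch in Python, using made-up SMDs and standard errors (illustrative numbers, not this review’s data):

```python
import math

# Hypothetical (SMD, standard error) pairs for three trials --
# invented for illustration, not taken from the Cochrane review.
studies = [(-0.35, 0.12), (-0.23, 0.15), (-0.44, 0.20)]

# Inverse-variance weights: weight_i = 1 / se_i^2
weights = [1 / se ** 2 for _, se in studies]
total = sum(weights)
for (smd, _), w in zip(studies, weights):
    print(f"SMD {smd:+.2f}: weight {100 * w / total:.1f}%")

# Pooled estimate is the weighted average; its SE is sqrt(1 / sum of weights).
pooled = sum(w * smd for (smd, _), w in zip(studies, weights)) / total
se_pooled = math.sqrt(1 / total)
print(f"pooled SMD {pooled:.3f} "
      f"(95% CI {pooled - 1.96 * se_pooled:.3f} to {pooled + 1.96 * se_pooled:.3f})")
```

With these invented inputs, the most precise study carries half the total weight, which is how one or two trials can dominate a pooled effect the way the FINE and PACE trials do here.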

 Inclusion of the Wearden et al FINE trial in the meta-analysis

The Cochrane review evaluates risk of bias for Wearden et al. on page 49:

[Screenshot: the review’s risk-of-bias assessment of selective reporting for Wearden et al.]

This is untrue.

Cochrane used a ‘Likert’ scoring method (0,1,2,3), but the original Wearden et al. paper reports using the…

11 item Chalder et al fatigue scale,19 where lower scores indicate better outcomes. Each item on the fatigue scale was scored dichotomously on a four point scale (0, 0, 1, or 1).

This would seem a trivial difference, but this outcome switching will take on increasing importance as we proceed.

Based on a tip from Robert Courtney, I found the first mention of a re-scoring of the Chalder fatigue scale in the Wearden study in a BMJ Rapid Response:

 Wearden AJ, Dowrick C, Chew-Graham C, Bentall RP, Morriss RK, Peters S, et al. Nurse led, home based self help treatment for patients in primary care with chronic fatigue syndrome: randomised controlled trial. BMJ, Rapid Response 27 May 2010.

The explanation that was offered for the re-scoring in the Rapid Response was:

Following Bart Stouten’s suggestion that scoring the Chalder fatigue scale (1) 0123 might more reliably demonstrate the effects of pragmatic rehabilitation, we recalculated our fatigue scale scores.

“Might more reliably demonstrate…”? Where I come from, we call this outcome switching, p-hacking, a questionable research practice, or simply cheating.

In the original reporting of the trial, effects of exercise were not significant at follow-up. With the rescoring of the Chalder fatigue scale, these results now become significant.

A physician who suffers from myalgic encephalomyelitis (ME) – what both the PACE investigators and the Cochrane review term “chronic fatigue syndrome” – sent me the following comment:

I have recently published a review of the PACE trial and follow-up articles and according to the Chalder Fatigue Questionnaire, when using the original bimodal scoring I only score 4 points, meaning I was not ill enough to enter the trial, despite being bedridden with severe ME. After changing the score in the middle of the trial to Likert scoring, the same answers mean I suddenly score the minimum number of 18 to be eligible for the trial yet that same score of 18 also meant that without receiving any treatment or any change to my medical situation I was also classed as recovered on the Chalder Fatigue Questionnaire, one of the two primary outcomes of the PACE trial.

So according to the PACE trial, despite being bedridden with severe ME, I was not ill enough to take part, ill enough to take part and recovered all 3 at the same time …
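The arithmetic behind this paradox is easy to check. Here is a sketch in Python using a hypothetical pattern of answers to the 11 Chalder items (each answered 0–3); the responses are invented for illustration, not taken from the physician’s actual questionnaire:

```python
def bimodal(responses):
    # Original dichotomous scoring: "less/no more than usual" (0 or 1)
    # counts 0; "more/much more than usual" (2 or 3) counts 1. Range 0-11.
    return sum(1 for r in responses if r >= 2)

def likert(responses):
    # Rescored 0-1-2-3 "Likert" method. Range 0-33.
    return sum(responses)

# Hypothetical answers: four items "much more than usual" (3),
# six items "no more than usual" (1), one item "less than usual" (0).
answers = [3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 0]

print(bimodal(answers))  # 4  -> below the original bimodal entry threshold of 6
print(likert(answers))   # 18 -> meets the revised entry threshold (>= 18)
                         #       and the post-hoc recovery threshold (<= 18)
```

The same answer sheet is simultaneously “not fatigued enough to enter” under one scoring, and both “eligible” and “recovered” under the other.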

Yet according to Larun et al. there’s nothing wrong with the PACE trial.

Inclusion of the White et al PACE trial in the meta-analysis

Results of the Wearden et al. FINE trial were available to the PACE investigators when they performed the controversial switching of outcomes for their trial. This should be taken into account in interpreting Larun’s defense of the PACE investigators in response to a comment from Tom Kindlon. She stated:

 You particularly mention the risk of bias in the PACE trial regarding not providing pre-specified outcomes however the trial did pre-specify the analysis of outcomes. The primary outcomes were the same as in the original protocol, although the scoring method of one was changed and the analysis of assessing efficacy also changed from the original protocol. These changes were made as part of the detailed statistical analysis plan (itself published in full), which had been promised in the original protocol. These changes were drawn up before the analysis commenced and before examining any outcome data. In other words they were pre-specified, so it is hard to understand how the changes contributed to any potential bias.

I think that what we have seen here so far gives us good reason to side with Tom Kindlon versus Lillebeth Larun on this point.

Also relevant is an excellent PubMed Commons comment by Sam Carter, Exploring changes to PACE trial outcome measures using anonymised data from the FINE trial. His observations about the Chalder fatigue questionnaire:

White et al wrote that “we changed the original bimodal scoring of the Chalder fatigue questionnaire (range 0–11) to Likert scoring to more sensitively test our hypotheses of effectiveness” (1). However, data from the FINE trial show that Likert and bimodal scores are often contradictory and thus call into question White et al’s assumption that Likert scoring is necessarily more sensitive than bimodal scoring.

For example, of the 33 FINE trial participants who met the post-hoc PACE trial recovery threshold for fatigue at week 20 (Likert CFQ score ≤ 18), 10 had a bimodal CFQ score ≥ 6 so would still be fatigued enough to enter the PACE trial and 16 had a bimodal CFQ score ≥ 4 which is the accepted definition of abnormal fatigue.

Therefore, for this cohort, if a person met the PACE trial post-hoc recovery threshold for fatigue at week 20 they had approximately a 50% chance of still having abnormal levels of fatigue and a 30% chance of being fatigued enough to enter the PACE trial.

A further problem with the Chalder fatigue questionnaire is illustrated by the observation that the bimodal score and Likert score of 10 participants moved in opposite directions at consecutive assessments i.e. one scoring system showed improvement whilst the other showed deterioration.

Moreover, it can be seen that some FINE trial participants were confused by the wording of the questionnaire itself. For example, a healthy person should have a Likert score of 11 out of 33, yet 17 participants recorded a Likert CFQ score of 10 or less at some point (i.e. they reported less fatigue than a healthy person), and 5 participants recorded a Likert CFQ score of 0.

The discordance between Likert and bimodal scores and the marked increase in those meeting post-hoc recovery thresholds suggest that White et al’s deviation from their protocol-specified analysis is likely to have profoundly affected the reported efficacy of the PACE trial interventions.
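Carter’s point that the two scoring methods can move in opposite directions is also easy to reproduce. A sketch in Python with two invented response vectors (illustrative only, not FINE trial data):

```python
def bimodal(responses):
    # Original dichotomous scoring: items answered 2 or 3 count as 1. Range 0-11.
    return sum(1 for r in responses if r >= 2)

def likert(responses):
    # Rescored 0-1-2-3 method. Range 0-33.
    return sum(responses)

# Invented before/after answers for one respondent (11 items, each 0-3).
before = [2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0]
after  = [3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1]

print(bimodal(before), "->", bimodal(after))  # 6 -> 4: bimodal shows improvement
print(likert(before), "->", likert(after))    # 12 -> 19: Likert shows deterioration
```

Under one scoring the respondent improved; under the other, the very same answers show deterioration.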

Compare White et al.’s “more sensitively test our hypotheses” to Wearden et al.’s “might more reliably demonstrate…” explanation for switching outcomes.

A correction is needed to this assessment of risk of bias in the review for the White et al. PACE trial.

[Screenshot: the review’s risk-of-bias assessment for the White et al. PACE trial]

A figure on page 68 shows results of a subanalysis with the switched outcomes at the end of treatment.

[Forest plot: Analysis 1.1, fatigue at end of treatment]

This meta-analysis concludes that exercise therapy produced an almost 3-point drop in fatigue on the rescored Chalder scale at the end of treatment.

Analysis 1.2. Comparison 1 Exercise therapy versus treatment as usual, relaxation or flexibility, Outcome 2 Fatigue (follow-up).

A table on page 69 shows results of a subanalysis with the switched outcomes at follow up:

[Analysis 1.2: fatigue at follow-up]

This meta-analysis depends entirely on the revised scoring of the Chalder fatigue scale and on the FINE and PACE trials. It suggests that the three-point drop in fatigue persists at follow-up.

But Cochrane should have stuck with the primary outcomes specified in the original trial registrations. That would have been consistent with what Cochrane usually does, what it says it did here, and what its readers expect.

Readers were not at the meeting that the PACE investigators financed and cannot get access to the data on which the Cochrane review depends. So they depend on Cochrane as a trusted source.

I am sure the results would be different if the expected and appropriate procedures had been followed. Cochrane should alert readers with an Expression of Concern until the record can be corrected or the review retracted.

 Now what?

Is it too much to ask that Cochrane get out of bed with the PACE investigators?

What would Bill Silverman say? Rather than speculate about someone whom neither Dr. Tovey nor I have ever met, I ask Dr. Tovey: “What would Lisa Bero say?”


My response to an invitation to improve the Cochrane Collaboration by challenging its policies

I interpret a recent Cochrane Community Blog post as inviting me to continue criticizing the Collaboration’s conflict of interest in the evaluation of “chronic fatigue syndrome” with the intent of initiating further reflection on its practices and change.

Cochrane needs to

  • Clean up conflicts of interest in its systematic reviews.
  • Issue a Statement of Concern about a flawed and conflicted review of exercise for chronic fatigue syndrome.

I will leave for a future blog post the argument that Cochrane needs to take immediate steps to get the misnamed “chronic fatigue syndrome” out of its Common Mental Disorders group. The colloquialism throws together the highly prevalent complaint of tiredness in primary care with the less common, but more serious, myalgic encephalomyelitis, which is recognized by the rest of the world as a medical condition, not a mental disorder.

But I think this blog post calls attention to enough that needs to change now.

The invitation from the Cochrane Community Blog to criticize its policies

I had a great Skype conference with Dr. David Tovey, Cochrane Editor in Chief and Deputy Chief Director. I’m grateful for his reaching out and his generous giving of his time, including reading my blog posts ahead of time.

In the email setting up the conversation, Dr. Tovey stated that Cochrane has a tradition of encouraging debate and that he believes criticism helps them to improve. That is something he is very keen to foster.

Our conversation was leisurely and wide-ranging. Dr. Tovey lived up to the expectations established in his email. He said that he was finishing up a blog post in response to issues that I and others had raised. That blog post is now available here. It leads off with:

 I didn’t know Bill Silverman, so I can’t judge whether he would be “a-mouldering in his grave”. However, I recognise that James Coyne has set down a challenge to Cochrane to explain its approach to commercial and academic conflicts of interest and also to respond to criticisms made in relation to the appraisal of the much debated PACE study.

Dr. Tovey closed his blog post with:

 Cochrane is not complacent. We recognise that both we and the world we inhabit are imperfect and that there is a heavy responsibility on us to ensure that our reviews are credible if they are to be used to guide decision making. This means that we need to continue to be responsive and open to criticism, just as the instigators of the Bill Silverman prize intended, in order “to acknowledge explicitly the value of criticism of The Cochrane Collaboration, with a view to helping to improve its work.”

As a member of a group of authors who received the Bill Silverman prize, I interpret Dr. Tovey’s statement as an invitation to improve the Cochrane Collaboration by instigating and sustaining a discussion of its handling of conflicts of interest in reviews of the misnamed “chronic fatigue syndrome.”

I don’t presume that Dr. Tovey will personally respond to all of my efforts. But I will engage him and hope that my criticisms and concerns will be forwarded to appropriate deliberative bodies and receive wider discussion within the Cochrane.

For instance, I will follow up on his specific suggestion by filing a formal complaint with the Funding Arbiters and Arbitration Panel concerning a review and protocol with Lillebeth Larun as first author.

 A flawed and conflicted Cochrane systematic review

 There are numerous issues that remain unresolved in a flawed and conflicted recent Cochrane systematic review:

 Larun L, Brurberg KG, Odgaard-Jensen J, Price JR. Exercise therapy for chronic fatigue syndrome. Cochrane Database Syst Rev. 2016; CD003200.

As well as a protocol for a future review:

Larun L, Odgaard-Jensen J, Brurberg KG, Chalder T, Dybwad M, Moss-Morris RE, Sharpe M, Wallman K, Wearden A, White PD, Glasziou PP. Exercise therapy for chronic fatigue syndrome (individual patient data) (Protocol). Cochrane Database of Systematic Reviews 2014, Issue 4. Art. No.: CD011040.

I’m pleased that Dr. Tovey took a stand against the PACE investigators and Queen Mary University, London. He agreed that sharing patient-level data for a Cochrane review on which they were authors should not be used as an excuse to avoid sharing data with others.

 Another issue raised by Coyne has also been raised with me in personal correspondence: namely the perceived use of Cochrane as a rationale for withholding clinical trials data at the level of individual patients from other individuals and organisations. Cochrane is a strong supporter and founding member of the AllTrials initiative and is committed to clinical trials transparency. Cochrane does not believe that sharing data with its researchers is an appropriate rationale for withholding the data from alternative researchers. Each application must be judged independently on its merits. Cochrane has issued a public statement that details our position on access to trial data.

I hope that Dr. Tovey’s sentiment was formally communicated to the Tribunal deliberating the PACE investigators’ appeal of a decision by the UK Information Commissioner that the trial data must be released to someone who had requested it.

I also hope that Dr. Tovey and the Cochrane recognize the implications of the PACE investigators thus far being willing to share their data only when they have authorship, and therefore some control over the interpretation of their data. As Dr. Tovey notes, simply providing data does not meet the conditions for authorship:

 It is also important that all authors within a review team meet the requirements of the International Committee of Medical Journal Editors (ICMJE) in relation to authorship.

These requirements mean that all authors must approve the final version of the manuscript before it is submitted. This allows the PACE investigators to control the conclusions of the systematic review so that they support the promotion of cognitive behavior therapy and graded exercise therapy as the most evidence-supported treatments for chronic fatigue syndrome.

A favorable evaluation by Cochrane will greatly increase the value of the PACE group’s consultations, including recommendations that disabled persons be denied benefits if they do not participate in these “best-evidence” interventions.

I’m pleased that Dr. David Tovey reiterated Cochrane’s strong position that disclosure of conflicts of interest is necessary but not sufficient to ensure the integrity of systematic reviews:

 Cochrane is still fairly unusual within the journal world in that it specifies that in some cases declaration of interests is necessary but insufficient, and that there are individuals or groups of researchers who are not permitted to proceed with a given systematic review.

Yet I’m concerned that, in considering the threat of disclosed and undisclosed conflicts of interest, Dr. Tovey and Cochrane focus narrowly on Pharma and medical device manufacturers, to the exclusion of other financial ties, such as those to the large disability re-insurance industry:

 Within the 2014 policy it was made explicit that review authors could not be employed by pharmaceutical companies, device manufacturers or individuals that were seeking or holding a patent relevant to the intervention or a comparator product. Furthermore, in all cases, review author teams are required to have a majority of non-conflicted authors and the lead author should also be non-conflicted. The policy is available freely.

Cochrane apparently lacks an appreciation of the politics and conflicts of interest of the PACE trial. The trial has the unusual, if not unique, distinction of being a psychotherapy trial funded in part by the UK Department for Work and Pensions, which had a hand in its design. It’s no accident that the PACE investigators include paid consultants to the re-insurance industry. For more on this mess, see The Misleading Research at the Heart of Disability Cuts.

It also doesn’t help that the PACE investigators routinely fail to declare conflicts of interest. They failed to disclose their conflicts of interest to patients being recruited for the study. They again declared no conflicts of interest in a protocol for another systematic review, until they were caught.

Dr. Tovey goes on to state:

Authors of primary studies should not extract data from their own study or studies. Instead, another author(s) or an editor(s) should extract these data, and check the interpretation against the study report and any available study registration details or protocol.

The Larun et al. systematic review of graded exercise therapy violates this requirement. The meta-analyses forming the basis of the review are not reproducible from the published registrations, original protocols, and findings of the original studies.

Dr. Tovey is incorrect on one point:

 James Coyne states that Lillebeth Larun is employed by an insurance company, but I am unclear on what basis this is determined. Undeclared conflicts of interest are a challenge for all journals, but when they are brought to our attention, they need to be verified. In any case, within Cochrane it would be a matter for the Funding Arbiters and Arbitration Panel to determine whether this was a sufficiently direct conflict to disbar her from being first author of any update.

I can’t find anywhere that I have said that Lillebeth Larun is employed by an insurance company. But I did say that she has undeclared conflicts of interest.  These echo in her distorted judgments and defensive responses to criticisms of decisions made in the review that favor the PACE investigators’ vested interest.

Accepting Dr. Tovey’s suggestion, I’ll be elaborating my concerns in a formal complaint to Cochrane’s Funding Arbiters and Arbitration Panel. But here is a selection of what I previously said:

Larun dismisses the risk of bias associated with the investigators not sticking to the primary outcomes in their original protocol. She suggested deviations from these outcomes were specified before analyses commenced. However, this was an unblinded trial and the investigators could inspect incoming data. In fact, they actually sent out a newsletter to participants giving testimonials about the benefits of the trial while they were still recruiting patients. Think of it: if someone with ties to the pharmaceutical industry could peek at incoming data and make changes to designate outcomes, wouldn’t that be a high risk of bias? Of course.

Larun was responding to an excellent critique of the published review by Tom Kindlon, which you can find here.

Other serious problems with the review are hidden from the casual reader. In revising the primary outcomes specified in their original proposal, the PACE investigators had access to the publicly available data from the sister FINE trial (Wearden, 2010).

 Wearden AJ, Dowrick C, Chew-Graham C, Bentall RP, Morriss RK, Peters S, Riste L, Richardson G, Lovell K, Dunn G; Fatigue Intervention by Nurses Evaluation (FINE) trial writing group and the FINE trial group. Nurse led, home based self help treatment for patients in primary care with chronic fatigue syndrome: randomised controlled trial. BMJ. 2010 Apr 23;340:c1777. doi: 10.1136/bmj.c1777.

These data from the FINE trial clearly indicated that the existing definition of the primary outcomes in the PACE trial registration would likely not provide evidence of the efficacy of cognitive behavior or graded exercise therapy. Not surprisingly, the PACE investigators revised their scoring of primary outcomes.

Moreover, the Larun et al review misrepresents how effect sizes for the FINE trial were calculated. The review wrongly claimed that only protocol-defined and published data or outcomes were used for analysis of the Wearden 2010 study.

Robert Courtney documents in a pending comment that the review relied on an alternative unpublished set of data. As Courtney points out, the differences are not trivial.

Yet, the risk of bias table in the review for the Wearden study states:

[Image: risk of bias table entry for Wearden 2010, “selective reporting”]

Financial support for a meeting between Dr. Lillebeth Larun and PACE investigators

The statement of funding for the 2014 protocol indicates that Peter White financed meetings at Queen Mary University in 2013. If this were a Pharma-supported 2016 systematic review, wouldn’t Larun have to disclose a conflict of interest for attendance at the 2013 meeting sponsored by PACE investigators?

Are these meetings the source of the acknowledgment in the 2016 systematic review?

We would like to thank Peter White and Paul Glasziou for advice and additional information provided. We would also like to thank Kathy Fulcher, Richard Bentall, Alison Wearden, Karen Wallman and Rona Moss-Morris for providing additional information from trials in which they were involved.

The declared conflicts of interest of the PACE investigators in The Lancet paper constitute a high risk of bias. I am familiar with this issue because our article, which won the Bill Silverman Award, highlighted how authors’ conflicts of interest are associated with exaggerated estimates of efficacy. The award to us was premised on our article having elicited a change in Cochrane policy. My co-author Lisa Bero wrote an excellent follow-up editorial for Cochrane on this topic.

This is a big deal and action is needed

Note that this 2016 systematic review considered only three new studies that were not included in the 2004 review. So, the misrepresentations and incorrect calculations of effect sizes for the two added trials, PACE and FINE, are decisive.

As it stands, the Larun et al. Cochrane review is an unreliable summary of the literature concerning exercise for “chronic fatigue syndrome.” Policymakers, clinicians, and patients should be warned. It serves the interests of politicians and re-insurance companies, and the declared and undeclared interests of the PACE investigators.

I would recommend that Dr. Lillebeth Larun recuse herself from further commentary on the 2016 systematic review until complaints about her conflicts of interest and the unreproducibility of the review are resolved. Cochrane should also publish an Expression of Concern about the review, detailing the issues that have been identified here.

Stay tuned for a future blog post concerning the need to move “chronic fatigue syndrome” out of the Cochrane Common Mental Disorders group.



Why the Cochrane Collaboration needs to clean up conflicts of interest

  • A recent failure to correct a systematic review and meta-analysis demonstrates that Cochrane’s problem with conflicts of interest is multilayered.
  • Cochrane enlists thousands of volunteers committed to the evaluation of evidence independent of the interests of the investigators who conducted the trials.
  • Cochrane is vigilant in requiring declaration of conflicts of interest but is inconsistent in policing their influence on reviews.
  • Cochrane has a mess to clean up.

A recent interview of John Ioannidis by Retraction Watch expressed concern about Cochrane’s tainting by conflicts of interest:

RW: You’re worried that Cochrane Collaboration reviews — the apex of evidence-based medicine — “may cause harm by giving credibility to biased studies of vested interests through otherwise respected systematic reviews.” Why, and what’s the alternative?

JI: A systematic review that combines biased pieces of evidence may unfortunately give another seal of authority to that biased evidence. Systematic reviews may sometimes be most helpful if, instead of focusing on the summary of the evidence, highlight the biases that are involved and what needs to be done to remedy the state-of-the-evidence in the given field. This often requires a bird’s eye view where hundreds and thousands of systematic reviews and meta-analyses are examined, because then the patterns of bias are much easier to discern as they apply across diverse topics in the same or multiple disciplines. Much of the time, the solution is that, instead of waiting to piece together fragments of biased evidence retrospectively after the fact, one needs to act pre-emptively and make sure that the evidence to be produced will be clinically meaningful and unbiased, to the extent possible. Meta-analyses should become primary research, where studies are designed with the explicit anticipation that they are part of an overarching planned cumulative meta-analysis.

The key points were that (1) Retraction Watch raised with John Ioannidis the concern that evidence-based medicine has been hijacked by special interests; (2) RW specifically asked about the harm caused by the Cochrane Collaboration lending undue credibility to studies biased by vested interests; and (3) Ioannidis replied that instead of focusing on summarizing the evidence, Cochrane should highlight biases and point to what needs to be done to produce trustworthy, clinically meaningful, and unbiased assessments.

A recent exchange of comments about a systematic review and meta-analysis demonstrates the problem.

Larun L, Brurberg KG, Odgaard-Jensen J, Price JR. Exercise therapy for chronic fatigue syndrome. Cochrane Database Syst Rev. 2016; CD003200.

The systematic review is behind a paywall. That is particularly unfortunate because the persons producing systematic reviews undergo extensive training and then work for free. The fruits of their labor in identifying the best evidence should be available around the world, for free. But to see their work, one has to either go through a university library or pay a fee to the for-profit Wiley.

An abridged version of the review is available here:

Larun L, Odgaard-Jensen J, Price JR, Brurberg KG. An abridged version of the Cochrane review of exercise therapy for chronic fatigue syndrome. European Journal of Physical and Rehabilitation Medicine. 2015 Sep.

To get around the paywall of the full review, the commentator, Tom Kindlon, cleverly reposted his comment at PubMed Commons, where everybody can access it for free:

In his usual polite style, Mr. Kindlon opens by thanking the authors of the systematic review and closes by thanking them for reading his comments. In between, he makes a number of interesting points before getting to the following:

“Selective reporting (outcome bias)” and White et al. (2011)

I don’t believe that White et al. (2011) (the PACE Trial) (3) should be classed as having a low risk of bias under “Selective reporting (outcome bias)” (Figure 2, page 15). According to the Cochrane Collaboration’s tool for assessing risk of bias (21), the category of low risk of bias is for: “The study protocol is available and all of the study’s pre-specified (primary and secondary) outcomes that are of interest in the review have been reported in the pre-specified way”. This is not the case in the PACE Trial. The three primary efficacy outcomes can be seen in the published protocol (22). None have been reported in the pre-specified way. The Cochrane Collaboration’s tool for assessing risk of bias states that a “high risk” of bias applies if any one of several criteria are met, including that “not all of the study’s pre-specified primary outcomes have been reported” or “one or more primary outcomes is reported using measurements, analysis methods or subsets of the data (e.g. subscales) that were not pre-specified”. In the PACE Trial, the third primary outcome measure (the number of “overall improvers”) was never published. Also, the other two primary outcome measures were reported using analysis methods that were not pre-specified (including switching from the bimodal to the Likert scoring method for The Chalder Fatigue Scale, one of the primary outcomes in your review). These facts mean that the “high risk of bias” category should apply.

I’m sure John Ioannidis would be pleased with Kindlon raising this point.

In order to see the response from the author of the systematic review, one has to get behind the paywall. If you do, you can see that Lillebeth Larun reciprocates Kindlon’s politeness and agrees that some of his points should be reflected in future research, but takes issue with a key one. I haven’t asked him, but I don’t think John Ioannidis would be happy with her response:

Selective reporting (outcome bias)

The Cochrane Risk of Bias tool enables the review authors to be transparent about their judgments, but due to the subjective nature of the process it does not guarantee an indisputable consensus. You particularly mention the risk of bias in the PACE trial regarding not providing pre-specified outcomes however the trial did pre-specify the analysis of outcomes. The primary outcomes were the same as in the original protocol, although the scoring method of one was changed and the analysis of assessing efficacy also changed from the original protocol. These changes were made as part of the detailed statistical analysis plan (itself published in full), which had been promised in the original protocol. These changes were drawn up before the analysis commenced and before examining any outcome data. In other words they were pre-specified, so it is hard to understand how the changes contributed to any potential bias. The relevant paper also alerted readers to all these changes and gave the reasons for them. Overall, we don’t think that the issues you raise with regard to the risk of selective outcome bias are such as to suspect high risk of bias, but recognize that you may reach different conclusions than us.

I strongly take issue and see conflicts of interest rearing their ugly heads at a number of points.

  1. One can’t dismiss application of the Cochrane Risk of Bias tool as simply subjective and then say whatever one wants to say. The tool has well-specified criteria, and persons completing a review have to be trained to consensus. One of the key reasons that a single author can’t conduct a proper Cochrane review is that it requires a trained team to agree on ratings of risk of bias. That’s one of the many checks and balances built into a systematic review.

Fortunately, Cochrane provides an important free chapter as a guide. Lots of people who conduct systematic reviews and meta-analyses without being members of Cochrane nonetheless depend on the materials the collaboration has developed, because those materials are so clear, authoritative, and transparent in terms of the process by which they were developed.

  2. Largely as a result of our agitation,* the sixth of six risk of bias items (other bias) assesses whether the investigators of a particular trial have a conflict of interest. The authors of the trial in question had strong conflicts of interest, including paid and voluntary work for an insurance company and as assessors of disability eligibility. Ioannidis would undoubtedly consider this a high risk of bias.
  3. Larun dismisses the risk of bias associated with the investigators not sticking to the primary outcomes in their original protocol. She suggested deviations from these outcomes were specified before analyses commenced. However, this was an unblinded trial and the investigators could inspect incoming data. In fact, they actually sent out a newsletter to participants giving testimonials about the benefits of the trial while they were still recruiting patients. Think of it: if someone with ties to the pharmaceutical industry could peek at incoming data and make changes to designate outcomes, wouldn’t that be a high risk of bias? Of course.
  4. But it gets worse. Larun is a co-author with the investigators of the trial on another Cochrane protocol.

Larun L, Odgaard-Jensen J, Brurberg KG, Chalder T, Dybwad M, Moss-Morris RE, Sharpe M, Wallman K, Wearden A, White PD, Glasziou PP.  Exercise therapy for chronic fatigue syndrome (individual patient data) (Protocol).  Cochrane Database of Systematic Reviews 2014, Issue 4. Art. No.: CD011040.

  5. And one of the authors of the systematic review under discussion is a colleague in the department of the trial investigators.

How does Cochrane define conflict of interest?

I’m a member of Cochrane, and so I am required to complete a yearly assessment of potential conflicts of interest. My report is kept on file by the collaboration but not necessarily directly available to the public. You can download a PDF of the evaluation and an explanation here.

As you can see, Cochrane staff and reviewers need to disclose (1) the financing of their review; (2) relevant financial activities outside the submitted work; (3) intellectual property such as patents, copyrights, and royalties; and (4) other relationships, with the instructions:

Use this section to report other relationships or activities that readers could perceive to have influenced, or that give the appearance of potentially influencing, what you wrote in the submitted work.

The conflicts of interest of Lillebeth Larun

A co-author of Lillebeth Larun on the systematic review under discussion is a colleague in the department of the investigators whose trial is being evaluated. Larun is a co-author on another protocol with these investigators. Examination of the acknowledgments of that protocol indicates that the investigators provided both data and funding for meetings:

The author team held three meetings in 2011, 2012 and 2013 which were funded as follows:

  • 2011 via Paul Glasziou, NIHR senior research fellow fund, Oxford Department of Primary Care.
  • 2012 via Hege R Eriksen, Uni Research Health, Bergen.
  • 2013 via Peter D White’s academic fund (Professor of Psychological Medicine, Centre for Psychiatry, Wolfson Institute of Preventive Medicine, Barts and The London School of Medicine and Dentistry, Queen Mary University of London).

So, both the systematic review under discussion and the other protocol were conducted among “families and friends.” In dismissing concerns about risk of bias for a particular trial, Lillebeth Larun is ignoring the obvious strong bias favoring her associates.

She has no business conducting this review nor dismissing the high risk of bias of inclusion of their study.

So, what is going on here?

Peter White and the PACE investigator team are attempting to break down the checks and balances that a systematic review imposes on the interpretation of clinical trial results. That interpretation should be independent of the investigators who generated a trial and should take their conflicts of interest into account. The PACE investigators had a conflict of interest when they generated the data, and now they want to control the interpretation so that it comes out in favor of their interests.

Some PACE investigators have ties to insurance companies, and they want the results to fit the needs of these companies. Keep in mind that the insurance companies don’t necessarily care whether treatments work. Their intent is to require participation in treatment as a condition for receiving disability payments and to exclude disabled persons who decline treatment.

The Cochrane Collaboration takes conflict of interest seriously

A statement by the two editors heading the Cochrane Bone, Joint and Muscle Trauma Group is quite quotable about the threat that involvement of investigators of the original trials poses to the integrity of systematic reviews.

Handoll H, Hanchard N. From observation to evidence of effectiveness: the haphazard route to finding out if a new intervention works. Cochrane Database of Systematic Reviews. 2014 Jan 1.

They state:

We feel [this] should become a cardinal rule: the need to separate the clinical evaluation of innovations from their innovators, who irrespective of any of their endeavours to be ‘neutral’ have a substantial investment, whether emotional, perhaps financial, or in terms of professional or international status, in the successful implementation of their idea.

Disclosure of conflicts of interest may be insufficient to mitigate the effects:

The reporting of financial conflicts of interest in systematic reviews may not be sufficient to mitigate the effects of industry affiliations, and further measures may be necessary to ensure that industry collaborations do not compromise the scientific evidence.

Although these editors are concerned about pharmaceutical companies, their comments apply equally to other conflicts. In the case of the systematic review at hand, the investigators of the original trial have financial conflicts and collaborations with the spokeswoman and first author of the review. She has additional conflicts associated with their co-authoring and funding of another systematic review protocol.

I believe that if Cochrane is intent on restoring its credibility, not only does it need to clean up this mess of layered conflicts of interest, it should also investigate how the mess came about and how it can be avoided in the future.

I’ve already written to the collaboration and I await the response.



*Our article in The BMJ that won the Bill Silverman award specifically recommended:

…That the Cochrane Collaboration reconsider its position that trial funding and trial author-industry financial ties not be included in the risk of bias assessment. The 2008 version of the Cochrane handbook listed “inappropriate influence of funders” (section (for example, data owned by industry sponsor) as a potential source of bias that review authors could optionally incorporate in the “other sources of bias” domain of the Cochrane risk of bias tool.37 The 2011 version of the handbook, however, argues that “vested interests” should not be included in the risk of bias assessment, which “should be used to assess specific aspects of methodology that might have been influenced by vested interests and which may lead directly to a risk of bias” (section As previously noted,22 empirical criteria are generally used to select items (for example, sequence generation, blinding) that are included in assessments of risk of bias,38 48 including evidence of a mechanism, direction, and likely magnitude of bias. Empirical data show that trial funding by pharmaceutical companies and trial author-industry financial ties are associated with a bias towards positive results even when controlling for other study characteristics6 8 49 50 and, thus, meet these criteria. One concern might be that including conflicts of interest from included trials in the risk of bias assessment could result in “double counting” of potential sources of bias. However, ratings in the risk of bias table are not summed to a single score, and inclusion of risk of bias from conflicts of interest could reflect mechanisms through which industry involvement can influence study outcomes6 that are not fully captured by the current domains of the risk of bias tool (random sequence generation, allocation concealment, blinding of participants and staff, blinding of outcome assessment, incomplete outcome data, selective reporting, and other sources of bias).
Furthermore, even if all relevant mechanisms were to be assessed, the degree of their influence may not be fully captured when reviewers only have access to the relatively brief descriptions of trial methods that are provided in most published reports. Inclusion of conflicts of interest from included trials in the risk of bias assessment would encourage a transparent assessment of whether industry funded trials and independently conducted trials reach similar conclusions. It would also make it explicit when an entire area of research has been funded by industry and would benefit from outside scrutiny.

How to (re)create the illusion that psychotherapy extends the life of cancer patients

  • I provide a quick analysis of a story summarizing a peer-reviewed paper; the story did not encourage me to take a look at the paper.
  • Life is too short, and there is just so much dubious stuff out there, to devote much time to tracing claims that don’t pass a first screen.

The British Psychological Society Digest recently flagged one of its stories as its most accessed in some unspecified period of time. I gave it a look. The story aroused skepticism and discouraged me from looking further at the target article, which appeared in Psychology & Health.

Oh, P., Shin, S., Ahn, H., & Kim, H. (2016). Meta-analysis of psychosocial interventions on survival time in patients with cancer. Psychology & Health, 31(4), 396-419. DOI: 10.1080/08870446.2015.1111370

P & H is not a journal where I would expect to find a high-quality meta-analysis. Sorry, but I think busy readers need to consider where articles are published and decide whether it is worth the trouble to access them, particularly when they are behind a paywall.

Here is my brief analysis of the BPS Digest story. I welcome readers to compare and contrast what is said in the Digest with what I have said in a previous blog post, as well as another post discussing how well-designed negative studies get left out of reviews. The conclusions of my 2007 systematic review have not been contradicted by more recent studies; I recommend it as well. This is not an area of research where new findings are expected to unseat what we conclude from older studies.

I would welcome comments about either the BPS story or the Psychology & Health article for the comparison/contrast with my blog posts and systematic review linked below.

I also invite speculation as to why the myth of psychotherapy and support groups extending the lives of cancer patients so resists being laid to rest.

A quick look at ‘Can psychosocial interventions extend the lives of cancer patients?’

Not only is the evidence for the benefits of psychosocial interventions extremely mixed, but some cancer patients and their relatives have understandably railed against the “cruel” suggestion that they might live longer if only they looked on the bright side and didn’t get so stressed.

Hmm, by emphasizing that it is “cancer patients and their relatives” who object, does that mean we should take the objections less seriously and ignore the many scientists who are also objecting? And what of the suggestion that claims that psychotherapy extends the lives of cancer patients ultimately lead to blaming patients for succumbing to what is a debilitating and often fatal disease?

The researchers, led by P.J. Oh at Sahmyook University, found over 4,000 studies that looked promising, published between 1966 and 2014. However, once the researchers included only those papers that were randomised controlled trials and that included interventions delivered by professionals and had data on patient survival times, they were left with just 15 suitable studies conducted in five different countries and involving a total of 2940 participants with an average age of 52 years. The studies involved different types of intervention including psychoeducational programmes, CBT and supportive-expressive groups (a form of psychodynamic psychotherapy). The patients in the studies had a range of different cancers at different stages, including breast cancer, gastrointestinal cancer and melanoma.

Hmm, the problem with this small number of studies is the great heterogeneity in the interventions and patient populations, suggesting that combining them in a meta-analysis is not appropriate: it is unclear to which interventions and which populations the summary effect size would generalize. Furthermore, many of the RCTs the authors selected did not have extension of life as a primary outcome, and there is a strong bias in which studies had follow-up data collected.

Looking at all the data from all 15 studies, there was no evidence that psychosocial interventions prolong the lives of cancer patients. However, because of the huge variation between the studies in terms of the interventions and the types of patient, the researchers also broke down the evidence into sub-categories and here the picture was more promising.

Hmm, a meta-analysis that finds no evidence for an effect on survival simply confirms past systematic reviews. Why continue post hoc, particularly when the numbers of studies in these “subcategories” are of necessity so small?

For example, by excluding six studies that had exclusively involved patients with late-stage cancer, the researchers found that psychosocial interventions reduced the likelihood of patients dying during the course of the study (follow up times varied from one to 12 years) by 27 per cent, on average. “Stress reduction, if that is the causal mechanism, may have to occur earlier to achieve positive results,” the researchers said.

Hmm, the idea that “stress reduction” is a causal mechanism is a big leap, given that this is a small post hoc analysis of a literature that has previously been evaluated as showing no effect on survival. If there is no effect size to explain, no explanation is needed.

Other details to emerge included the finding that a positive benefit of psychosocial interventions was only apparent for studies involving patients with gastrointestinal cancer, although there was too little data to speculate as to whether this finding is meaningful or a chance result. Comparing the different types of intervention, the strongest positive evidence was for one-on-one programmes compared with groups, and for psychoeducational approaches delivered by medical doctors and nurses, as opposed to psychologists or other non-medical professionals.

Hmm, now we are really getting far out in focusing on results only for studies of patients with gastrointestinal cancer. Is there anything here to explain that is not a post hoc rationalization? Why would we expect results only for this particular kind of cancer?

Psychoeducational interventions involve health education, coping skills training, stress management and psychological support, and the researchers speculated their benefit might arise through a mixture of reducing patients’ distress, encouraging healthy behaviours and treatment compliance. “In addition, supportive social relationships might buffer the effects of cancer-related stress on immunity and thereby facilitate the recovery of immune mechanisms and may be important for cancer resistance,” they said.

Hmm, I think I recognize that study.

My colleagues and I thoroughly debunked its claims, which did not show up in simple analyses but emerged only in dubious multivariate analyses. The authors declined the journal’s invitation to respond to my criticism. I then requested the data from the authors to independently validate their conclusions. The authors refused, and the funding agency, the National Cancer Institute, said it had no ability to enforce data sharing, but that’s a story for another blog post. I am further confident that recovery of immune function through psychological intervention has never been shown to increase survival. The claims depend on cherry-picking the most promising results from a broad array of measures of immune functioning, in a context where it is just as easy to obtain many measures as one. Particular positive results are not replicated across studies. There is no evidence linking changes in these immune measures to changes in survival. We are dealing with unvalidated claims about a surrogate outcome.

Critics may question whether it is reasonable to combine results from such varied studies as was done in the current meta-analysis, and the researchers acknowledged that many of the studies were not as methodologically robust as they should be. However, they ended their review on a positive note: “a tentative conclusion can be reached,” they said, “that psychosocial interventions offered at early stage may provide enduring late benefits and possibly longer survival.”

Hmm, this sounds dismissive of the critics. Who are they? What is their evidence? Why should this review end on such a positive note, rather than giving up on a literature that has consistently generated null findings? Why doesn’t the author of this story provide the names of the critics and allow readers to evaluate the contrasting conclusions?

PLOS One allows authors of an experimercial undeclared conflicts of interest and restrictions on access to data

  • While checking what PLOS One had done to address my complaints about authors’ repeated undeclared conflicts of interest, I made some troubling discoveries.
  • The PLOS One Academic Editor for one of the papers was from Harvard Medical School, the same institution as the offending authors.
  • PLOS One had agreed to absurd restrictions on the availability of data, essentially protecting these authors from independent investigation of their claims.
  • The agreement with these authors serves as a model for others’ defiant refusal to share data as a condition for publishing in PLOS One, including, obviously, the PACE investigators.
  • As a modest and symbolic protest, I am suspending my activities as a PLOS One Academic Editor until a reasonable response to all of these issues is received, including my request for the PACE data, which has been languishing for over 100 days.

Background to the story

On October 25, 2015 I tweeted

[October 25 tweet about undisclosed conflicts of interest]

PLOS picked up on it immediately and emailed me that they would investigate. I elaborated my concerns in a reply:

Publication of the PLOS article was coordinated with advertisements for the resiliency program. The PLOS article, which is not a randomized trial but an observational study, is misinterpreted in places as if it were a controlled RCT. It is effectively an experimercial [1, go to bottom of post for footnote] for the program and is exploited as such in the advertisements.

My previously raising this issue about Triple P parenting programs in another context elicited over 50 corrections and errata notices and changes in some journal policies. BMC Medicine has offered me an opportunity to write about conflicts of interest in nonpharmacological trials. I intend to cite this paper as an example.

I believe that if reviewers and readers were prominently warned about a potential conflict of interest, they would be more vigilant to overinterpretation of an observational study with serious selection bias and would qualify their own independent evaluation of the study.

On February 5, 2016 PLOS sent me an email informing me of the results of their preliminary investigation, assuring me that conflict of interest statements would be attached to the articles in question, and providing some preliminary wording.

On March 3, 2016, I checked the articles and found that no conflict of interest statements had been attached yet. But what I saw was truly alarming and suggested more serious problems. Hence, this blog post.

One of the articles

Gotink RA, Chu P, Busschbach JJV, Benson H, Fricchione GL, et al. (2015) Standardised Mindfulness-Based Interventions in Healthcare: An Overview of Systematic Reviews and Meta-Analyses of RCTs. PLOS ONE 10(4): e0124344. doi: 10.1371/journal.pone.0124344

“Overviews” of systematic reviews and meta-analyses are often less than the sum of their parts. They are subject to bias, particularly when their authors have products to sell. I have previously discussed this. I have been an activist for treating systematic reviews conducted by authors with conflicts of interest quite skeptically, particularly when the conflicts are undisclosed. This particular review concluded with a rousing and unjustified endorsement of the products that its authors offer to patients and to other healthcare systems and providers. Big bucks are involved.

The evidence supports the use of MBSR and MBCT to alleviate symptoms, both mental and physical, in the adjunct treatment of cancer, cardiovascular disease, chronic pain, depression, anxiety disorders and in prevention in healthy adults and children.

The authors have declared that no competing interests exist, but one of the authors profits from an Institute that bears his name at Harvard.

Herbert Benson Benson-Henry Institute for Mind Body Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, United States of America

The following declaration of conflict of interest would be a modest and incomplete acknowledgment that these authors have big bucks at stake.

The following authors hold or have held positions at the Benson-Henry Institute for Mind Body Medicine at Massachusetts General Hospital, which is paid by patients and their insurers for running the SMART-3RP and related relaxation/mindfulness clinical programs, markets related products such as books, DVDs, CDs and the like, and holds a patent pending (PCT/US2012/049539 filed August 3, 2012) entitled “Quantitative Genomics of the Relaxation Response”

I went to the article at the PLOS One website to check what had been corrected. I found nothing. I noticed that the Academic Editor for this article was also at Harvard. I believe he should have been excluded. I can’t believe that he did not know about the undisclosed conflict of interest. But the basic issue is that he should not have been handling a manuscript from his own institution in the first place.

Academic Editor: Aristidis Veves, Harvard Medical School, UNITED STATES

The second conflicted article

Stahl JE, Dossett ML, LaJoie AS, Denninger JW, Mehta DH, et al. (2015) Relaxation Response and Resiliency Training and Its Effect on Healthcare Resource Utilization. PLOS ONE 10(10): e0140212. doi: 10.1371/journal.pone.0140212

This is the article that originally caught my attention, because it is a methodologically poor study with improbable results. But publication of the article in PLOS One does an excellent job of promoting the authors’ products and services, especially when publication is coordinated with advertisements, including notifications on their websites.

The abstract states


Poor psychological and physical resilience in response to stress drives a great deal of health care utilization. Mind-body interventions can reduce stress and build resiliency. The rationale for this study is therefore to estimate the effect of mind-body interventions on healthcare utilization.


Retrospective controlled cohort observational study. Setting: Major US Academic Health Network. Sample: All patients receiving 3RP at the MGH Benson-Henry Institute from 1/12/2006 to 7/1/2014 (n = 4452), controls (n = 13149) followed for a median of 4.2 years (.85–8.4 yrs). Measurements: Utilization as measured by billable encounters/year (be/yr) stratified by encounter type: clinical, imaging, laboratory and procedural, by class of chief complaint: e.g., Cardiovascular, and by site of care delivery, e.g., Emergency Department. Subgroup analysis by propensity score matched pre-intervention utilization rate.


At one year, total utilization for the intervention group decreased by 43% [53.5 to 30.5 be/yr] (p <0.0001). Clinical encounters decreased by 41.9% [40 to 23.2 be/yr], imaging by 50.3% [11.5 to 5.7 be/yr], lab encounters by 43.5% [9.8 to 5.6], and procedures by 21.4% [2.2 to 1.7 be/yr], all p < 0.01. The intervention group’s Emergency department (ED) visits decreased from 3.6 to 1.7/year (p<0.0001) and Hospital and Urgent care visits converged with the controls. Subgroup analysis (identically matched initial utilization rates—Intervention group: high utilizing controls) showed the intervention group significantly reduced utilization relative to the control group by: 18.3% across all functional categories, 24.7% across all site categories and 25.3% across all clinical categories.
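As a quick sanity check, the percentage reductions reported in the abstract can be recomputed from the rounded before/after rates quoted there. A minimal sketch (the rates, in billable encounters per year, are the rounded published values, so the recomputed percentages only approximately match the reported ones):

```python
# Recompute the reported utilization reductions from the rounded
# before/after rates (billable encounters/year) quoted in the abstract.
# Because the published rates are rounded, the recomputed percentages
# can differ slightly from the reported figures.
reported = {
    # category: (before be/yr, after be/yr, reported % reduction)
    "total":      (53.5, 30.5, 43.0),
    "clinical":   (40.0, 23.2, 41.9),
    "imaging":    (11.5,  5.7, 50.3),
    "laboratory": ( 9.8,  5.6, 43.5),
    "procedural": ( 2.2,  1.7, 21.4),
}

for name, (before, after, pct_reported) in reported.items():
    pct = 100.0 * (before - after) / before
    print(f"{name:11s} recomputed {pct:5.1f}%  reported {pct_reported}%")
```

The total reduction recomputes exactly (43.0%), while, for example, the procedural figure recomputes to about 22.7% versus the reported 21.4%, a discrepancy plausibly explained by rounding of the underlying rates.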

The abstract ends with a rousing advertisement for products and services, aimed at gaining referrals and dissemination to other settings, from which the authors would profit.


Mind body interventions such as 3RP have the potential to substantially reduce healthcare utilization at relatively low cost and thus can serve as key components in any population health and health care delivery system.

But then I noticed the data availability statement:

Data Availability: Data from this study are available through the MGH Institute for Technology Assessment for researchers who meet the criteria for access to confidential data, such as having internal review board approval to access the data as part of their research request. Access to data from this study is subject to review as noted as it contains potentially identifiable patient information. Authors from this study may be contacted through the MGH Institute for Technology Assessment or the MGH Benson.

Give me a break! The authors are restricting access to their data with the excuse that someone seeking it might conceivably identify one of the over 17,000 participants. These authors were granted this restriction by PLOS One. Their rationale could serve as a model for other authors with conflicts of interest trying to evade the strict scrutiny of exaggerated and false claims, like the PACE investigators.

I am in the upper end of the distribution of PLOS One Academic Editors in terms of the number of manuscripts that I process. I actively try to recruit new academic editors when I give talks around the world, in an effort to increase the diversity of the editorial board in terms of both geographic location and expertise. I do this for free. But I do it because I love open access, data sharing, and post-publication peer review. Immediately upon publication of this blog post, I am taking a leave from my editorial responsibilities for PLOS One. There are a number of editorial decisions pending. I want to get back to work, if that’s what you call what you do for free, but PLOS One has some problems to fix, including, but not limited to, the refusal of the PACE authors to provide me the data as promised. Now that we have these other issues on the table before us, I’m adding this one. I hope to hear from PLOS soon.

Citizen scientists, try this at home

I invite citizen scientists to get to their computers, exploit the open access of PLOS One, search, and bring other such undisclosed conflicts of interest and data-sharing atrocities to the attention of the administration.

  1. I learn a lot from Barney Carroll. I borrowed his invented term “experimercial” from a comment he left on the Retraction Watch blog:

Bernard Carroll September 17, 2015 at 12:50 pm

In the New York Times yesterday, Study 329 was described as an experiment. Actually, the study has all the hallmarks of an experimercial, a cost-is-no-object exercise driven by a corporate sponsor to create positive publicity for its product in a market niche. This is not clinical science… this is product placement. And if they had to put lipstick on the pig to get what they wanted, well… Meanwhile, the continued stonewalling by AACAP is misguided.