Release the PACE trial data: My submission to the UK Tribunal


The UK Tribunal has denied the appeal of Queen Mary University of London on behalf of the investigators of the PACE chronic fatigue syndrome trial and ordered the release of the trial data. This landmark decision is still subject to further appeal.

You can get further details of the decision in my recent blog posts here and here.

I was not a party to the appeal. My request for the PACE trial data was for what had been promised in the submission of a manuscript and publication as an article in PLOS One. Despite its subsequent portrayal by the PACE investigators, my request was not made under the Freedom of Information Act.

I was, however, approached for advice and asked to co-sign a draft joint statement of support. I responded on letterhead, but only the joint statement ended up actually getting submitted.

Nonetheless, in testimony before the Tribunal, Trudie Chalder cited my blogging and social media activity as an example of the reputational threat posed to the PACE investigators if they were to release their data for re-analysis. The Information Commissioner countered that I represented the quality of academics seeking independent analysis of the PACE data. The Tribunal accepted this argument. When it becomes available, I will post this testimony.

My letter of support

Hospital of The University of Pennsylvania

March 25, 2016


To the members of the First-tier Tribunal (Information Rights) who are handling case EA/2015/0269:

I am writing in support of the respondents, the Information Commissioner and Mr Alem Matthees.

Researchers, academics, scientists, journalists, patients, advocates, and others have expressed numerous reasonable concerns over how the PACE trial was conducted, analyzed, and reported. Over 12,000 individuals have signed a petition calling for an independent review of the PACE trial and for the release of anonymised individual patient data from the trial.

I believe these concerns warrant the disclosure of the requested data, to enable re-analysis of the trial results in a manner which can be easily confirmed and debated by others in the spirit of open science. In accordance with the Information Commissioner’s decision, I do not believe the requested data poses a significant risk of identification to trial participants. The public interest in disclosure also outweighs the interests of QMUL.

The outcome of this case will have significant consequences for the patient community and the wider research community.  Disclosure will help greatly to address important areas of scientific debate, resolve the controversy over the trial results, alleviate the distress of the patient community, and restore public trust in UK research. Non-disclosure would simply prolong the controversy and could cause further damage to the reputation of research in the UK, and to the “open science” movement in general. I also fear that non-disclosure will set a negative precedent.

I respectfully ask the Tribunal to consider the concerns I describe above, and reject the appeal from the appellant, Queen Mary University of London (QMUL), and to uphold the previous decision made by the Information Commissioner on 27 October 2015. Alternatively, I also respectfully call upon QMUL to withdraw their appeal.

Thank you for your consideration.


James C. Coyne, Ph.D.

Professor Emeritus of Psychology in Psychiatry


QMUL responds to UK Tribunal ordering release of PACE chronic fatigue syndrome trial data

Stakeholders from around the world are reviewing the 48-page document announcing the long-awaited decision of the UK Tribunal. The decision rejects the PACE investigators’ appeal against the Information Commissioner’s earlier order to release the PACE data.

 A searchable, extractable PDF of the document is now available here.

I encourage you all to examine it and see how the arguments for and against data-sharing were exchanged and evaluated. Many of the arguments advanced by QMUL against routine data sharing have been previously presented by their surrogates organized by the Science Media Centre. Undoubtedly, we will see these arguments again and it’s nice to see them effectively demolished in this document. The battle for routine data-sharing is far from over.

There is much more to be said about this, but now Queen Mary University of London has issued a statement. It is reproduced below. It’s notable that it focuses on the decision having been reached by a 2:1 majority.

The single dissenting opinion was from the lay member of the Tribunal, who was apparently persuaded by a witness for QMUL that it just might be possible for someone with an enormous amount of time and inordinate skills to reconstruct one or two identities of specific patients from the anonymized data.

In the majority decision, this expert witness, Ross Anderson, was thoroughly demolished and criticized for his speculative arguments and conflict of interest.

The evidence of Professor Anderson that third parties could not identify participants from the information alone and that, when pressed, he said the chance of an “activist” being able to discover information that would lead to individual identification was remote. It was clear that his assessment of activist behaviour was, in our view, grossly exaggerated and the only actual evidence was that an individual at a seminar had heckled Professor Chalder. The identity of those questioning the research, who signed an open letter or supported it, was impressive. While we accept that Professor Anderson was an expert witness, he was not a Tribunal appointed independent witness but appointed by the Appellant and clearly, in our view, had some self-interest, exaggerating his evidence and did not seem to us to be entirely impartial. What we got from him was a considerable amount of supposition and speculation with no actual evidence to support his assertions or to counter the Respondents’ arguments.

QMUL’s reaction:

Statement: Disclosure of PACE trial data under the Freedom of Information Act

Tuesday 16 August 2016.

As anticipated, however, QMUL is signaling the likelihood that they will appeal this decision, “taking into account the interests of trial participants and the research community.”

Statement from Queen Mary University of London (QMUL):

A Tribunal has now concluded by a 2:1 majority that certain PACE trial data should be disclosed under the Freedom of Information Act.

The PACE trial was carried out according to the regulatory framework for UK clinical trials, which aims to ensure that trial participants can be confident that their information is only ever used according to their consent, and that their data is only shared under obligations of strict confidentiality.

QMUL’s appeal against the Information Commissioner argued in favour of controlled and confidential access to patient data from the PACE trial. QMUL has shared data from the PACE trial with other researchers only when there is a confidentiality agreement in place and an agreed pre-specified statistical plan for data analysis.

This has been a complex case and the Tribunal’s decision is lengthy. We are studying the decision carefully and considering our response, taking into account the interests of trial participants and the research community.



August 16, 2016, 8:51 a.m. EST: I’ve only just received a scan of the 48-page judgment. Because the scan is not searchable, I dictated the closing Majority decision.

[Update  8:00 pm: A searchable full PDF of the judgment is now available here.]

But to get to the punch line:


The Tribunal, by a majority, upholds the decision notice dated 27 October 2015 and dismisses the appeal.

After some brief introductory comments from the blog of Attorney Valerie Elliott Smith, I present some details from the majority decision. I will continue to review the whole document, but I think the decision represents a smashing defeat for the PACE investigators, in which their detailed arguments were turned back on them. It is also a defeat for the many foes of routine data sharing, including those hastily marshaled by the Science Media Centre.

In terms of my own quest for the PLOS One PACE data, recall that I never framed my request as falling under the Freedom of Information Act rubric. I simply asked that the data be provided to me as the PACE investigators had promised. There has been no consultation with me or other stakeholders in the many months of negotiation between the senior management of PLOS One and the PACE investigators. Maybe the Tribunal decision will now break the stalemate and encourage some transparency on the part of the senior management of PLOS One. However, I’m continuing with plans to go to Cambridge in November or early December and march on the PLOS headquarters and demand an open dialogue. I welcome you to join me and we will party outside the headquarters. Or maybe not – I hope, but I cannot expect the matter to be resolved by then.

Finally, I find particularly delicious a section of the majority decision that I reproduce below:

In any event there is a strong public interest in releasing the data given the continued academic interest so long after the research was published and the seeming reluctance of Queen Mary University to engage with other academics they thought were seeking to challenge the findings (evidence of Professor Chalder).

Trudie Chalder  used the scarce time allocated to her appearance before the tribunal specifically to continue her attack on my character. She expressed alarm over the  threat my criticism posed to the reputation of the PACE investigators. Trudie, vexatious, my ass. You have been rude and unprofessional. You have consistently refused to debate, but you can look forward to a future detailed critique of your miserable, p-hacked mediational analysis.

 Attorney Valerie Elliott Smith immediately blogged with some background, to which she promises to add more later.


The First-Tier Tribunal judgment in this case has just been published. QMUL’s appeal has been roundly dismissed and therefore the Tribunal has decided that the requested data from the PACE trial should be released.

I have just skimmed the 48 pages of the judgment and so have only taken in a small amount so far. However, it appears that this is a defining moment for the international ME community and the PACE Trial. Alem Matthees (the original requestor of the data) has done an extraordinary job.

However, it is important to remember that, in theory,  QMUL could still seek leave to appeal against this judgment to the Upper Tribunal so it will be a bit longer before we can be absolutely certain that this judgment will stand.

Background note for new readers

In March 2014, Mr Matthees sought some of the data from the controversial PACE trial, using the process set out in the English Freedom of Information Act (FOIA). This information is held by the relevant public authority, Queen Mary University of London (QMUL). QMUL refused to disclose the data.

In due course, Mr Matthees complained to the Information Commissioner (IC) who, in October 2015, ordered that the information be disclosed. QMUL appealed against the IC’s decision; that appeal was heard by the First-Tier Tribunal on 20-22 April 2016 in central London. QMUL and the IC were legally represented and QMUL called witnesses to give evidence. Mr Matthees had been joined as a party to the proceedings. He was not legally represented and did not attend the hearing but made written submissions. Judgment is awaited.

[Note: the PACE trial, which was published in 2011, relates to certain treatments for the condition known as “chronic fatigue syndrome” (CFS). CFS is often conflated (confusingly) with myalgic encephalomyelitis (ME) and referred to as CFS/ME or ME/CFS, to the detriment of genuine ME patients. This is the situation in many countries and has been for decades; it is the cause of significant confusion and distress to many patients worldwide.

The results of the PACE trial appear to promote psychosocial treatments which many patients find either ineffective or actively harmful. As a result, some patients have been using FOIA to try to obtain the trial data in order to understand how these results were achieved. However, most requests have been denied and, five years on, most of the data is still unavailable.] 

The Tribunal decision [Please alert me to typos in this section that was hastily entered with voice recognition software.]


The Tribunal, by a majority, upholds the decision notice dated 27 October 2015 and dismisses the appeal.

The majority decision

The Commissioner accepts that the question turns on whether anonymization is possible, and he argues that in this instance identification is an extremely remote possibility. Professor Anderson accepted that the information alone cannot identify participants, and his hypothesis that identification is possible by combining that information with NHS data (involving an NHS employee both having breached their professional, legal and ethical obligations and also having the skill and inclination to do this) is implausible.

In short we accept and adopt the Commissioner’s wider submissions and reasoning as set out in his Skeleton Arguments and Written Closing on this issue. In all the circumstances and on the evidence before us we are satisfied that the information has been anonymized to the extent that the risk of identification is remote. In coming to this conclusion we have also taken into account:

  1. The nature of the information, which did not contain any fixed or direct identifiers;
  2. The evidence of Dr. Rawle that the anonymization methodology followed the guidelines at the time and would still comply with current guidelines although they were said to be under review for the future;
  3. The evidence of Dr. Rawle that none of the identifiers were contained in the disputed information, (the anthropometry measurement issues were cleared up);
  4. The evidence of Professor Anderson that third parties could not identify participants from the information alone and that, when pressed, he said the chance of an “activist” being able to discover information that would lead to individual identification was remote. It was clear that his assessment of activist behavior was, in our view, grossly exaggerated and the only actual evidence was that an individual at a seminar had heckled Professor Chalder. The identity of those questioning the research, who signed an open letter or supported it, was impressive. While we accept that Professor Anderson was an expert witness, he was not a Tribunal appointed independent witness but appointed by the Appellant and clearly, in our view, had some self-interest, exaggerating his evidence and did not seem to us to be entirely impartial. What we got from him was a considerable amount of supposition and speculation with no actual evidence to support his assertions or to counter the Respondents’ arguments;
  5. That even on Professor Anderson’s evidence for identification to take place there would have to be a breach of medical ethics and law and there was actually no evidence to quantify the risk of this occurring in the circumstances and on the facts before us in the appeal. In fact there was no tangible evidence of an example where such steps had led to an identification of the individual in any circumstances.
  6. That even on Professor Anderson’s evidence it was only the walking scores that were likely to lead to identification (if all of his other suppositions and speculations came about);
  7. That the Fine research had been released, albeit by accident, and there was no evidence that, although it contains similar data, there had been any individual identification or problems arising from it.
  8. We do not accept the commercial interest arguments of the College. There was very little evidence of the withdrawal of consent and where it happened it was only directly related to the issues before us. Funding had been obtained for a new trial in the knowledge that the tribunal may not allow this Appeal. New confidentiality guarantees, or perhaps more explicit ones, could be given to new participants.
  9. In any event there is a strong public interest in releasing the data given the continued academic interest so long after the research was published and the seeming reluctance of Queen Mary University to engage with other academics they thought were seeking to challenge the findings (evidence of Professor Chalder).
  10. There is insufficient evidence before us to persuade us that the disclosure of the disputed information would cause sufficient prejudice to QMUL’s research programmes, reputation and funding streams.
  11. Professor Anderson was cross-examined as to whether or not the patient identities (HESID) that would be disclosed to anyone with access were encrypted. He was unaware but has since checked, and it is now confirmed through submissions on behalf of the Appellant that this does not affect the wider point he made about care data disclosures, that they would potentially allow different health events to be linked where it could be established that they relate to the same patient identifier. This we would have preferred to explore further in evidence, but it seems to us that encryption makes the chance of identification even more remote in any event and strengthens our view that the speculative assertions by Professor Anderson of possible events actually taking place in a way that could lead to identification of individuals are indeed remote.
  12. We do not accept the speculation that the chance that a determined person with specialist skills could make the link, while less than probable, is more than remote. There’s no tangible evidence before us to persuade us that is less than remote. Professor Anderson accepted it would be extremely difficult to identify individuals even from the collective information, which as the Commissioner submits “given his approach to this would indicate the near impossibility of reidentification”. We are not persuaded the risk of identification is more than remote.
  13. Generally, regarding the Commissioner’s discretion, we heard nothing, and are not persuaded on the evidence before us, that would lead us to question that the Commissioner had not applied himself correctly, that his decision was not properly arrived at or should be set aside.

The minority decision (Mr Watson dissenting):

  1. Each row in the spreadsheet is unique and refers to one person in the trial.
  2. The information necessary to link the data to an individual is available to a large number of people due to the way in which security has been implemented in the NHS and the quantity and nature of the information now available on social media.
  3. I believe Professor Anderson is correct when he gave evidence that the chance that a determined person with specialist skills could make the link, while less than probable, is more than remote.
  4. For this reason the information contained in the spreadsheet is personal data and should not be disclosed.

Finally, at paragraph 5 c) above: “Would disclosure cause sufficient prejudice to QMUL’s research programmes, reputation and funding streams to refuse disclosure?”

We unanimously accept and adopt the Commissioner’s wider submissions and reasoning as set out in his Skeleton Arguments and Written Closing on this issue. In all the circumstances and on the evidence before us we are satisfied that the perceived risk of prejudice as submitted by the Appellant has not been substantiated or demonstrated in evidence before us. Such minimal risk as has been expressed would not in our view outweigh the public interest in disclosure of the disputed information as defined in the specific request in this appeal.

  1. The Tribunal wish to thank all parties for the helpful manner in which they have presented their arguments and submissions. We’ve been provided with an extraordinary amount of ancillary and background information on and about the important subject matter under consideration and have considered all of it. There can be no doubt about the Public Interest in the subject matter which is evident through the course of this appeal, and beyond, and we are grateful for the assistance that has been given to us in this regard.

We have considered all of the above arguments, submissions and evidence together with a significant volume of supporting evidence in legal precedents and for the reasons given above we refuse the appeal by majority decision for the above reasons, and the Commissioner’s DN stands.

Brian QC 11th August 2016

Promulgated 12th  August 2016



CBT versus psychodynamic therapy for depression: One sentence changes the whole story

A recent comparative effectiveness study in JAMA Psychiatry of CBT versus psychodynamic psychotherapy for depression was billed as a noninferiority trial.

One sentence in the results section changed the whole significance of the study.

The dodo bird verdict for the study is that everybody gets a booby prize.

The study is currently freely accessible at JAMA Psychiatry, although you may need to register for free to actually download the PDF.


Connolly Gibbons M, Gallop R, Thompson D, et al. Comparative Effectiveness of Cognitive Therapy and Dynamic Psychotherapy for Major Depressive Disorder in a Community Mental Health Setting: A Randomized Clinical Noninferiority Trial. JAMA Psychiatry. Published online August 03, 2016. doi:10.1001/jamapsychiatry.2016.1720.

The moderately sized study compared two active treatments without a nonspecific comparison/control group.

Results.  Among the 237 patients (59 men [24.9%]; 178 women [75.1%]; mean [SD] age, 36.2 [12.1] years) treated by 20 therapists (19 women and 1 man; mean [SD] age, 40.0 [14.6] years), 118 were randomized to DT and 119 to CT. A mean (SD) difference between treatments was found in the change on the Hamilton Rating Scale for Depression of 0.86 (7.73) scale points (95% CI, −0.70 to 2.42; Cohen d, 0.11), indicating that DT was statistically not inferior to CT. A statistically significant main effect was found for time (F1,198 = 75.92; P  = .001). No statistically significant differences were found between treatments on patient ratings of treatment credibility. Dynamic psychotherapy and CT were discriminated from each other on competence in supportive techniques (t120 = 2.48; P = .02), competence in expressive techniques (t120 = 4.78; P = .001), adherence to CT techniques (t115 = −7.07; P = .001), and competence in CT (t115 = −7.07; P = .001).

Conclusions and Relevance.  This study suggests that DT is not inferior to CT on change in depression for the treatment of MDD in a community mental health setting. The 95% CI suggests that the effects of DT are equivalent to those of CT.

In case there is any ambiguity in the message the authors wanted to convey, they reiterated:

Key Points

  • Question Is short-term dynamic psychotherapy not inferior to cognitive therapy in the treatment of major depressive disorder (MDD) in the community mental health setting?

  • Findings In this randomized noninferiority trial that included 237 adults, short-term dynamic psychotherapy was statistically significantly noninferior to cognitive therapy in decreasing depressive symptoms among patients receiving services for MDD in the community mental health setting.

  • Meaning Short-term dynamic psychotherapy and cognitive therapy may be effective in treating MDD in the community.
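For readers who want to see what a noninferiority claim amounts to mechanically, here is a minimal Python sketch. It uses only the difference and 95% CI reported in the abstract and assumes that positive values favor CT. The margins in the loop are hypothetical placeholders, not the margin the authors prespecified, so the printed verdicts illustrate the logic rather than reproduce the trial's analysis.

```python
from dataclasses import dataclass

Z_95 = 1.96  # two-sided 95% normal quantile


@dataclass
class Interval:
    """A point estimate with a 95% confidence interval."""
    estimate: float
    lower: float
    upper: float

    def standard_error(self) -> float:
        # Back out the standard error implied by a symmetric 95% CI.
        return (self.upper - self.lower) / (2 * Z_95)


def is_noninferior(ci: Interval, margin: float) -> bool:
    # Noninferiority is declared when the upper confidence bound
    # stays below the prespecified margin.
    return ci.upper < margin


# Values reported in the abstract, in HAM-D scale points.
reported = Interval(estimate=0.86, lower=-0.70, upper=2.42)

# HYPOTHETICAL margins for illustration only; not taken from the paper.
for margin in (2.0, 2.5, 3.0):
    verdict = "noninferior" if is_noninferior(reported, margin) else "not shown"
    print(f"margin {margin:.1f} HAM-D points: {verdict}")

print(f"implied standard error: {reported.standard_error():.2f}")
```

The verdict depends entirely on the margin chosen, and a noninferiority result says nothing about whether either treatment would beat a nonspecific control condition.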

Despite an accompanying editorial, the study only got a moderate amount of immediate attention on social media. Here are the altmetrics.

I examined the 40 tweets available on August 6, 2016 and found only one that went beyond parroting.

I suspect that Robert Howard had discovered the one sentence in the results section that I noticed:


Nineteen patients (16.1%) in DT and 26 patients (21.8%) in the CT condition demonstrated response to treatment as measured by a 50% reduction on the HAM-D score across treatment (χ²(1) = 1.27; P = .32).
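The counts in that sentence can be checked with a few lines of Python. This is a minimal sketch using the standard uncorrected Pearson chi-square for a 2×2 table; the paper does not say in this excerpt exactly which test produced its P value, so only the response rates and the test statistic are recomputed here.

```python
def pearson_chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Responders and group sizes as reported in the article.
dt_responders, dt_total = 19, 118
ct_responders, ct_total = 26, 119

dt_rate = dt_responders / dt_total
ct_rate = ct_responders / ct_total
overall_nonresponse = 1 - (dt_responders + ct_responders) / (dt_total + ct_total)

chi2 = pearson_chi_square_2x2(
    dt_responders, dt_total - dt_responders,
    ct_responders, ct_total - ct_responders,
)

print(f"DT response rate: {dt_rate:.1%}")
print(f"CT response rate: {ct_rate:.1%}")
print(f"overall nonresponse: {overall_nonresponse:.1%}")
print(f"chi-square (1 df): {chi2:.2f}")
```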

Most of the patients assigned to either group in this study failed to respond to treatment. Tipped off by this sentence, I looked for the degree of treatment exposure and found that most patients did not get exposed to sufficient intensity of treatment.

Sixty-three patients (26.6%) attended 1 or fewer sessions of psychotherapy; 122 (51.5%), 5 or fewer sessions; and 187 (78.9%), 11 or fewer sessions. We found no statistically significant difference between treatments in the number of sessions attended (t235 = 1.47; P = .14).

The title of the JAMA Psychiatry article noted that patients had been recruited from a community mental health setting. I interpret this to suggest they were likely to be a low-income group who were not previously prepared for psychotherapy.

Before anyone proposes that the solution is simply to offer more therapy, note that the patients were not attending enough of the 16 sessions that were offered. My interpretation is that greater effort may be needed to get such patients to consistently show up for sessions.

My colleagues and I previously conducted an exceptionally well-resourced study in the same low-income and socially disadvantaged Philadelphia population. Our intention was to reduce risk factors among recently pregnant, low-income women for another low birth weight delivery. We demonstrated that we could recruit and retain these women, but it took an intensive, creative effort.

One of the risk factors that we addressed was depression and we offered antidepressant medication and free treatment at the world-renowned University of Pennsylvania Center For Cognitive Therapy. We provided free transportation and child care. Few women accessed sufficient therapy or received a sufficient dose of antidepressants. The therapists at the center complained that the women did not seem to have their lives in order and did not seem ready for psychotherapy. Personally, I think that the therapists may not have been ready for such women and did not sufficiently engage them.

Back to the study under discussion, it was accompanied by an editorial that parroted the authors’ intended message in its title:

Abbass AA, Town JM. Bona Fide Psychotherapy Models Are Equally Effective for Major Depressive Disorder: Future Research Directions. JAMA Psychiatry. Published online August 03, 2016. doi:10.1001/jamapsychiatry.2016.1916.

But I noticed this in the text:


Among other points, the study by Connolly Gibbons and colleagues raises the ongoing challenge facing all psychiatrists using pharmacotherapy and psychotherapy: how to improve rates of remission in real-world clinical samples. The study found that more than 80% of all participants did not respond to treatment (22% of patients receiving CBT and 16% of patients receiving STPP had response to treatment as measured by a 50% reduction in observer-rated depression). This high rate of nonresponse may be partly explained by inadequate treatment “dose” or number of sessions, clinical sample, therapist expertise, biomedical factors, and sociofamilial factors impeding outcomes.

The JAMA Psychiatry article under discussion cited another, similar study conducted in the Netherlands, but did not elaborate on its findings:

Driessen E, Van HL, Don FJ, Peen J, Kool S, Westra D, Hendriksen M, Schoevers RA, Cuijpers P, Twisk JW, Dekker JJ. The efficacy of cognitive-behavioral therapy and psychodynamic therapy in the outpatient treatment of major depression: a randomized clinical trial. American Journal of Psychiatry. 2013 Sep 1.

Unlike the JAMA Psychiatry article, the abstract of the Dutch study qualified its finding of noninferiority by noting that neither therapy did particularly well:


No statistically significant treatment differences were found for any of the outcome measures. The average posttreatment remission rate was 22.7%. Noninferiority was shown for posttreatment HAM-D and patient-rated depression scores but could not be demonstrated for posttreatment remission rates or any of the follow-up measures.


The findings extend the evidence base of psychodynamic therapy for depression but also indicate that time-limited treatment is insufficient for a substantial number of patients encountered in psychiatric outpatient clinics.

I suspect that both of these randomized trials will be cited as evidence of the Dodo Bird Verdict for psychotherapy for depression – everybody’s a winner and everybody gets a prize. However, in both studies, the cognitive behavior therapy underperformed relative to the efficacy demonstrated in a larger body of studies. The literature for psychodynamic therapy is more limited and of low quality.

Still, I think the message is that when you move into more difficult populations, you can’t expect the results obtained with the more carefully selected, therapy-ready patient populations recruited to more typical studies. But this may reflect on the unrepresentativeness of patients in the larger literature.

Meanwhile, Psychiatrist Erick Turner and I have been having an exchange on Twitter concerning another noninferiority study.

Erick is referring to a perspective he shares with me and that I’ve been voicing regularly about noninferiority trials. They typically don’t include a nonspecific comparison/control group. Without such a group, we can’t evaluate whether either of the active treatments is better than provision of nonspecific treatments with elements of support, positive expectation, and attention.

That is also a limitation of the current study, but by peeking into the actual results, we discover that referral to either of the two active treatments failed to leave most patients free of depression.

What if there had been a credible attention/support condition in the present study? Would either of these two treatments that were “noninferior” to each other have a clinically significant advantage? What would be the implications, if not? Would the report have made it into JAMA Psychiatry?

Too much ado about church attendance and suicide rates among women

 VanderWeele TJ, Li S, Tsai AC, Kawachi I. Association Between Religious Service Attendance and Lower Suicide Rates Among US Women. JAMA Psychiatry. 2016;73(8):845-851. doi:10.1001/jamapsychiatry.2016.1243.

This recent study in JAMA Psychiatry received an extraordinary amount of attention in a short time, undoubtedly orchestrated by the journal. Prestigious journals interested in keeping their prestige precisely track how much immediate attention papers get using altmetrics.

The intent is to boost immediate attention, which increases early citations. In turn, early citations raise impact factors. A journal’s impact factor is based on the citations received in a given year by the papers it published in the previous two years. Journals can advertise higher impact factors, which boost subscriptions and paid advertisements. Here are the altmetrics for this article. They suggest it is an outlier in terms of the amount of attention it got in a short time.



Among the numerous sources of attention were religiously oriented media. Catholic Online proclaimed “How the Church can improve your life.” But secular medical sources simply proclaimed “Take Me To Church: Attending Religious Services Linked To Lower Suicide Rates Among Women.”

An article in the UK Spectator, “People who go to church live longer. Here’s why,” was written by the first author of the JAMA Psychiatry paper, but I had to compare authorship of the articles to discover this.

During such publicity campaigns, journals often temporarily provide free access to what would otherwise be paywalled articles in order to stimulate attention. Unfortunately, that wasn’t the case with this article, and so readers without access to a subscription or a university library could only check the claims against an abstract, not the full paper. Here’s an excerpt:

Design, Setting, and Participants  We evaluated associations between religious service attendance and suicide from 1996 through June 2010 in a large, long-term prospective cohort, the Nurses’ Health Study, in an analysis that included 89 708 women. Religious service attendance was self-reported in 1992 and 1996. Data analysis was conducted from 1996 through 2010.

Results. Among 89 708 women aged 30 to 55 years who participated in the Nurses’ Health Study, attendance at religious services once per week or more was associated with an approximately 5-fold lower rate of suicide compared with never attending religious services (hazard ratio, 0.16; 95% CI, 0.06-0.46). Service attendance once or more per week vs less frequent attendance was associated with a hazard ratio of 0.05 (95% CI, 0.006-0.48) for Catholics but only 0.34 (95% CI, 0.10-1.10) for Protestants (P = .05 for heterogeneity). Results were robust in sensitivity analysis and to exclusions of persons who were previously depressed or had a history of cancer or cardiovascular disease. There was evidence that social integration, depressive symptoms, and alcohol consumption partially mediated the association among those occasionally attending services, but not for those attending frequently.

An accompanying editorial

Koenig HG. Association of religious involvement and suicide. JAMA Psychiatry. 2016 Jun 29.

had free access.  It extolled the virtues of the study:

The study by VanderWeele et al is important because of the large sample, lengthy follow-up period, and rigorous statistical methods used to analyze the data, including adjustments for baseline religious service attendance and removal of women who were previously depressed or with major physical health problems. Adjustment for baseline religious attendance and initial removal of women with depression or physical illness is particularly important to avoid the problem of reverse causation, an issue in studies of religious service attendance and other health outcomes. Depressed persons at greatest risk for suicide are often socially withdrawn and less likely to attend religious services, which could otherwise explain the association.

The editorial suggested immediate clinical applications of the findings, but ended on a discreet note of caution that would set off alarm bells for a skeptic.

 What should mental health professionals do with this information? Evaluating patients’ moral beliefs about suicide and level of involvement in religious community may help clinicians gauge risk of suicide. Thus, the findings by VanderWeele et al underscore the importance of obtaining a spiritual history as part of the overall psychiatric evaluation, which may identify patients who at one time were active in a faith community but have stopped for various reasons. Exploring what those reasons were, particularly among the socially isolated, and perhaps supporting a return to such activity, if the patient desires, may help produce social connections that lower suicide risk. If based exclusively on the findings of VanderWeele et al, these suggestions would apply only to white female nurses in the United States. Given the large amount of research in other ethnic, sex, socioeconomic, professional, and nonprofessional groups that shows similar associations, one might be tempted to apply these findings to other populations as well. Nevertheless, until others have replicated the findings reported here in studies with higher event rates (ie, greater than 36 suicides), it would be wise to proceed cautiously and sensitively.

This last sentence provides crucial information that should have been reported in the abstract of the article. The seemingly impressive study involved predicting only 36 suicides. Any multivariate analyses spread these 36 suicides across little boxes of categorical variables. Imprecision in the measurement of any of these variables or any misclassification could produce very different results.

As a skeptic accustomed to hard-sell efforts based on weak data, I would have immediately sought this information. Armed with it, I would’ve been prepared to reject as nonsensical the pseudo-precision of the dramatic estimates of effects that are contained in the abstract:

 …Attendance at religious services once per week or more was associated with an approximately 5-fold lower rate of suicide compared with never attending religious services (hazard ratio, 0.16; 95% CI, 0.06-0.46).

Come on! This “five-fold” difference refers to differences in the distribution of fewer than 36 suicides in a couple of boxes. It would have been less striking if it were reported in absolute terms. The boxes are “no attendance” versus “attendance once a week or more.” That means that some of the already small number of 36 suicides were thrown out because they occurred among women who attended church, but less consistently than once a week.
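One way to gauge how imprecise these estimates are is to back out the uncertainty implied by the intervals quoted in the abstract. What follows is a minimal Python sketch, not the authors' analysis. It assumes the reported 95% intervals are roughly symmetric on the log scale (a conventional Wald-type interval), and the helper name `log_scale_summary` is made up for illustration.

```python
import math

def log_scale_summary(lower, upper, z=1.96):
    """Return (implied standard error of the log ratio, upper/lower fold-width).

    Assumes the interval was built as exp(log(estimate) +/- z * SE),
    which is an assumption about the paper's method, not a known fact.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    fold_width = upper / lower
    return se, fold_width

# Interval bounds copied from the abstract quoted above.
overall_se, overall_fold = log_scale_summary(0.06, 0.46)      # all women
catholic_se, catholic_fold = log_scale_summary(0.006, 0.48)   # Catholics only

print(f"Overall:   implied SE(log HR) = {overall_se:.2f}, interval spans {overall_fold:.1f}-fold")
print(f"Catholics: implied SE(log HR) = {catholic_se:.2f}, interval spans {catholic_fold:.1f}-fold")
```

Under that assumption, the upper bound of the headline interval is nearly eight times its lower bound, and for Catholics the upper bound is about eighty times the lower bound. That is the kind of spread to expect when so few events are being divided among categories.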

Dropping subjects in the middle of a distribution and focusing on extremes inflates the significance of results. There is no evidence that these authors made the decision to do so without first peeking at their data.

Seemingly impressive multivariate analyses that build on these analyses just add further nonsense and noise.

Women who have the regularity of routine to show up in church once a week or more are often regular in other ways that can affect their health and well-being.

This is called “healthy user bias.” As Gary Taubes described nicely, “people who faithfully engage in activities that are good for them — taking a drug as prescribed, for instance, or eating what they believe is a healthy diet — are fundamentally different from those who don’t.”

Women who regularly go to church take better care of themselves in still other ways.

 Next, there is another subtle component of healthy-user (or “healthy continuer”) bias. This is the “compliance or adherer effect or bias”. Individuals who comply or adhere with their doctors’ orders when given a prescription are different and healthier than people who don’t.

This Nurses’ Health Study cohort is well studied, in lots of papers, and its many biases have been pointed out. To start with, it’s limited to nurses. Extremes of social deprivation are excluded. And the rates of suicide in the Nurses’ Health Study are substantially lower than in the general population of women.

I don’t have much confidence in the claims being made in the press coverage or the article itself. But my skepticism turned to disappointment in the authors. I noticed that they did not adequately acknowledge results concerning suicide from the same Nurses’ Health Study data set that were published less than a year ago in the same journal.

Tsai AC, Lucas M, Kawachi I. Association between social integration and suicide among women in the United States. JAMA Psychiatry. 2015 Oct 1;72(10):987-93.

Unlike the present study, this one is available as a PDF without a pay wall. For reasons that are explained, it has 43 deaths by suicide to explain and suggests that social integration is a reasonably good predictor of death by suicide. This is a contradiction of what is said in the current paper.

Who knows why. But dropping a few deaths by suicide can make a substantial difference when there are so few to begin with. Then there is always the voodoo of applying multivariate statistics to predict such infrequent events. I think this further demonstrates the problems of making a big fuss when talking about so few suicides. Add or subtract only a few suicides and you get a whole different story to tell.
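To make the point concrete, here is a toy Python sketch of how a crude ratio moves when only two events change boxes. All of the counts below are HYPOTHETICAL and chosen purely for illustration. They are not taken from either Nurses’ Health Study paper, and the function `crude_ratio` is an invented helper, not anything the authors used.

```python
def crude_ratio(events_a, n_a, events_b, n_b):
    """Ratio of the event proportion in group A to that in group B."""
    return (events_a / n_a) / (events_b / n_b)

# HYPOTHETICAL group sizes and event counts, for illustration only.
n_frequent, n_never = 40_000, 20_000

# Scenario 1: 4 events among frequent attenders, 10 among never attenders.
baseline = crude_ratio(4, n_frequent, 10, n_never)

# Scenario 2: two events reclassified from the "never" box to the "frequent" box.
shifted = crude_ratio(6, n_frequent, 8, n_never)

print(f"baseline ratio: {baseline:.3f}")
print(f"after moving two events: {shifted:.3f}")
```

With these made-up numbers, reclassifying just two events moves the ratio from about 0.2 to about 0.375, even though the total sample of 60,000 is unchanged. The size of the overall sample offers no protection when the events themselves are this scarce.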

I hope you have  learned some things from this exercise:

  1. Whenever you see reports of epidemiological studies of suicide, keep in mind the infrequency of suicide. Pay attention to the number of deaths by suicide to explain, not the size of the overall sample.
  2. Correlation does not equal causality, particularly when the correlations occur with multivariate statistical analyses with unknown peeking at the data ahead of time.
  3. Beware of clinical implications being drawn from such weak data.

Genetic denialism and claims about locked wards and suicide from non-RCT designs

Some things I’m saying in social media in the week ending July 31, 2016 and will probably follow up on.

This post is a bit of an experiment. I share some of the things that I’m talking about on Facebook and Twitter that are highly likely to be reflected in later blog posts. Comments are welcome, as well as encouragement or discouragement about pursuing these topics further in later blog posts.

 On Facebook: Genetic Denialism

I had recently blogged about a US neurologist writing in The Conversation that clinicians should consider the possibility that migraine headaches are due to patients having been abused as children. The neurologist arrived at this recommendation with very superficial attention to the available evidence. I would’ve thought that when dealing with a complex and poorly understood condition like migraine headaches, the neurologist would recommend exploring comorbidities and idiosyncratic reactions to medications, rather than making a psychological issue of something that is not obviously psychosomatic.

A focus on childhood adversity and a rejection of genetic influences on migraines reflects a recurring theme in a much larger literature. For a while, childhood adversity has been pitted against heredity in discussions where increasingly sophisticated research about genes and genetic expression is given short shrift. Oliver James is a key example of someone who considers the choice between early adversity and genetic influences settled on the basis of politics.

I posted a link to this article on Facebook:

The cruel grandmother, shamed mother and psychotic half-brother who shaped David Bowie’s life and work.

I said

More nonsense from the genetic denialist Oliver James. He has a lot in common with Richard Bentall and this piece could’ve been written by Bentall. James basically takes considerable evidence of familial transmission of psychosis in the family of David Bowie and uses it to deny a role of genetics.


David Bowie photographed by Chris Walter in 1974. © RTNWalter / MediaPunch/IPX

It’s worth pondering this piece, not only because of its wild inferences about influences on David Bowie’s music, but because the article reveals the form that genetic denialism takes. When genetic influences are expressed through adverse family circumstances, genetic denialists like Oliver James take them as evidence that genetic influences do not exist.

Behind all this is a lot of entrenched politics, not science. There used to be a group in Manchester, UK, Psychologists against Nazi Psychiatrists. While it appears to have dispersed, a number of members remain very influential at Liverpool University. Some of the faculty there, like David Pilgrim, still attack as Nazis those mental health professionals who are willing to entertain the idea that there are genetic influences on abnormal behavior [see Pilgrim’s bizarre views in the comment section here]. British Psychological Society President Peter Kinderman (along with Bentall, also at Liverpool) drew heavily on Pilgrim’s comments without attribution and took the “Think child adversity, not genetics or else you are a Nazi” to a whole new level. I know, you have to see this bias before believing it is so strong.

I also commented on Facebook:

I think that a lot of childhood abuse reflects environmental transmission of parental vulnerability to severe mental disorder. It’s not simply environmental or genetic, but an interaction between genes and environment.

Stay tuned

As I keep saying, there are so many dubious papers to critique, but so little time to blog. But I’ll likely be following up on the issue of childhood adversity. One likely prospect is a consideration of some new claims in the literature that the influence of childhood adversity on the brain has been traced. Another prospect, which has been on the back burner too long, is a consideration of the role of childhood adversity versus genes in the development of psychosis and schizophrenia. This is an extremely politicized literature where too much is being made of mediocre data. And what everybody seems to be ignoring is that even with the mediocre, highly confounded data that are available, the risk for psychosis associated with childhood adversity is about the same as that associated with smoking tobacco or marijuana.

On Twitter

There’s been ongoing discussion about an article in Lancet Psychiatry.

Huber CG et al. Suicide risk and absconding in psychiatric hospitals with and without open door policies: a 15 year, observational study. The Lancet Psychiatry, 380 (2016). doi:10.1016/S2215-0366(16)30168-7

With tweets such as Suicide Risk No Different on Closed and Open Psychiatric Wards

James C.Coyne@CoyneoftheRealm tweeted

Headline ignores that suicidal persons are not randomly assigned to locked and unlocked wards. Stop distorting.

 Not sure what your point is, but evidence for means restriction in reducing suicidality is quite strong.

Acutely suicidal persons left in unlocked wards is stupidly irresponsible. Low quality data don’t convince otherwise.

Unfortunately, the article is behind a pay wall and Lancet Psychiatry is still not listed in PubMed, restricting access. But I do have access to the abstract and the confidence intervals for death by suicide are huge, (OR 1·326, 95% CI 0·803–2·113; p=0·24).
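A quick Python sketch shows what that interval does and does not rule out. The bounds are the ones quoted from the abstract; the helper `percent_change_range` is invented here for illustration, and reading the bounds as a percent change in odds is simply a restatement of the odds ratio, not an analysis from the paper.

```python
def percent_change_range(lower, upper):
    """Express odds-ratio CI bounds as percent change in odds relative to OR = 1."""
    return (lower - 1) * 100, (upper - 1) * 100

# 95% CI bounds for death by suicide, as quoted from the abstract.
ci_lower, ci_upper = 0.803, 2.113

low_pct, high_pct = percent_change_range(ci_lower, ci_upper)
includes_null = ci_lower <= 1.0 <= ci_upper

print(f"Compatible with odds from {low_pct:.0f}% to +{high_pct:.0f}% relative to no difference")
print(f"Interval includes OR = 1: {includes_null}")
```

Read this way, the data are compatible with anything from roughly a 20% reduction in the odds of death by suicide to more than a doubling. An interval that wide cannot support a headline claiming the risk is "no different."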

So why am I offering an opinion on an article that I haven’t yet seen?

A fair question, to which I respond – I’m taking the risk of relying on prior probabilities.

I have studied suicide in Germany, where the study was conducted, including some intervention studies. The low frequency of deaths by suicide means it’s difficult to show that anything influences suicide. This study is observational, not an RCT. Everybody tweeting about it seems to assume that propensity scores can overcome the problems of not having RCT data. Propensity scores are attractive, but poorly understood by most people who use them. They are vulnerable to unmeasured confounds. They require dropping data for patients for whom a match cannot be found, in this case patients hospitalized in locked versus unlocked wards.

I definitely will be blogging about this, after putting out a request for the article on Twitter.

 Other things I’m saying on Twitter James C.Coyne@CoyneoftheRealm

#2016APAPREZ #APA2016 Will you apologize to innocent victims of psychologists’ involvement in #torture?

We need to recognize that communicating about evidence impossible in some conversations – or terribly inefficient.

Remarkable how academic institutions remain silent about alleged scientific misconduct by some of their researchers. …

Imagine testing yoga vs acupuncture (AC) in cancer pts receiving chemo, no control group, concluding yoga not < AC.…

US law allows 72 hour hold. Can only be extended if qualified professional observes active risk to self or others.

Szasz had libertarian views about not locking up acutely suicidal people. Only treated neurotic, nonsuicidal patients in long-term therapy

Suicide ideation poor surrogate outcome for services research. Too nonspecific, easily changed w/o outcomes affected.


A few words about plagiarism and how to avoid it

This blog post is excerpted from one at PLOS’ On Science Blogs. It is part of a longer post by Tabitha M. Powledge with the title Plagiarism, norovirus, and Obamacare at the Republican convention. I was concerned that what was specifically being said about plagiarism might be lost for readers not drawn to discussions of norovirus or Obamacare at the Republican National Convention. I think the points about plagiarism that are made need broad dissemination and discussion.

Note the interesting quote that the idea that appropriating the words and work of others is sinful is a recent American invention.

Note also the strategy for avoiding unintentional plagiarism.

Tabitha Powledge is an award-winning long-time science journalist, book author, and media critic. On Science Blogs is her weekly look at the craft and content of science blogging.

Gizmodo’s George Dvorsky helpfully relayed scientific proof of the odds that particular words and phrases in aspiring First Lady Melania Trump’s speech at the Republican National Convention Tuesday appeared by accident in the same order as they did in Michelle Obama’s speech at the Democrats’ National Convention in 2008. Those odds are about 87 billion to 1.

That the source of the crib is actual First Lady Michelle Obama is so deliciously–I was going to say ironic, but it is truly beyond ironic. If you put it in a novel, no reader would believe it. The 87 billion-to-1 calculation came from Canadian physicist and astronomer Robert Rutledge. He says that’s about the same number of stars as in our galaxy, the Milky Way. Dvorsky’s post explains how  Rutledge figured the odds.

But according to Walt Hickey at FiveThirtyEight, the odds are much, much higher. He reports the company that makes the plagiarism detection program Turnitin calculates the “approximate probability that a 16-word phrase in one speech would coincidentally match a phrase of the same length in another speech” is 1 in 1 trillion.

Politicians analyzed the plagiarism numerically too. Aspiring US Attorney General Chris Christie pointed out that the disputed passages amounted to only about 7% of Ms. Trump’s address.

Christie’s 7% solution for belittling the offense was at least a more upfront acceptance of the obvious than the Trump campaign’s initial response. Minions spent a day spouting an increasingly fantastic string of denials. (Katie Reilly provided a chronological list at Time’s Swampland.) This tactic was ultimately unsuccessful, but fully consistent with the campaign’s overall strategy of trying to sell voters a cornucopia of alternate realities.

It took a while, but the campaign did eventually acknowledge Michelle Obama’s authorship. It produced longtime Trump staffer Meredith McIver, who fell loyally on her sword and took the blame for cribbing.  She said she offered to resign but was told that anybody can make a mistake. Her resignation was generously declined.

My favorite analysis of this ploy comes from the journalism éminence grise Roy Peter Clark, who observed at Poynter: “It is possible that McIver is the sacrificial lamb, that she played no real part in the scandal, and that her reward is a continuation of her long service to the Trump family.”

Clark offers a number of strategies for avoiding inadvertent plagiarism due to sloppy note-keeping. One is paper-based, using two different conventional notebooks, but it could be adapted easily to digital notes.

He uses a black notebook for his own thoughts and ideas, a green one for his sources. That would be simple enough to do on electronic devices using two different note-taking programs. There are plenty of good ones to choose from, both plain-text and those that can capture web pages.

I suppose it could be doable in a single program or app too, one that was flexible enough to permit distinguishing between the two kinds of information (for example by color-coding.) But with the one-program approach, opportunities for screwing up would be plentiful. Using different software would be safer. As long as you remembered which was which.

The Melania/Michelle event spawned posts arguing that copying the work of others is no big deal in many countries, and sometimes even encouraged. The idea that appropriating the words and work of others is sinful is a recent and specifically American invention, according to the aptly named English prof Karen Swallow Prior. At Vox, she traces it to the US Copyright Act of 1790.

Before the speech, Ms. Trump had told NBC News that she wrote it mostly herself. “If we take her at her word, then it is helpful to look at the post-communist educational system that Melania experienced growing up in Slovenia. In that system, what is typically considered plagiarism or cheating was exceedingly common and even encouraged,” Monika Nalepa tells us at the Monkey Cage.

Alan Levinovitz, a professor of religion, writes at Vox that when he taught in China, his students were surprised by his strict rules against copying. International students are responsible for many (but by no means all) cases of plagiarism in his current classes in Virginia, he says.

Plagiarism is a huge topic, nowhere more than in science. Instances of copying language–and, worse, work–are described almost daily at the invaluable blog Retraction Watch. The blog’s founders, Adam Marcus and Ivan Oransky, enjoy themselves at Lab Times enumerating euphemisms that journals have invented to avoid using the word plagiarism in retraction notices. Some examples: “unattributed overlap,” “administrative error,” “significant originality issue”. A favorite: “Some sentences…are directly taken from other papers, which could be viewed as a form of plagiarism.” Marcus and Oransky remark, “We await word on what else it could be viewed as.”

Accessibility of psychology papers: Will you sign the More Open Access Pledge?

In an ideal world, our knowledge would be of high quality and it would be accessible to all.

This guest blogpost is about an initiative to speed up accessibility: the More Open Access Pledge. I’m hoping that you will sign up and encourage others to do so too.

Introducing Eva Alisic

Our guest blogger is a senior research fellow at Monash University, Australia, where she leads the Trauma Recovery Lab, and a visiting scholar at the University Children’s Hospital Zurich, Switzerland. Her team studies how children, young people, and families cope with traumatic experiences, and how professionals can support them. Parts of this blogpost have been published on the Trauma Recovery blog.

Therapists cannot access therapy literature

We recently examined how open the literature on Posttraumatic Stress Disorder is. Unfortunately not very: 58% of the publications were behind a paywall.

It is worrying that practicing psychologists cannot access the latest research on therapy effectiveness. Or on how to deal with dropout from interventions. Or on clients’ perspectives.

The migration crisis and refugees are on my mind a lot these days. How can we justify that relevant knowledge is unavailable to support those in need?

Practitioners are not the only ones with little access to the latest evidence. The same applies to many scholars in low-resource settings, policy makers, and citizens in general. Much research is behind a paywall, even though it was funded with public money. This system is lucrative for the publishers of certain ‘traditional journals’, which charge extraordinary amounts of subscription money to university libraries.

Getting radical

I have decided to be radical about it: since last month, I submit my first-authored research papers to Open Access outlets only. I’m moving to reviewing exclusively for Open Access journals, reflecting on my citation practices and exploring my Open Data possibilities.

With Open Access outlets, I do not mean traditional journals that offer authors the option to make a single article available to everyone. That means paying the same publisher twice and does not change the system. I also do not mean falling into the hands of predatory publishers (see e.g. Beall’s list and ThinkCheckSubmit).

Several quality Open Access journals accept psychology articles. Examples are PLOS ONE, PLOS Medicine, PeerJ, and, in my specific field, the European Journal of Psychotraumatology. Some of them charge authors high amounts of money though, creating further inequality between researchers from high- and low-income settings. There are also interesting developments around pre-print platforms and overlay journals.

Many see my move as ‘career suicide’, as I will not submit my papers to top journals. Even colleagues with a strong commitment to Open Access feel they cannot take the risk.

It says a lot about how much we focus on reputation of journals in contrast to quality and accessibility of knowledge itself. And it reinforces why radical stances like mine are necessary to change the system.

Nevertheless, making a smaller move towards open access is much better than making no move at all. If many people do this, it will make a difference. That is why, together with members of the Global Young Academy, we recently launched the More Open Access pledge.

The More Open Access Pledge

The Global Young Academy is a 200-strong worldwide organization of early- and mid-career researchers. They are passionate about science communication, science advice, and science education. Open Science is a key interest, which led to statements on Open Data and Open Access.

Last week, 132 members and alumni pledged to submit at least 1 manuscript to an Open Access outlet in the remainder of 2016. The outlet can be either an Open Access journal or a well-recognized platform (e.g. ArXiv for physics), as long as the manuscript is peer-reviewed and shared without an embargo period.

The goal of the pledge is to accelerate the move towards Open Access in a way that is feasible for most researchers, irrespective of discipline, seniority, or resources.

As one signatory commented: “It makes so much more sense to start with a low threshold self-commitment pledge rather than ‘I will publish most of my articles in OA’ and other manifestos out there that are unpractical for the majority of researchers.”

You can join the pledge

We hope that many people will join the pledge for More Open Access.

You can read more and sign up right away. And we hope that you will start conversations with colleagues and encourage them to get involved too.


A bad response to the crisis of untrustworthiness in clinical psychological science

A new Call for Papers establishes a place for failed replications and null findings in clinical psychology in an American Psychological Association journal. Unfortunately, the journal lacks an impact factor, despite having been published for decades.

There are lots of reasons that establishing such a ghetto where failed replications and null findings can be herded and ignored is a bad idea. I provide nine. I’m sure there are more.

But the critical issue in the creation of such ghettos is that they reduce pressure on the APA vanity journal, Journal of Consulting and Clinical Psychology, to reform questionable publication practices and routinely accept replications and null findings.

 Clinical psychology is different

  • The untrustworthiness in clinical psychological science is serious, but different than that of personality and social psychology, and the crisis it poses requires different solutions.
  • There is little harm in not being able to replicate personality and social psychology studies, beyond the damage to the credibility of those fields and the investigators within them.
  • However, untrustworthy findings in clinical psychology – whether they are exaggerated or simply false – can translate into ineffective and even harmful services being delivered, along with poor commitment of scarce resources to where they are needed less.
  • Personality and social psychologists can look to organized mass replication efforts to assess the reproducibility of findings in their fields. However, such efforts are best undertaken with Internet-recruited and student samples using surveys and simple tasks.
  • Mass replication efforts are less suitable for key areas of clinical psychology research, which often depend on expensive clinical trials with patients and extended follow-up. Of course, research in clinical psychology benefits from independent replication, but it is unlikely to occur on a mass basis.

Efforts to improve the trustworthiness of clinical psychology should have progressed more, but they have not.

Clinical psychology has greater contact than personality and social psychology with the biomedical literature, where untrustworthy findings can have more serious implications for health and mortality.

In response to repeated demonstrations of untrustworthy findings, medical journals have mandated reforms such as preregistration, CONSORT checklists for reporting, transparency of methods and results using supplements, declarations of conflicts of interest, and requirements for the routine sharing of data.  Implementation of these reforms in medical journals is incomplete and enforcement is inconsistent, with clear signs of resistance from some prestigious journals. Note for instance, the editor of the New England Journal of Medicine warning that routine sharing of data from clinical trials would produce “research parasites” who would put the data to different purposes than intended by the original authors.

While many of these reforms have been nominally endorsed by specialty clinical psychology journals, they are largely ignored in the review and acceptance of manuscripts. For instance, a recent systematic review published in JCCP  of randomized trials published in the most prestigious clinical psychology journals in 2013 identified 165 RCTs. Of them,

  • 73 (44%) RCTs were registered.
  • 25 (15%) were registered prospectively.
  • Of registered RCTs, only 42 (58%) indicated registration status in the publication.
  • Only 2 (1% of all trials) were registered prospectively and defined primary outcomes completely.
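The percentages in this list rest on two different denominators, which is easy to miss. The short Python sketch below simply recomputes them from the counts given above. The variable names are mine, and the count of unregistered trials is derived by subtraction rather than quoted from the review.

```python
# Counts as reported in the list above.
total_rcts = 165
registered = 73
prospectively_registered = 25
reported_registration_status = 42
prospective_with_complete_outcomes = 2

def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

# Derived by subtraction, not stated in the source.
unregistered = total_rcts - registered

print(f"Registered:                 {pct(registered, total_rcts)}% of all RCTs")
print(f"Prospectively registered:   {pct(prospectively_registered, total_rcts)}% of all RCTs")
print(f"Reported registration:      {pct(reported_registration_status, registered)}% of registered RCTs")
print(f"Prospective and complete:   {pct(prospective_with_complete_outcomes, total_rcts)}% of all RCTs")
print(f"Never registered (derived): {unregistered} trials, {pct(unregistered, total_rcts)}% of all RCTs")
```

Note that the 58% figure uses the 73 registered trials as its base, while the other three use all 165. Expressed against all 165 trials, only about a quarter mentioned a registration in the published report.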

Apparently not only are investigators failing to register their trials, but editors and reviewers ignore whether registration has occurred and don’t bother to check whether what is reported in a manuscript is consistent with what was proposed in a registration.

Questionable research practices in clinical psychology

The crisis in clinical psychological science lies in its evidence base:

  • RCTs are underpowered, yet consistently obtain positive results by redefining the primary outcomes after results are known.
  • Typical RCTs are small, methodologically flawed studies conducted by investigators with strong allegiances to one of the treatments being evaluated.
  • The treatment preferred by investigators is a better predictor of the outcome of RCTs than the specific treatment being evaluated.

Questionable publication practices in clinical psychology

Questionable research practices (QRPs) in clinical psychology are maintained and amplified by questionable publication practices (QPPs).

The premier psychology journal for publishing randomized trials is Journal of Consulting and Clinical Psychology. It is a vanity journal with a strong confirmation bias and a distinct aversion to publishing null findings and replications. Until recently, letters to the editor were not even allowed. When the ban was relaxed a few years ago, a high bar was set for accepting them. Statistics about the rate of acceptance of letters to the editor are not available, but accounts from colleagues suggest that criticisms of basic flaws in articles that have been published are suppressed. JCCP is not a journal hospitable to post-publication peer review.

Publication of flawed studies in JCCP goes undetected and unannounced, except through alternative post-publication peer review outside the journal, such as PubMed Commons comments and blogging.

Although the term “Pink Floyd rejection” was originally developed by an outgoing editor of the Association for Psychological Science’s Psychological Science, it captures well the editorial practices of JCCP.


Call for Brief Reports: Null Results and Failures to Replicate

An APA press release announced:

Journal of Psychotherapy Integration will start publishing a new recurring brief reports section titled, “Surprise, Surprise: Interesting Null Results and Failures to Replicate.”

In an era when findings from psychological science are called into question, it is especially important to publish carefully constructed studies that yield surprising null results and/or failures at replicating “known” effects.

The following 2012 article published in Journal of Psychotherapy Integration is a good example of a paper that would be appropriate for this section:

DeGeorge, J., & Constantino, M. (2012). Perceptions of analogue therapist empathy as a function of salient experience. Journal of Psychotherapy Integration, 22, 52-59.

Submitted manuscripts should not exceed 2500 words, including references. Manuscript should be submitted electronically through the journal’s submission portal under Instructions to Authors.

Please note in your cover letter that you are submitting for this brief reports section. We look forward to your submissions!

What’s wrong with this resting place for failures to replicate and null findings?

  1. Authors undertaking replications, regardless of whether they succeed in confirming past findings, are entitled to a journal with an impact factor.
  2. The title Journal of Psychotherapy Integration adds nothing to electronic bibliographic searches because “psychotherapy integration” is not what failures to replicate and null findings necessarily represent. Locating particular articles in electronic bibliographic searches is often fortuitous. Readers’ decisions to click on a title to examine the abstract depend on their recognizing the relevance of the article from the title of the journal in which it is published.
  3. The title to this special section is demeaning. If it is a joke, it will soon wear thin.
  4. Failures to replicate and null findings are not necessarily “surprises” given the untrustworthiness of the clinical psychology literature.
  5. Reasons for the failure to replicate previously published clinical trials often lie in the conduct and reporting of the original studies themselves. Yet having been granted “peer-reviewed” status in a more prestigious journal, the original articles are automatically granted more credibility than the failure to replicate them.
  6. A word limit of 2500 is hardly adequate to describe methods and results, yet there is no provision for web-based supplements to present further details. The value in failures to replicate and null findings lies in part in the ability to make sense of the apparent discrepancy with past studies. Confining such papers to 2500 words reduces the likelihood that the discussion will be meaningful.
  7. The existence of such a ghetto to which these papers can be herded takes pressure off the vanity JCCP to reform its publication practices. Editors can perceive when studies are likely to be failed attempts at replications or null findings and issue desk rejections for manuscripts with a standard form letter suggesting resubmitting to the Journal of Psychotherapy Integration.
  8. Providing such a ghetto is APA’s alternative to acceptance of a Pottery Barn rule, whereby if JCCP publishes a clinical trial, it incurs an obligation to publish attempted replications, regardless of whether results are consistent with the study being replicated.
  9. Without journal reform, publication in JCCP represents a biased sampling of evidence for particular psychotherapies with a strong confirmation bias.

Clinical psychology doesn’t need such silliness

Initiatives such as this call for papers are a distraction from the urgent need to clean up the clinical psychology literature. We need to confront directly JCCP’s policy of limiting publication to articles that are newsworthy and that claim to be innovative, at the expense of being robust and solid clinical psychological science.

Some personality and social psychologists involved in the replication initiative have received recognition and endorsement from the two professional organizations competing for the highest impact factors in psychology, the Association for Psychological Science and the American Psychological Association. Those of us who continue to call on social media for reform of the vanity journals are often met with a flurry of negative responses from the replicators, who praise the professional organizations for their commitment to open psychological science.

Have the replicators sold out the movement to reform psychology by leaving the vanity journals intact? As I’ve argued elsewhere, compromises worked out for the replicability project may adversely affect efforts to improve the trustworthiness of clinical psychological science, even though the stakes there are higher.




Mindfulness-based stress reduction for improving sleep among cancer patients: A disappointing look

Continuing to probe studies of mindfulness-based stress reduction (MBSR) for health problems, I turned to some contradictory claims that an investigator had made about her trial of MBSR for improving the sleep of cancer patients.

  • I noticed things in a CONSORT flowchart in the article that the editor and reviewers should have flagged as a serious limitation of the study and one worthy of acknowledgment and discussion.
  • What I saw undercut the validity of complicated statistical analyses on which the author’s claims depended, as well as any credibility to claims about efficacy of MBSR.
  • Promoters of MBSR desperately need to demonstrate that the treatment is as good as or better than alternatives. This study does not contribute credible, favorable evidence, despite being dressed up to do so.
  • It is time to attach an expression of concern to MBSR studies:

Warning! Likely to contain exaggerations and distortions favoring MBSR. Not suitable as a basis for decision-making as to whether to seek, provide, or commit public resources to MBSR.

There’s a growing sense that claims about MBSR are overblown and based on spun and low-quality evidence largely generated by enthusiasts and promoters with undeclared conflicts of interest. I had thought, though, that someone who is motivated but not caught up in all the fanfare could come to an independent judgment of the available literature. Well, it takes too much work.

I’m losing confidence that anyone can evaluate MBSR studies without a concerted effort to cut through hype and hokum, probing to a level of detail that the quality of evidence ultimately does not justify.

Simply put, it takes too much effort for outsiders – researchers, clinicians, and patients – to grasp how they are being misled by the mindfulness literature.

What we don’t know about MBSR for sleep problems

A comprehensive systematic review and meta-analysis prepared for the US Agency for Healthcare Research and Quality (AHRQ):

Goyal M, Singh S, Sibinga EMS, et al. Meditation programs for psychological stress and well-being: a systematic review and meta-analysis. JAMA Intern Med. Epub Jan 6 2014. doi:10.1001/jamainternmed.2013.13018.

The review covered 18,753 citations and found only 47 trials (3%), with 3515 participants, that included an active control treatment.

The dismal conclusion:

 We found low evidence of no effect or insufficient evidence of any effect of meditation programs on positive mood, attention, substance use, eating habits, sleep, and weight.

The results of the study that I’m going to be discussing became available after the systematic review. The primary outcome paper was published in a prestigious journal. Maybe, I had hoped, it could represent a sorely needed contribution to the limited evidence available for strong claims that MBSR is a cure for whatever ails you. No, it was not.

But why should we expect MBSR to improve sleep? A Buddhist neuroscientist expressed doubt.

Willoughby Britton, PhD is a clinical psychologist, neuroscience researcher, and Buddhist practitioner. As Assistant Professor of Psychiatry and Human Behavior at Brown University Medical School, she specializes in research on meditation in education and as treatment for depression and sleep disorders. She was interviewed by Tricycle, a respected magazine of Buddhist thought that has been around since 1990. She was asked: “Is the data better for some applications of meditation than others?”

I have done very careful reviews of the efficacy of meditation in two areas in which there are high levels of popular misconception about how much data we have: sleep and education. The data for sleep, for example, is really not that strong. And the AHRQ article concurs: it judges the level of evidence for meditation’s ability to improve sleep as “insufficient.”

What I found from my study was that meditation made people’s brains more awake. From a very basic brain point of view, what happens in your brain when you fall asleep? The frontal cortex deactivates. Nobody agrees what meditation does to the brain, but across the board, one of the most common findings is that meditation increases blood flow and activity in the prefrontal cortex. So how is that going to improve sleep? It doesn’t make any sense. It is completely incompatible with sleeping if you are doing it right. And we know that people stop sleeping when they go on retreats. That is never reported in scientific publications, even though it is well known among practitioners.

A tale of a study of MBSR and CBT to improve sleep problems in cancer patients thrice told.

 The primary report of the study appeared in the prestigious Journal of Clinical Oncology:

Garland, S. N., Carlson, L. E., Stephens, A. J., Antle, M. C., Samuels, C., & Campbell, T. S. (2014). Mindfulness-based stress reduction compared with cognitive behavioral therapy for the treatment of insomnia comorbid with cancer: A randomized, partially blinded, noninferiority trial. Journal of Clinical Oncology, JCO-2012.

The article concluded:

 Although MBSR produced a clinically significant change in sleep and psychological outcomes, CBT-I was associated with rapid and durable improvement and remains the best choice for the nonpharmacologic treatment of insomnia.

A conference abstract reporting the study published the same year concluded:

While both CBT-I and MBSR produced significant improvement in sleep and psychological outcomes, a more rapid change occurred in CBT-I.

The principal investigator’s review of her own work, published two years later, concluded:

These findings indicated that while MBCR was slower to take effect, it could be as effective as the gold-standard treatment for insomnia in cancer survivors over time.

Delving into the details of the study

 From the abstract:

This was a randomized, partially blinded, noninferiority trial involving patients with cancer with insomnia recruited from a tertiary cancer center in Calgary, Alberta, Canada, from September 2008 to March 2011. Assessments were conducted at baseline, after the program, and after 3 months of follow-up. The noninferiority margin was 4 points measured by the Insomnia Severity Index. Sleep diaries and actigraphy measured sleep onset latency (SOL), wake after sleep onset (WASO), total sleep time (TST), and sleep efficiency. Secondary outcomes included sleep quality, sleep beliefs, mood, and stress.


Of 327 patients screened, 111 were randomly assigned (CBT-I, n = 47; MBSR, n = 64). MBSR was inferior to CBT-I for improving insomnia severity immediately after the program (P < .35), but MBSR demonstrated noninferiority at follow-up (P < .02). Sleep diary–measured SOL was reduced by 22 minutes in the CBT-I group and by 14 minutes in the MBSR group at follow-up. Similar reductions in WASO were observed for both groups. TST increased by 0.60 hours for CBT-I and 0.75 hours for MBSR. CBT-I improved sleep quality (P < .001) and dysfunctional sleep beliefs (P < .001), whereas both groups experienced reduced stress (P < .001) and mood disturbance (P < .001).

[For more information about a noninferiority (NI) trial, see here.]


The objective of non-inferiority trials is to compare a novel treatment to an active treatment with a view of demonstrating that it is not clinically worse with regards to a specified endpoint.

Investigators commit themselves to a pre-set difference between the two interventions that would satisfy them that the treatment was inferior, if they found it. In an earlier blog post, I noted that NI RCTs have a reputation for methodological flaws and bias:

An NI RCT commits investigators and readers to accepting null results as support for a new treatment because it is no worse than an existing one. Suspicions are immediately raised as to why investigators might want to make that point.
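For readers unfamiliar with how a noninferiority claim is usually decided, here is a minimal sketch of the textbook logic: the new treatment is declared noninferior only if the whole confidence interval for the difference stays below the pre-set margin. The margin of 4 points comes from the abstract above. Everything else here is an assumption for illustration: the function name, the differences, the standard errors, and the use of z = 1.96. None of these numbers come from the trial, and this is not necessarily the procedure the authors followed.

```python
def noninferiority_check(diff, se, margin, z=1.96):
    """Compare a confidence interval for a treatment difference to an NI margin.

    diff   : new treatment minus standard treatment, on a scale where
             higher scores are worse (as with an insomnia severity score)
    se     : standard error of that difference
    margin : largest difference still accepted as "not clinically worse"
    z      : critical value for the interval (1.96 is an assumed default)
    """
    lower = diff - z * se
    upper = diff + z * se
    # Noninferior only if even the pessimistic end of the interval
    # falls short of the margin.
    return {"ci": (lower, upper), "noninferior": upper < margin}


MARGIN = 4  # points, as stated in the abstract

# Hypothetical scenarios, NOT results from the trial.
precise_small_diff = noninferiority_check(diff=1.0, se=1.0, margin=MARGIN)
precise_large_diff = noninferiority_check(diff=2.5, se=1.0, margin=MARGIN)
imprecise_small_diff = noninferiority_check(diff=1.0, se=2.0, margin=MARGIN)

for name, result in [
    ("precise, small difference", precise_small_diff),
    ("precise, larger difference", precise_large_diff),
    ("imprecise, small difference", imprecise_small_diff),
]:
    print(name, result)
```

The third scenario is the one to keep in mind: with the same small difference but a wider interval, as happens when a sample is small or shrinks, the interval crosses the margin and noninferiority cannot be claimed.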

The trial had no control group from which it could be determined whether the benefits of either intervention exceeded what would be obtained with a nonspecific treatment that had no active ingredient beyond positive expectations, support, and attention.

The results of the trial were analyzed both intent to treat and per protocol. The intent-to-treat analyses included all patients who were randomized, regardless of the extent to which they actually attended treatment. The per-protocol analyses included only patients who attended at least 5 sessions.

The description of the analyses is likely to dazzle most readers and impress them that the authors knew what they were doing in applying sophisticated techniques, that is, if readers are unfamiliar with these techniques and the assumptions they make.

For each of the models, the random effect was participant, and the fixed effects were group (MBSR or CBT-I), time, baseline value, and the group-time interaction. Time was also set as a repeated measure. The restricted maximum likelihood estimate method was used to estimate the model parameters and SEs with a compound symmetry covariance structure to account for the correlation between measurements. We used type III fixed effects (F and t) and set the statistical significance of P values at P<.05.

A skeptic would figure out that the authors probably had to contend with a lot of missing data.


 This excerpt from the CONSORT flow chart tracks what happened after random assignment to either CBT or MBSR.

[Figure: excerpt from the CONSORT flow chart showing attrition]

Only half of the patients assigned to MBSR actually attended the pre-set minimal number of sessions. Of 64 patients, 22 withdrew, another 2 attended no sessions, and 8 attended fewer than 5. CBT fared considerably better, with only 7 patients either not attending or withdrawing.

At 5 month follow up, the situation for MBSR worsened: only 27 patients, a minority of those who had been randomized, were left to provide data for analysis.
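The arithmetic behind these statements is easy to check. The short sketch below uses only the counts quoted in the text; the variable names are mine, and the CBT figure counts patients who neither withdrew nor failed to attend, which is not necessarily the same as the per-protocol definition.

```python
# Counts as described in the text, read off the CONSORT flow chart excerpt.
mbsr_randomized = 64
mbsr_withdrew = 22
mbsr_no_sessions = 2
mbsr_under_5_sessions = 8
mbsr_followup = 27  # MBSR patients providing data at follow-up

cbt_randomized = 47
cbt_lost = 7  # either did not attend or withdrew


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


# MBSR patients who attended at least the minimum of 5 sessions.
mbsr_per_protocol = (
    mbsr_randomized - mbsr_withdrew - mbsr_no_sessions - mbsr_under_5_sessions
)
# CBT patients who neither withdrew nor failed to attend.
cbt_retained = cbt_randomized - cbt_lost

print("MBSR attended 5+ sessions:", mbsr_per_protocol, pct(mbsr_per_protocol, mbsr_randomized))
print("CBT not lost:", cbt_retained, pct(cbt_retained, cbt_randomized))
print("MBSR at follow-up:", mbsr_followup, pct(mbsr_followup, mbsr_randomized))
```

That works out to 32 of 64 MBSR patients (50%) meeting the attendance minimum, 40 of 47 CBT patients (about 85%) neither withdrawing nor failing to attend, and 27 of 64 MBSR patients (about 42%) providing follow-up data.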

The authors adopted their complex analytic strategy to compensate for missing data. The strategy basically involves using all available data to guess what the results would have been for individual patients if their data had been available. Yup, they were inventing data based on a best guess.

These sophisticated techniques are only valid if most data for most patients remain available and the assumption can be made that the loss of patients is random. But in this case, loss was not random: patients assigned to MBSR were less likely to stick around. We are not in a position to know, but there is undoubtedly other nonrandom loss.
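To see why the reason for dropout matters, consider a toy simulation. It is not the authors' mixed model and uses no trial data: the scores, the dropout probabilities, and the cutoff are all hypothetical, chosen only to contrast loss that is unrelated to outcome with loss that is concentrated among patients who are doing poorly.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical follow-up insomnia scores; higher means worse sleep.
true_scores = [random.gauss(15, 5) for _ in range(10_000)]


def observed(scores, p_drop_if_poor, p_drop_if_good, cutoff=15):
    """Return the scores that remain after simulated dropout.

    Patients scoring above `cutoff` (doing poorly) drop out with
    probability `p_drop_if_poor`; the rest with `p_drop_if_good`.
    """
    kept = []
    for score in scores:
        p_drop = p_drop_if_poor if score > cutoff else p_drop_if_good
        if random.random() >= p_drop:
            kept.append(score)
    return kept


# Same overall dropout rate, very different mechanisms.
random_loss = observed(true_scores, p_drop_if_poor=0.5, p_drop_if_good=0.5)
nonrandom_loss = observed(true_scores, p_drop_if_poor=0.8, p_drop_if_good=0.2)

true_mean = statistics.mean(true_scores)
random_mean = statistics.mean(random_loss)
nonrandom_mean = statistics.mean(nonrandom_loss)

print("mean if everyone had stayed:", round(true_mean, 2))
print("mean after random loss:", round(random_mean, 2))
print("mean after loss of those doing poorly:", round(nonrandom_mean, 2))
```

When dropout is unrelated to outcome, the remaining patients still look like the full group. When those doing poorly are the ones who leave, the patients who remain make the treatment look better than it was, and no amount of modeling can fully repair that unless the reasons for leaving are captured in the data that were collected.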


Results that depend so much on guesstimates from so much missing data are not reliable or generalizable.

The study started out small and got smaller because of patient attrition.

The study did not have a nonspecific control group. Yet, judging from the rest of the literature, it is unlikely that a superiority of MBSR over nonspecific treatment could be demonstrated with such a small sample, and certainly with the sample left after attrition.

Most psychotherapy research experts would not expect such a small study to be able to detect a difference between two active treatments. So, calling this a “noninferiority trial” is a cop-out that serves to hide the low likelihood of finding a difference.

Appreciate what the author is asking of us – that we revise our appraisal of MBSR for insomnia from “weak or no evidence” to “equivalent to gold standard treatment” on the basis of this study. We are asked to do this based on what shrank to an underpowered study in which most patients assigned to MBSR weren’t around for follow up and there is a heavy reliance on tortured, post hoc analyses of secondary outcomes. No, thank you.

To improve the credibility of their claims, promoters of MBSR desperately need to demonstrate that the treatment is as good as or better than alternatives. This study is not a fair demonstration of that. The high rate of nonretention of patients after being assigned to MBSR should be quite troubling to anyone promoting MBSR for whatever ails you.