Guest blog post: Fact checking dubious therapy workshop adverts and presenters

This guest post is an excellent follow-up to my debunking of the neurononsense [1,2] used to promote psychotherapy trainings as better and more closely tied to brain science than their competitors. My good friend and neuroscientist Deborah Apthorp* wrote it, prompted by an email advertisement she received by way of her university staff email list. She takes a skeptical look at advertisements for workshops making wild claims from presenters whose credentials prove dubious when checked against information available on the internet. She presents evidence for skepticism about the workshop offerings and demonstrates a set of fact-checking strategies useful to anyone looking for continuing education credit or simply seeking to enhance their ability to serve their clients/patients.

It is disconcerting that these workshops are marketed in university settings and seem to offer continuing education credit. I have trouble believing that academics will be attracted by the outrageous claims, and at least some of the credentials claimed by presenters should trigger academics’ skepticism.

But some of the target clinician audience are masters-level practitioners or holders of practitioner-oriented PsyDs and PhDs. Their training did not include the critical skills needed to evaluate claims about research. And an unknown but large part of the audience are not formally trained, licensed, or regulated therapists. They get by in a shadowy area of practice as coaches or counselors, or go by other terms that sound as if they are licensed, but they aren’t. These practitioners don’t have to answer to regulatory bodies that set minimal levels of training and expertise or even ethical constraints on what they can do. Instead, they acquire dubious certifications from official-sounding boards set up and controlled by those profiting from the workshops. The certification does not really mean much outside the world of the workshops, but it is intended to convey status and confidence to unsuspecting people seeking services. And those who attend these workshops can find themselves in a Ponzi scheme where they have to attend more workshops to maintain their level of credentialing, with no way to get credit by other means.

Here’s Deborah…

So this lobbed into my inbox today, via the official channel of our staff email list:

Mindfulness, Neuroscience and Attachment Theory: A Powerful Approach for Changing the Brain, Transforming Negative Emotions and Improving Client Outcomes

The course costs $335.00 and is available in all of Australia’s major capital cities. It’s being held at mostly conference-centre type venues, so presumably they’re expecting pretty big numbers. The promises are pretty big too, and as a neuroscientist my alarm bells immediately started ringing.

“….advances in neuroscience and attachment theory have led to revolutionary work in the application of mindfulness in the treatment of anxiety, depression emotional dysregulation, anger and stress.”

“…In this seminar, we will explore an integrated approach — incorporating advances in neuroscience, new insights about attachment theory and The Five Core Skills of Mindfulness — that accelerates healthy change and improves client outcomes.”

“…Take home cutting-edge information on the interface between neuroscience, mindfulness and therapy.”

Is this workshop endorsed by the APS? Apparently not, though the organisers are somewhat evasive about it: “APS: Activities do not need to be endorsed by APS. Members can accrue 7 CPD hours by participating in this activity.”

So who is this Terry Fralich (LCPC)? (And what does that stand for? Licensed Clinical Professional Counsellor, apparently, although it’s not clear which body did the licensing.) According to the official website, “Terry Fralich is an adjunct faculty member of the University of Southern Maine Graduate School and a Co-Founder of the Mindfulness Centre of Southern Maine.” However, although there is a Ms. Julie Fralich on the official University of Southern Maine faculty list, there is no Terry Fralich listed. The only related mention anywhere on the university website is of his wife Rebecca Wing (a co-presenter at the workshops and co-founder of the Mindfulness Center – see below), who is an alumna of their School of Music (class of ’84).

He does show up on a lot of sites about mindfulness, the top hit being his “Mindfulness Retreat Center of Maine”, which showcases its lovely views and comfortable accommodation (prices are available on application). They also sell “Books and CDs”, although the only actual book listed is Mr. Fralich’s book “Cultivating Lasting Happiness – a 7-step guide to mindfulness”. According to Amazon, this seems to have been the only book he has written (reviews are generally positive, though one reader found it did not cover any new ground). It seems to be a pretty standard practical guide to mindfulness meditation – nothing wrong with that in itself, I guess.

So where are this guy’s credentials in neuroscience and attachment theory? A search on Google Scholar turned up only the aforementioned book and no academic papers. His only relevant qualification seems to be a Masters Degree in Clinical Counselling (although I could not find out where this qualification was obtained – if anyone knows, mention it in the comments). Apparently he has studied with the Dalai Lama for more than 25 years; according to his website, “Prior to becoming a mindfulness therapist, academic and counsellor, Terry was an attorney who practiced law in New York City, Los Angeles and Portland, Maine.” I guess that experience ought to have made him careful about making claims which can’t be verified.

Here’s a YouTube teaser for one of his lectures.

I also found a link to a PDF for the program.

It incorporates sciencey-sounding things like “The triune brain” (huh?), “Fight-or-flight-or-freeze and stress responses”, and of course today’s essential buzzword, “neuroplasticity”. A particularly scary phrase is “Reconsolidation of negative memories: transforming unhealthy patterns and messages.” How are they going to teach therapists to do this – these people who have no training at all in neuroscience, attachment theory, memory or indeed, it seems, even CBT?

Delving a little deeper, I had a look at the list of trainers on Tatra Training’s website. It seems that a number of them are associated with an organization called the Dialectical Behaviour Therapy National Certification and Accreditation Association (DBTNCAA), allegedly “the first active organisation to certify DBT providers and accredit DBT programs” – notably, the appropriately-named Dr. Cathy Moonshine (alcohol and chemical dependency treatment counselor), and Lane Pederson (PsyD, President/CEO). However, this organization is not in any way endorsed by the founder of Dialectical Behaviour Therapy herself, Marsha Linehan. In fact, there is a disclaimer on Cathy Moonshine’s site to this effect:

“All trainings, clinical support, and products sold by Dr. Moonshine are of her own creation without collaboration with Dr. Linehan, or Dr. Linehan’s affiliated company, Behavioral Tech, LLC. Dr. Moonshine’s products are not sanctioned by, sponsored, licensed, or affiliated with Dr. Linehan and/or Behavioral Tech, LLC.”

Thus, it seems Linehan herself has a competing company, but she does at least have an impressive CV with many research articles to back her up. I attempted to contact her for comment on the DBTNCAA and Tatra Training, but she has not yet replied.

Let’s have a look at some of the other “trainers” and their biographies. Dr. Daniel Short is listed as a Faculty Member at Argosy University, a for-profit college in Minnesota that has changed its name and is now being sued by former students for fraud.

Dr Gregory Lester’s biography claims that he has published papers in “The Journal of the American Medical Association, The Western Journal of Medicine, The Journal of Marriage and Family Therapy, The Journal of Behaviour Therapy, Emergency Medicine News, The Yearbook of Family Practice, The Transactional Analysis Journal, and The Sceptical Inquirer”. But a PubMed search reveals none of these publications. An online list from his own website reveals very few relevant publications, and does not even include all of the outlets listed above; instead, there are things like “Dealing with the Difficult Diner”, in Restaurant Hospitality, and “Dealing with personality disorders” in The Priest Magazine. In addition, there are several books, which you can presumably buy at his workshops.

Interestingly, his bio also states that “… he has specialised in Personality Disorders for over 25 years, and has been a participant in multiple studies that form the basis for the DSM V revision of the section on Personality Disorders.” He has participated in these studies? Was he a control, or does he have a personality disorder himself? Because he certainly wasn’t an author on any of these studies.

Dr Brett Deacon seems to check out OK. Surprising to find him associated with this bunch.

Dr Daniel Fox is said to be the author of “numerous articles on personality, ethics, and neurofeedback”, but only one on neurofeedback (in Applied Psychophysiology and Biofeedback) turned up in PubMed, and it seems to be a review rather than an original research piece. An author search on “Fox, DJ [AU] and Ethics” returned no hits, and neither did “personality”. He also seems to have a book, which comes with optional seminar bundles!

Jerold J Kreisman is another of the Tatra stars and seems to feature as the Borderline Personality expert. The site claims breathlessly that he ‘…has appeared on many media programs, including The Oprah Winfrey and Sally Jesse Raphael Shows. He has been listed in “Top Doctors,” “Best Doctors in America,” “Patients’ Choice Doctors,” and “Who’s Who.”’ It also, more seriously, claims he has published “over twenty articles and book chapters”; however, PubMed only turns up four publications, one of which is from 1975 and so is probably not by the same JJ Kreisman. Of the three remaining publications, only one 1996 paper (on which he is third author) is related to BPD, and that one seems to have been subject to an erratum (though the erratum itself seems impossible to find; the article also attracted a Letter to the Editor, which is likewise hard to find). The only relevant publications seem to be, again, pop-psych-style books with titles like I Hate You–Don’t Leave Me: Understanding the Borderline Personality.

What about Ronald Potter-Efron, the facilitator of Healing the Angry Brain: Changing the Brain & Behaviours of Angry, Aggressive, Raging & Domestically Violent Clients? Again, he is a prolific author of self-help books. Google Scholar and Scopus do turn up about five academic publications, all from the late 1980s and early 1990s. Since then, he seems to have turned to the more lucrative self-help industry.

So all in all, it seems the Tatra Training people work via a fairly aggressive marketing campaign to clinical psychology academic departments and, presumably, clinicians themselves, as well as other “Corporate and Allied Health” practitioners. Their main address is in Adelaide, South Australia; Google Street View shows an anonymous-looking office block. Tatra was founded by Hanna Nowicki (LLB, BA Psych., Postgrad. Soc. Admin, Cert IV Training & Workplace Assessment), who seems to have no qualification in psychology other than her B.A. Psych, although this hasn’t stopped her developing and presenting “…multiple workshops on personality disorders, self injury, suicide risk assessment, depression, engagement techniques and introduction to mental health.”

Disturbingly, the list of clients includes many government organisations such as Centrelink, Correctional Services, Housing SA, Worklink Queensland, and more nebulously-named organisations such as “Residential Care Services”, “Brain Injury and Disability Services”, “Public Mental Health Services”, “Hospital Social Work Departments”, and so on.

How much money are these people making per workshop? Well, if the Sydney venue is anything to go by, the Wesley Conference Centre in Sydney seats 875 people, so if that sells out at $335 a head that’s $293,125. (The Wesley Centre does have smaller venues, so perhaps the organisers aren’t expecting such a large crowd. Their general preference for booking conference centres and Leagues Clubs, though, suggests that they are.) If the workshop is held in all five major cities (most are smaller than Sydney, so let’s be conservative and assume gross takings of $200,000 per city), that’s a tidy sum: $1 million per workshop tour, or $2 million per year if only two tours are held, as in 2014 and 2015. Of course one must subtract venue hire, advertising costs, speaker fees, catering, etc., but all the same this seems quite a promising business model, particularly when combined with the in-house training offered.
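For anyone who wants to check the arithmetic, here it is as a few lines of Python. The seat count, ticket price, and per-city estimate are the figures quoted above; everything else is back-of-the-envelope.

```python
# Back-of-the-envelope workshop takings, using the figures quoted above.
seats, ticket = 875, 335                    # Wesley Conference Centre, AUD
print(f"Sydney sellout: ${seats * ticket:,}")       # $293,125

cities, per_city = 5, 200_000               # conservative per-city gross
tour_gross = cities * per_city
print(f"Five-city tour: ${tour_gross:,}")           # $1,000,000
print(f"Two tours a year: ${2 * tour_gross:,}")     # $2,000,000
```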

I am concerned that these people are pushing a product that is not what is advertised, and claiming to be experts when they are not, sometimes supported by claims that seem to border on fraudulent. I am concerned that naïve young mental health professionals, looking for accreditation hours, are being fed misleading information that is not based on scientific evidence. If anyone has direct experience of these workshops, I would be very interested to know about it.

*Deborah Apthorp is a neuroscience researcher working at the Australian National University in Canberra, Australia. She holds an NHMRC Early Career Fellowship, and is interested in EEG, visual neuroscience, visual attention and the dynamics of postural control. In addition to this, she is a keen sailor, cyclist and windsurfer, and a passionate supporter of the Open Science movement.

Here’s her Google Scholar page and her own WordPress blog.

Talking back to the authors of the Northwestern “Blood test for depression” study

[Update 9/25/2014] This post critiques the press coverage of a recent article in Translational Psychiatry concerning whether a blood test for depression will soon be available. A critique of the bad science of the article itself is now available at PLOS Mind the Brain.

Judging from the extraordinary number of articles in the media, as well as the flurry of activity on Twitter, a recent study coming out of Northwestern University is truly a breakthrough in providing a blood test for depression.

Unfortunately, the many articles in the media have a considerable, almost copy/paste redundancy. Just compare them to the Translational Psychiatry article’s press release. In many instances, there is more than churnalism going on; there is outright plagiarism. Media coverage offers very few demurs or dampening qualifications on what the authors claim. How do journalists put their names on such an absence of work?

Similarly, the tweets appear to be retweets of just a couple of messages, although few are labeled as retweets.

I had my usual doubts as to whether the journalists or tweeters had actually read the article. Journalists could always have gone for second opinions to Google Scholar, looked up similarly themed articles, and then contacted the authors of those articles for comments. Journalists could also have loaded the abstract of the Translational Psychiatry article into EtBlast and gotten dozens of recommendations for relevant experts based on text similarity. I see no evidence that this was done.

There must be something intimidating about an article that claims to be testing not for genes, but for gene transcripts associated with depression. Shuts down the critical faculties. Lacking relevant expertise, journalists and tweeters may be inclined to simply defer to the claims of the authors and not further scrutinize the text or tables of the article with whatever relevant knowledge they do have. If this had been done, they might have found things that they could understand that would be very relevant to evaluating the credibility of this article.

Almost all of the hype that has been written about this Translational Psychiatry article originates with its authors, either in the article itself, the press release, or the well-crafted soundbites provided to the media. Yet some of the latter are simply excerpted from the press release and made to look like quotes arising in an interview. I promised a full, thorough demolition of the article, and that will be forthcoming. Here, however, I will analyze some of the statements attributed to two of the authors in the press. There is a fascinating logic, an ideology even, to the statements that is of interest in itself. But you can also take this blog post as a teaser for a post arriving at PLOS Mind the Brain in the next week or two.

Keep in mind, as we scrutinize what the authors say about their study, just how modest it is. The study started by comparing 32 primary care patients participating in a clinical trial to 32 control persons matched for age, ethnicity/race, and gender. Five of the primary care patients were lost to follow-up and another five were lost from the 18-month blood draws. Of the 22 remaining patients, nine were classified as in remission of their depression, and 13 as not in remission.

So basically we are talking about some exceedingly small samples and comparisons of subsamples. These shrink to a comparison of 9 patients in remission and 13 not in remission for any statements about prediction of treatment outcome. In any other context, how could anyone who knows anything about clinical research accept the results of such analyses?

Furthermore, if we want to talk about any differences observed at baseline versus what was seen at follow-up, they could well be attributed to simple selective loss to follow-up. This is just one of many alternative explanations of the results reported for these data that cannot be adequately tested because of the small sample sizes. The article talks about utilizing multivariate statistical controls, but that is statistical malpractice in a sample this size and is highly likely to produce spurious findings.
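To put a number on how little a 9-versus-13 comparison can detect, here is a minimal power simulation. This is my own sketch, not the authors’ analysis, and the assumed true effect of d = 0.5 is arbitrary.

```python
# Simulated power of a two-sample t-test with n = 9 vs n = 13,
# assuming a moderate true effect (d = 0.5). Illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, hits = 20_000, 0
for _ in range(n_sims):
    remitters = rng.normal(0.5, 1, 9)       # assumed true effect d = 0.5
    nonremitters = rng.normal(0.0, 1, 13)
    if stats.ttest_ind(remitters, nonremitters).pvalue < 0.05:
        hits += 1
print(f"power ~ {hits / n_sims:.2f}")       # roughly 0.2
```

A comparison that detects a moderate effect roughly one time in five cannot adjudicate anything, and piling multivariate controls on top only multiplies the opportunities for spurious findings.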

The authors make a number of statements about predicting remission from cognitive behavior therapy, but from the beginning of the study and into follow-up, all of the patients were getting cognitive behavior therapy, and a considerable proportion were getting antidepressants as well. That is no small complication. It is generally assumed that predictors of response to antidepressants should be different from predictors of response to psychotherapy, but there is really no opportunity to examine this within this confounded small sample.

The two authors quoted by name in media coverage are

Eva Redei, PhD, Professor in Psychiatry and Behavioral Sciences and Physiology at Northwestern’s Feinberg School of Medicine in Chicago.

David C. Mohr, PhD, Professor of Preventive Medicine and Director of the Center for Behavioral Intervention Technologies at the Feinberg School of Medicine at Northwestern University.

From an article in Medscape, Blood Test Flags Depression, Predicts Treatment Response:

“We were pleased with these findings, including finding biomarkers that continued to be present after people were effectively treated,” co–lead author David C. Mohr, PhD, professor of preventive medicine and director of the Center for Behavioral Intervention Technologies at the Feinberg School of Medicine at Northwestern University in Chicago, Illinois, told Medscape Medical News.

Dr. Mohr noted that essentially, these are markers of traits ― and may show that certain people have a predisposition to the disorder and can be followed more carefully.

Maybe, maybe not, Dr. Mohr. Aside from your modest sample size and voodoo statistics, it is unclear how clinically useful a trait marker would be. After all, we already have a trait marker in neuroticism, and while it is statistically predictive, it does not do all that well in terms of clinical applications. And the alternative, of course, is simply to have a discussion with patients about the particular symptoms they have and whether alternative explanations can be ruled out.

Recall, Dr. Mohr, that this “trait marker,” as you assumed it to be, was observed in a mildly to moderately depressed sample. Clinical depression is a recurring episodic condition, and this “trait” is not going to be expressing itself in a full-blown episode much of the time.

“Abundance of the DGKA, KIAA1539, and RAPH1 transcripts remained significantly different between subjects with MDD and…controls even after post-CBT remission,” report the investigators.

Well, maybe, but it seems a stretch to make such claims from such limited evidence. The 3 transcripts remaining significant after remission are based on the 9 patients who remitted. Three is different from the 9 of 20 transcripts that differed at baseline, but we don’t know whether this is a matter of loss to follow-up or of remission. And even this reduced number of significant differences, 3, is statistically improbable given the small sample size, even assuming an effect is present. The authors have no business interpreting their data to the press in this fashion.

In addition, these transcripts “demonstrated high discriminative ability” between the 2 groups, regardless of their current clinical status, thus appearing to indicate a vulnerability to depression.

The authors have no business claiming to have demonstrated “high discriminative ability” with such a small sample. Notoriously, such findings do not replicate. There is always a drop in the performance statistics when replication is attempted in a new sample. Comparison with an earlier paper reveals that the authors have not even replicated the findings from their own earlier study of early-onset depression in the present one, and that does not bode well.
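The shrinkage is easy to demonstrate. The sketch below uses pure noise, with sizes loosely modeled on the study (26 transcripts, 32 per group; these are my assumptions, not the authors’ pipeline): it selects the most “discriminative” transcripts in a discovery sample and then scores a fresh sample with them.

```python
# Why small-sample "discriminative ability" evaporates on replication.
# Pure-noise simulation: no transcript truly differs between groups.
import numpy as np

rng = np.random.default_rng(0)
n_transcripts, n = 26, 32                   # assumed sizes, illustrative

def sample():
    # One group of patients and one of controls, all pure noise.
    return (rng.normal(size=(n, n_transcripts)),
            rng.normal(size=(n, n_transcripts)))

def accuracy(patients, controls, markers, signs):
    # Composite score: sum of the selected transcripts, direction-adjusted.
    sp = (patients[:, markers] * signs).sum(axis=1)
    sc = (controls[:, markers] * signs).sum(axis=1)
    return ((sp > 0).mean() + (sc < 0).mean()) / 2

disc_p, disc_c = sample()                   # "discovery" sample
diff = disc_p.mean(axis=0) - disc_c.mean(axis=0)
se = np.sqrt(disc_p.var(axis=0, ddof=1) / n + disc_c.var(axis=0, ddof=1) / n)
markers = np.argsort(-np.abs(diff / se))[:5]  # the 5 "best" discriminators
signs = np.sign(diff[markers])

rep_p, rep_c = sample()                     # independent replication sample
print("discovery accuracy:  ", accuracy(disc_p, disc_c, markers, signs))
print("replication accuracy:", accuracy(rep_p, rep_c, markers, signs))
# Typically ~0.7 in discovery, ~0.5 (chance) on replication.
```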

“This clearly indicates that you can have a blood-based laboratory test for depression, providing a scientific diagnosis in the same way someone is diagnosed with high blood pressure or high cholesterol,” said Dr. Redei.

Maybe someday we will have a blood-based laboratory test for depression, but by themselves, these data do not increase the probability.

“Clinically, simplicity is important. The primary care setting is already completely overburdened. The more we can do to simplify the tasks of these caregivers, the more we’re going to be able to have them implement it,” said Dr. Mohr.

Of all the crass, premature and inaccurate statements I find in this article, this one tops the list. Basically, Dr. Mohr is making a pitch that the blood test he is promoting will free primary care clinicians from having to talk to their patients. All they need to do is give the blood test and prescribe antidepressants.

From A Blood Test for Depression Shows the Illness is not a Matter of Will

“Being aware of people who are more susceptible to recurring depression allows us to monitor them more closely,” said David Mohr, Ph.D., co-lead author of the study in a press release. “They can consider a maintenance dose of antidepressants or continued psychotherapy to diminish the severity of a future episode or prolong the intervals between episodes.”

This advice is not only premature, it is inappropriate for a mildly to moderately depressed sample treated in primary care, where monitoring and follow-up are either nonexistent or grossly inadequate. Dr. Mohr’s suggestion, if it were taken seriously, would lead to overdiagnosis and overtreatment, or to prolonged treatment without follow-up and re-evaluation.

In general, these authors seem cavalier in ignoring the problems of overdiagnosis. Elsewhere, Dr. Redei is asked about it and gives a flippant response:

There’s a lot of concern about overdiagnosis for psychiatric illnesses already. How do you think your findings might affect that issue?

[Dr. Redei] People who worry about overdiagnosis — they are probably right, and they are probably wrong. Because there is potentially a problem with underdiagnosis, too. In the elderly, for example – we say, “Oh, you’re just old. You don’t have any energy, and you don’t want to do anything — you’re just old.”

From Blood Test Spots Adult Depression: Study

The blood test’s accuracy in diagnosing depression is similar to those of standard psychiatric diagnostic interviews, which are about 72 percent to 80 percent effective, she said.

It is irresponsible rubbish to claim that the study showed that these measures of gene expression were as accurate as current interview methods. The study compared 20 different measures of gene expression to an interview by a bachelor-level interviewer using a less than optimal interview schedule that did not allow for explaining any questions or probing the patient’s responses. It certainly would not have been allowed in a study for which the data were to be submitted to the US Food and Drug Administration (FDA). And there was no gold standard beyond that.

Additionally, if the levels of five specific RNA markers line up together, that suggests that the patient will probably respond well to cognitive behavioral therapy, Redei said. “This is the first time that we can predict a response to psychotherapy,” she added.

Again, Dr. Redei, you are talking trash that is not justified by the results of your study. The sample is quite small, and most of the patients who received cognitive behavior therapy also received medication.

The delay between the start of symptoms and diagnosis can range from two months to 40 months, the study authors pointed out.

“The longer this delay is, the harder it is on the patient, their family and environment,” said lead researcher Eva Redei, a professor in psychiatry and behavioral sciences and physiology at Northwestern’s Feinberg School of Medicine in Chicago.

“Additionally, if a patient is not able or willing to communicate with the doctor, the diagnosis is difficult to make,” she said. “If the blood test is positive, that would alert the doctor.”

Perhaps, Dr. Redei, you need to be reminded that you are studying mildly to moderately depressed primary care patients, not an inpatient or suicidal sample. What is the hurry to treat them? Current guidelines in much of the world have become conservative about initiating treatment too quickly. In both the United Kingdom and the Netherlands, there is a recommendation for first trying watchful waiting, simple behavioral activation homework, or Internet-based therapy before starting something more intensive like antidepressant therapy or psychotherapy. Certainly if a patient has multiple recurrent episodes or a history of sudden suicidality, a different strategy would be recommended.

And in what clinical situations does Dr. Redei imagine having to initiate treatment when a patient is not able or willing to communicate with the doctor? Would treatment be ethical under those circumstances and how would it receive the necessary monitoring?

From: First ‘Blood Test for Depression’ Holds Promise of Objective Diagnosis

“Currently we know drug therapy is effective but not for everybody and psychotherapy is effective but not for everybody,” Mohr said. “We know combined therapies are more effective than either alone but maybe by combining therapies we are using a scattershot approach. Having a blood test would allow us to better target treatment to individuals.”

Again, Dr. Mohr, this is a hope widely shared by some, but your current study in no way advances us toward achieving this hope of a clinical tool.

In all of the many media stories available about the study, there was little dissent or skepticism. One important exception was

Newsweek’s First ‘Blood Test for Depression’ Holds Promise of Objective Diagnosis

Outside experts caution, however, that the results are preliminary, and not close to ready for use in the doctor’s office. Meanwhile, diagnosing depression the “old-fashioned way” through an interview works quite well, and should only take 10 to 15 minutes, says Todd Essig, a clinical psychologist in New York. But many doctors are increasingly overburdened and often not reimbursed for taking the time to talk to their patients, he says.

Essig says it’s “a nice little study” but has no clinical usefulness at this point. That’s because it involved such a small sample of people and because the researchers excluded many patients that real clinicians would see on a daily basis, he says.

“It’s moving basic knowledge incrementally forward, but it’s way too soon to say it’s a ‘blood test for depression,’” Essig says.

“Depression is not hard to diagnose, even in a primary care setting,” he adds. “If physicians were allowed by health-care delivery systems to spend more time talking with their patients there would be less need for such a blood test.”

Amen, Dr. Essig, well put.

Northwestern Researchers Develop RT-qPCR Assay for Depression Biomarkers, Seek Industry Partners

One has to ask why these mental health professionals would disseminate such misleading, premature, and potentially harmful claims. In part, because it is fashionable and newsworthy to claim progress toward an objective blood test for depression. Indeed, Thomas Insel, the director of NIMH, is now insisting that even grant applications for psychotherapy research include examining potential biomarkers. Even in the absence of much in the way of promising, clinically useful biomarker candidates, there are points to be scored in grant applications that cite pilot work moving in that direction, regardless of how unjustified the claims are. As John Ioannidis has pointed out, fashionable areas of research are often characterized by more hype and false discoveries than actual progress.

However, comments in one article clearly show that these authors are interested in the commercial potential of their wild claims.

Now, the group is looking to develop this test into a commercial product, and seeking investment and partners, Redei said.

“The goal is to partner to move this as far as possible into the clinic,” Redei said. “There are [other assays] coming behind it, so I would like to focus on [those] … but then this one can move on. For that, I absolutely need partners [and] money, that’s the bottom line,” she said.

Redei envisions developing this assay into a US Food and Drug Administration-approved diagnostic, rather than a laboratory-developed test. “If it’s FDA approved, then any laboratory can do it,” she said.

“I hope it is going to result in licensing, investing, or any other way that moves it forward,” she said. “If it only exists as a paper in my drawer, what good does it do?”


Most positive findings in psychology false or exaggerated? An activist’s perspective

Abstract of a talk to be given at the Australian National University, room G08, Building 39, 3pm September 11, 2014.

In 2005, John Ioannidis made the controversial assertion in a now famous PLoS Medicine paper that “Most Published Research Findings are False”. The paper demonstrated that many positive findings in biomedicine subsequently proved to be false, and that most discoveries are either not replicated or can be shown to be exaggerated. The relevance of these demonstrations to psychology was not appreciated until later.

Recent documented examples of outright fraud in the psychological literature have spurred skepticism. However, while outright fraud may be rare, confirmatory bias and flexible rules of design and analysis are rampant and even implicitly encouraged by journals seeking newsworthy articles. Efforts at reform have met with considerable resistance, as seen in the blowback against the replicability movement.

This talk will describe the work of one loosely affiliated group to advance reform by focusing attention not only on the quality of the existing literature, but on the social and political processes at the level of editing and reviewing. It will give specific examples of recent and ongoing efforts to dilute the absolute authority of editors and prepublication reviewers, and instead enforce transparency and greater reliance on post-publication peer review of claims and data.

Optional suggested readings (I suggest only one or two as background)

Is evidence-based medicine as bad as bad Pharma?

I am holding my revised manuscript hostage until the editor forwards my complaint to a rogue reviewer.

Reanalysis: No health benefits found for pursuing meaning in life versus pleasure.

A formal request for retraction of a cancer article

I reply to John Grohol’s “PLOS blogger calls out PLOS One – Huh?”

Apparently John Grohol was taken aback by my criticism of neurononsense in a PLOS One article. I am pleased at gaining recognition at his highly accessed blog, but I think he was at least a bit confused about what was going on. The following comment has been left at his blog post and is awaiting approval.

John, thank you for encouraging people to read my blog post. If I ever get around to monetizing my activity by collecting my blog posts in a book, I will remind you that you said I know bad research when I see it, and even that I am brilliant. I will ask for a dust jacket endorsement or maybe a favorable review at Amazon.

Like you, I find it amazing that I was allowed free rein as a blogger to mount such an attack on an article that PLOS had published. I had filed a complaint with the journal over the undisclosed conflict of interest of one of the authors. I informed the managing editors that I would be merciless in my blog post in exposing what could be found with meticulous reads and rereads of an article, once I was alerted to a conflict of interest. The journal is processing my formal complaint. Management indicated no interest in previewing my blog post, but only asked that I indicate it was my personal opinion, not that of the journal. I am confident that if this process were unfolding at a for-profit journal, or worse, one associated with a professional organization such as the American Psychological Association or the Association for Psychological Science, there would have been an effort to muzzle me, but little prospect of a full formal review of my complaint that respected the authors’ rights as well as my own.

PLOS One requests disclosure of potential conflicts of interest in exquisite detail and accompanies every article with a declaration. It has an explicit procedure for reviewing complaints of breaches of its policies. I expressed my opinion, both in my blog post and in my complaint, that the authors violated the trust of the journal and the readers by failing to disclose extensive financial benefits associated with an uncritical acceptance of the claims made in the article. But PLOS One does not go on the opinion of one person. Even when the complainant is one of its 4000 Academic Editors, it reviews the evidence and solicits additional information from authors. I am confident in the fairness of the outcome of that review. If it does not favor my assessment, I will apologize to the authors, but still assert that I had a strong basis for my complaints.

Like many people associated with PLOS, I have a great skepticism about the validity of prepublication review in certifying the reliability of what is said in published papers. My blog posts are an incessant effort to cultivate skepticism in others and provide them with the tools to decide for themselves whether to accept what they read. PLOS is different from most journals in providing readers with immediate opportunities to comment on articles in a way that will be available to anyone accessing them. I encourage readers to make more use of those opportunities, as well as of PubMed Commons for post-publication peer review.

I am pleased and flattered that you think I laid the problems in the article so bare that they are now obvious and should have precluded publication. But it took a lot of work, many rereads, and expertise that I alone do not possess. I got extensive feedback from a number of persons, including neuroscientists, and I am particularly indebted to Neurocritic. I highly recommend his earlier blog post about this article. He had proceeded to the point of sniffing out that something was amiss. But when he called out Magneto, the BS detector, to investigate, he was thwarted by the lack of some technical details, as well as by his inability to get into the down and dirty of the claims being made about clinical psychology science. As we went back and forth building upon each other’s revelations, we were both shocked and treated to aha experiences – why didn’t I notice that?

Initially, we trusted the authors’ citation of a previous paper in Psychological Science for the validity of their methods and their choice of Regions of Interest (ROIs) of the brain for study. It took a number of reads of that paper to discover that they were not accurately reporting what was in it, or its lack of correspondence to what was done in the PLOS paper. I consider the Psychological Science paper just as flawed and at risk for nonsense interpretations, but I have no confidence in APS or in that journal’s tolerance for being called out on shortcomings in its peer review. I speak from experience.

Taken together, I consider my blog post and the flaws in the PLOS article that I targeted as indications of the need for readers to be skeptical about depending on prepublication peer review to evaluate the reliability of articles. And let us see from the outcome of the formal review whether there is the self-correction, if it is necessary, that I think we can depend on PLOS to provide.

Finally, let’s all insist on disclosure of conflicts of interest in every paper, not just those coming from the pharmaceutical industry. My documentation of the problems with promoters of Triple P Parenting have led to bar fights with some journals and even formal complaints to the Committee on Publication Ethics (COPE). Keep watching my blog posts to see the outcome of those fights. Disclosures of conflicts of interest depend on the candor of authors. I would have been a hypocrite if I did not call out the authors of a PLOS One article in the same way that I call out authors of publications in other journals.

PS. For the record, I quit blogging at Psychology Today because management changed one of my titles so as not to offend pharmaceutical companies.


Deconstructing misleading media coverage of neuroscience of couples therapy

Do we owe psychotherapists something more than noble lies and fairy tales in our translations of fMRI results?

The press release below was placed on the web by the University of Ottawa and refers to an article published in PLOS One. As I have noted in blog posts here and here, the PLOS One article is quite bad. But the press release is worse, and introduces a whole new level of distortion.

By comparing the article to the press release or to my blog posts, you can get a sense of the nonsense in the press release. I will summarize the contradictions between these sources in comments interspersed with excerpts from the press release.

True love creates resilience, turning off fear and pain in the brain

OTTAWA, May 1, 2014— New research led by Dr. Sue Johnson of the University of Ottawa’s School of Psychology confirms that those with a truly felt loving connection to their partner seem to be calmer, stronger and more resilient to stress and threat.

In the first part of the study, which was recently published in PLOS ONE, couples learned how to reach for their lover and ask for what they need in a “Hold Me Tight” conversation. They learned the secrets of emotional responsiveness and connection.

If you go to the PLOS One article, you will see no mention of any “Hold Me Tight” conversation, only that couples who were selected for mild to moderate marital dissatisfaction received couples therapy. The therapy was of longer duration than has typically been provided in previous studies of Emotionally Focused Therapy (EFT). At completion, the average couple was still experiencing mild to moderate marital dissatisfaction and would still have qualified for entry into the study.

So, for a start, these were not couples feeling “loving connections to each other,” at least if we take the authors’ quantitative measures as valid.

The second part of the study, summarized here, focused on how this also changed their brain. It compared the activation of the female partner’s brain when a signal was given that an electric shock was pending before and after the “Hold Me Tight” conversation.

The phrase “changed the brain” is vague and potentially misleading. It gives the false impression that there is some evidence that differences in fMRI results represent enduring, structural change, rather than transient, ambiguous changes in activity. Changes in brain activity do not equal changes in the structure of the brain. It seems analogous to suggesting that the air-conditioning coming on rearranged a room, beyond cooling it down temporarily. Or that viewing a TV soap opera changes the brain because it is detectable with fMRI.

Before the “Hold Me Tight” conversation, even when the female partner was holding her mate’s hand, her brain became very activated by the threat of the shock — especially in areas such as the inferior frontal gyrus, anterior insula, frontal operculum and orbitofrontal cortex, where fear is controlled. These are all areas that process alarm responses. Subjects also rated the shock as painful under all conditions.

Let us ignore that there is no indication in the PLOS One paper of a “Hold Me Tight” conversation. It is a gross exaggeration to say that the brain “became very activated.” We have to ask “Compared to what?” Activation of the brain is relative, and as the neuroscientist Neurocritic pointed out, there is no relevant comparison condition beyond partner versus stranger versus being alone. There is nothing else to compare them to, and such fMRI data have nothing equivalent to the standardization of an oven thermometer or a marital adjustment measure. And the results are different from what the press release would suggest. Namely,

In the vmPFC, left NAcc, left pallidum, right insula, right pallidum, and right planum polare, main effects of EFT revealed general decreases from pre- to post-therapy in threat activation, regardless of whose hand was held, all Fs (1, 41.1 to 58.6) > 3.9, all ps < .05. In the left caudate, left IFG, and vACC, interactions between EFT and DAS revealed that participants with the lowest pre-therapy DAS scores realized the greatest decreases from pre- to post-therapy in threat-related activity, all Fs (1, 55.1 to 66.7) ≥ 6.2, all ps < .02. In the right dlPFC and left supplementary motor cortex, interactions between handholding and EFT suggest that from pre- to post-therapy, threat-related activity decreased during partner but increased during stranger handholding, Fs (1, 44.6 to 48.9) = 5.0, ps = .03 (see Figure 5). [Emphasis added]

Keep in mind that these results are derived from well over 100 statistical tests performed on data from 23 women, and so they are likely due to chance. It is difficult to make sense of the contradictions in the results. By some measures, activation decreased while holding both strangers’ and husbands’ hands. Other differences were limited to the women with lower initial marital satisfaction.
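A quick sanity check on what “well over 100 tests” buys you, generously assuming independent tests at α = .05 (fMRI tests across regions are correlated, so this is only an order-of-magnitude guide):

```python
# Expected haul of false positives from ~100 tests when nothing is real.
alpha, n_tests = 0.05, 100
print("expected false positives:", alpha * n_tests)              # 5.0
print("P(at least one):", round(1 - (1 - alpha) ** n_tests, 3))  # 0.994
```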

It is also not clear what decreased activation means. It could mean that fewer thought processes are occurring, or that thought processes take less effort. fMRI is that ambiguous.

It is important to note what we are not told in the article. We are led by the authors to expect an overall (omnibus) test of whether changes in brain activity from before to after therapy occur when husbands’ hands are held, but not when a stranger’s hand is held or no hand at all. It is curious that this specific statistic is not reported where it should have been. It is likely that no overall simple difference was found, but the authors went fishing for whatever they could find anyway.

There is no mention in the paper of ratings of the shock as to degree of painfulness. There are ratings of discomfort.

We need to keep in mind that this experiment had to be approved by a committee for the protection of human subjects. If in fact the women were being subjected to painful shock, the committee would not have granted approval.

The actual shock was 4 mA. I put a request out on Facebook for information as to how painful such a shock would be. A lab in Australia reported that, in response, graduate assistants had been busy shocking themselves: with that amperage they could not produce a shock they would consider painful.

However, after the partners were guided through intense bonding conversations (a structured therapy titled Emotionally Focused Couple Therapy or EFT), the brain activation and reported level of pain changed —under one condition. While the shock was again described as painful in the alone and in the stranger hand holding conditions (albeit with some small change compared to before), the shock was described as merely uncomfortable when the husband offered his hand. Even more interesting, in the husband hand-holding condition, the subject’s brain remained calm with minimal activation in the face of threat.

Again, there are no ratings of painfulness described in the report of the experiment. The changes occurred in both husband and stranger handholding conditions.

The experiment explored three different conditions. In the first, the subject lay alone in a scanner knowing that when she saw a red X on a screen in front of her face there was a 20% chance she would receive a shock to her ankles. In the second, a male stranger held her hand throughout the same procedure. In the third, her partner held her hand. Subjects also pressed a screen after each shock to rate how painful they perceived it to be.

Here we are given a relevant detail. The women believed that they had a 20% chance of receiving a shock to their ankles. It is likely that the anticipation was uncomfortable, not the actual shock. The second condition is described as having their hand held by a male stranger. Depending on the circumstances, that could be either creepy or benign. Presumably, the “male stranger” was a laboratory assistant. That might explain why the actual results of the experiments suggest that after therapy, handholding by this stranger no longer activated areas of the brain that it had activated earlier.

But the press release provides a distorted presentation of the actual results of the study. It seems to indicate that the EFT that had occurred between the first and second fMRIs produced an effect only for the condition in which the woman’s hand was held by her partner, not a stranger.

The actual results were weak and contradictory. There do not seem to be overall effects for pre- versus post-therapy fMRI. Rather, effects were limited to a subgroup of women who had started therapy with exceptionally low marital satisfaction, and these effects persisted after they had therapy. The changes in brain activation associated with having their hand held by a partner were no different from the changes for having their hand held by a stranger.

These results support the effectiveness of EFT and its ability to shape secure bonding. The physiological effects are exactly what one would expect from more secure bonding. This study also adds to the evidence that attachment bonds and their soothing impact are a key part of adult romantic love.

How could this be accurate? The women did not have a secure bond with the stranger, but their brain activation nonetheless changed. And apparently this did not happen for all women, mostly only those with lower marital satisfaction at the beginning of therapy.

From the press release, I cannot reconstruct what was done and what was found in the study reported in PLOS One. A lot of wow, a lot of shock and awe, but little truth.

Surely, you jest, Dr. Johnson.

Tools for Debunking Neuro-Nonsense About Psychotherapy

The second in my two-part blog post at PLOS Mind the Brain involves assisting readers to do some debunking of bad neuroscience for themselves. The particular specimen is neurononsense intended to promote emotionally focused psychotherapy (EFT) to the unwary. A promotional video and press releases drawing upon a PLOS One article were aimed to wow therapists seeking further training and CE credits. The dilemma is that most folks are not equipped with the basic neuroscience to detect neurobollocks. Near the end of the blog post, I provide some basic principles for cultivating skepticism about bad neuroscience. Namely,

Beware of

  • Multiple statistical tests performed with large numbers of fMRI data points from small numbers of subjects. Results capitalize on chance and probably will not generalize.
  • The new phrenology, claims that complex mental functions are localized in single regions of the brain so that a difference for that mental function can be inferred from a specific finding for that region.
  • Glib interpretations of what it means if a particular region of the brain is activated. Activation may simply mean that certain mental processes are occurring; among other things, it could simply mean that these processes are now taking more effort.
  • Claims that changes in activation observed in fMRI data represent changes in the structure of the brain or mental processes. Function does not equal structure.

But mainly, I guided readers through the article calling attention to anomalies and just plain weirdness at the level of basic numbers and descriptions of procedures. Some of my points were rather straightforward, but some may need further explanation or documentation. That is why I have provided this auxiliary blog.

The numbers below correspond to footnotes embedded in the text of the Mind the Brain blog post.

1. Including or excluding one or two participants can change results.

Many of the analyses depended on correlation coefficients. For a sample of 23, a correlation of .41 is required for significance at .05. To get a sense of how adding or leaving out a few subjects can affect results, look at the scatterplots below.

The first has 28 data points and a correlation of -.272. The second plot adds three data points, none of them particular outliers, and the correlation jumps to -.454.

[Scatterplots: (1) n = 28, r = -.272; (2) the same data with three added points, r = -.454.]
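The .41 threshold quoted above is easy to verify; here is a small helper (the standard t-to-r conversion, nothing specific to this study):

```python
# Critical |r| for two-tailed p < .05 at sample size n.
from scipy import stats

def critical_r(n, alpha=0.05):
    t = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return t / (n - 2 + t ** 2) ** 0.5

for n in (23, 28, 31):
    print(n, round(critical_r(n), 3))  # 23 -> 0.413, 28 -> 0.374, 31 -> 0.355
```

So the -.454 in the second plot clears the significance bar that the -.272 in the first one misses, on the strength of three added points.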


2. There is some evidence this could have occurred after initial results were known.

The article notes:

 A total of 35 couples completed the 1.5 hour pre-therapy fMRI scan. Over the course of therapy, 5 couples either became pregnant, started taking medication, or revealed a history of trauma which made them no longer eligible for the study. Four couples dropped out of therapy and therefore did not complete the post EFT scan, two couples were dropped for missing data, and one other was dropped whose overall threat-related brain activation in a variety of regions was an extreme a statistical outlier (e.g., greater than three standard deviations below the average of the rest of the sample).

I am particularly interested in the women who revealed a history of trauma after the initial fMRI. When did they reveal it? Did disclosure occur in the course of therapy?

If the experiment had the rigor of a clinical trial as the authors claim, results for all couples would be retained, analogous to what is termed an “intention-to-treat analysis.”

There are clinical trials that started with more patients per cell in which dropping or retaining just a few patients affected the overall significance of the results. Notable examples are Fawzy et al., who turned a null trial into a positive one by dropping three patients, and Classen et al., in which the results of a trial with 353 participants are significant or not depending on whether one patient is excluded.

3. Any positive significant findings are likely to be false, and, of necessity, significant findings will be large in magnitude even when they are false positives.

A good discussion of the likelihood that significant findings from underpowered trials are false can be found here. Significant findings from small numbers of participants are larger in magnitude, because larger effect sizes are required to reach significance.
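Here is a minimal simulation of that inflation, sometimes called the winner’s curse. All parameters are assumed for illustration: a true effect of d = 0.3 and 12 participants per arm.

```python
# Significant results from underpowered trials overestimate the effect:
# only the lucky overestimates cross the significance threshold.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_d, n, sims = 0.3, 12, 20_000
significant_d = []
for _ in range(sims):
    a = rng.normal(true_d, 1, n)            # treatment arm
    b = rng.normal(0.0, 1, n)               # control arm
    res = stats.ttest_ind(a, b)
    if res.pvalue < 0.05 and res.statistic > 0:
        pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
        significant_d.append((a.mean() - b.mean()) / pooled_sd)

print(f"power ~ {len(significant_d) / sims:.2f}")            # ~0.1
print(f"mean significant d ~ {np.mean(significant_d):.2f}")  # ~1.0 vs true 0.3
```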

4. They stated that they recruited couples with the criterion that their marital dissatisfaction initially be between 80 and 96 on the DAS. They then report that the initial mean DAS score was 81.2 (SD = 14.0). Impossible.

Couples with mild to moderate marital distress are quite common in the general population to which the advertisements were directed. It is statistically improbable that the researchers recruited from such a pool and obtained a mean score as low as 81.2. Furthermore, with a lower bound of 80 and an upper bound of 96, it is impossible for a mean score of 81.2 to come with a standard deviation of 14. This is overall a very weird distribution if we accept what they say.
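The impossibility can be made precise. For any variable confined to an interval [m, M], the variance cannot exceed (M − mean) × (mean − m); this is the Bhatia–Davis inequality. Applying it to the reported figures:

```python
# Maximum possible variance for scores bounded in [80, 96] with mean 81.2
# (Bhatia-Davis inequality), versus the variance the authors report.
m, M, mean, sd = 80.0, 96.0, 81.2, 14.0
max_var = (M - mean) * (mean - m)   # 14.8 * 1.2 = 17.76, i.e., max SD ~ 4.2
print(max_var, sd ** 2)             # 17.76 vs 196.0
```

If the inclusion range really was 80-96, the largest standard deviation a mean of 81.2 could support is about 4.2, not 14.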

5. The amount of therapy that these wives received (M = 22.9, range = 13-35) was substantially more than what was provided in past EFT outcome studies. Whatever therapeutic gains were observed in the sample could not be expected to generalize to past studies.

Past outcome studies of EFT have provided 8 to 12 sessions of EFT with one small dissertation study providing 15 sessions.

6. The average couple finishing the study still qualified for entering it.

Mean DAS scores after EFT was declared completed were 96.0 (SD = 17.2). In order to enroll in the study, couples had to have DAS scores of 97 or less.

7. No theoretical or clinical rationale is given for not studying husbands or presenting their data as well.

Jim Coan’s video presentation suggests that he was inspired to do this line of research by observing how a man in individual psychotherapy for PTSD was soothed by his wife in the therapy sessions after the man requested that she be present. There is nothing in the promotional materials associated with either the original Coan study or the present one to indicate that fMRI would be limited to wives.

Again, if the studies really had the rigor of a clinical trial, as the authors claim, the exclusive focus on wives’ rather than husbands’ fMRI would have been pre-specified in the registration of the study. But there is no registration for either study.

8. The size of many differences between results characterized as significant versus nonsignificant is not itself statistically significant.

With a sample size of 23, take a correlation coefficient of .40, which just misses statistical significance. A correlation of .80 (p < .001) is required before it is statistically significantly larger than that .40 (p > .05). So, many “statistically significant findings” are not significantly larger than correlations that were ignored as nonsignificant. This highlights the absurdity of simply tallying up differences that reach the threshold of significance, particularly when no confidence intervals are provided.
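The standard test for whether two correlations differ uses Fisher’s z transformation. A sketch (treating the two correlations as coming from independent samples, which is a simplification when they come from the same 23 women):

```python
# Fisher z test for the difference between two correlations.
import numpy as np
from scipy import stats

def correlations_differ_p(r1, r2, n1, n2):
    z1, z2 = np.arctanh(r1), np.arctanh(r2)
    se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return 2 * stats.norm.sf(abs(z2 - z1) / se)  # two-tailed p

print(round(correlations_differ_p(0.40, 0.80, 23, 23), 3))  # ~0.033
print(round(correlations_differ_p(0.40, 0.75, 23, 23), 3))  # ~0.082, n.s.
```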

9. The graphic representations in Figures 2 and 4 were produced by throwing away two thirds of the available data.

As seen in any standard normal curve, 68.2% of a sample (or ~16 of the 23 women) falls between -1.0 and +1.0 SD, which is the excluded region.


Throwing away the data for 16 women leaves 7. These were distributed across the four lines in Figures 2 and 4, one or two to a line. Funky? Yup.
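The arithmetic, as a two-line check:

```python
# Fraction of a normal sample lying outside +/-1 SD, applied to n = 23.
from scipy import stats
outside = 2 * stats.norm.sf(1.0)    # ~0.317 of the distribution
print(round(23 * outside, 1))       # ~7.3 of the 23 women survive the cut
```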

[Figure 2 from the PLOS One article (pone.0079314.g002).]

What We Need to Do to Redeem Psychotherapy Research

[Update] This blog post has now been expanded into an article in the Journal of Evidence-Based Psychotherapies that can be downloaded at ResearchGate. Commentaries on this article by E. David Klonsky and Bruce Thyer are available at the journal site.

This post serves as a supplement to one in PLOS Mind the Brain, Salvaging Psychotherapy Research: a Manifesto. The Mind the Brain post declares

We need to shift the culture of doing and reporting psychotherapy research. We need to shift from praising exaggerated claims about treatment and faux evidence generated to promote opportunities for therapists and their professional organizations. Instead, it is much more praiseworthy to provide robust, sustainable, even if more modest claims, and to call out hype and hokum in ways that preserve the credibility of psychotherapy.

The current post provides documentation in the form of citations and further links for the points made there concerning the need to reform the psychotherapy research literature.

Many studies considered positive, including those that become highly cited, are basically null trials.

Two examples of null trials became highly cited because of spin.

Bach, P., & Hayes, S. C. (2002). The use of acceptance and commitment therapy to prevent the rehospitalization of psychotic patients: A randomized controlled trial. Journal of Consulting and Clinical Psychology, 70(5), 1129.

Discussed in these blog posts:

More on the Acceptance and Commitment Therapy Intervention That Failed to Reduce Re-Hospitalization.

Study Did Not Show That Brief Therapy Kept Psychotic Patients Out of Hospital

Here is another trial spun and dressed up:

Dimidjian, S., Hollon, S. D., Dobson, K. S., Schmaling, K. B., Kohlenberg, R. J., Addis, M. E., … & Jacobson, N. S. (2006). Randomized trial of behavioral activation, cognitive therapy, and antidepressant medication in the acute treatment of adults with major depression. Journal of Consulting and Clinical Psychology, 74(4), 658.

The Dimidjian et al. trial launched interest in behavioral activation as a Third Wave psychotherapy and has been widely cited, almost always uncritically. What could possibly be wrong with the study? I will have to blog about that sometime, but check it out using the link to the PDF that I provided. Hint: whatever happened to the obviously missing presentation of the main time x treatment interactions for the primary outcome? What was emphasized instead, and why?

Spin starts in abstracts.

Discussed in a pair of blog posts:

Investigating the Accuracy of Abstracts: An Introduction

Dissecting a Misleading Abstract

When controls are introduced for risk of bias or investigator allegiance, effects greatly diminish or even disappear.

An example is

Jauhar, S., McKenna, P. J., Radua, J., Fung, E., Salvador, R., & Laws, K. R. (2014). Cognitive-behavioural therapy for the symptoms of schizophrenia: systematic review and meta-analysis with examination of potential bias. The British Journal of Psychiatry, 204(1), 20-29.

For an interesting discussion of how much meta-analyses of the same literature can vary in their conclusions depending on whether risk of bias and investigator allegiance are taken into account, see

Meta-Matic: Meta-Analyses of CBT for Psychosis

Conflicts of interest associated with authors having substantial financial benefits at stake are rarely disclosed in the studies that are reviewed or in the meta-analyses themselves.

I recently blogged about these two articles

Sanders, M. R., Kirby, J. N., Tellegen, C. L., & Day, J. J. (2014). The Triple P-Positive Parenting Program: A systematic review and meta-analysis of a multi-level system of parenting support. Clinical Psychology Review, 34(4), 337-357.

and

Sanders, M. R., & Kirby, J. N. (2014). Surviving or Thriving: Quality Assurance Mechanisms to Promote Innovation in the Development of Evidence-Based Parenting Interventions. Prevention Science, 1-11.

Here

Critical analysis of a meta-analysis of a treatment by authors with financial interests at stake

Here

Are meta-analyses done by promoters of psychological treatments as tainted as those done by Pharma?

And here.

Sweetheart relationship between Triple P Parenting and the journal Prevention Science?

Professional groups such as American Psychological Association Division 12 and governmental organizations such as the US Substance Abuse and Mental Health Services Administration (SAMHSA) apply low thresholds in declaring treatments to be “evidence-supported.”

I blogged about this.

Troubles in the Branding of Psychotherapies as “Evidence Supported”

Professional groups have conflicts of interest in wanting their members to be able to claim the treatments they practice are evidence-supported.

I have blogged about the Society for Behavioral Medicine a number of times

Faux Evidence-Based Behavioral Medicine at Its Worst (Part I)

Faux Evidence-Based Behavioral Medicine Part 2

Does psychotherapy work for depressive symptoms in cancer patients?

Some studies find differences between two active, credible treatments.

This does not happen very often, but here is such a study:

Poulsen, S., Lunn, S., Daniel, S. I., Folke, S., Mathiesen, B. B., Katznelson, H., & Fairburn, C. G. (2014). A randomized controlled trial of psychoanalytic psychotherapy or cognitive-behavioral therapy for bulimia nervosa. American Journal of Psychiatry, 171(1), 109-116.

that I blogged about

When Less is More: Cognitive Behavior Therapy vs Psychoanalysis for Bulimia

Bogus and unproven treatments are promoted with pseudoscientific claims.

Here is a website offering APA-approved continuing education credit for Somatic Experiencing:

Somatic Experiencing® is a short-term naturalistic approach to the resolution and healing of trauma developed by Dr. Peter Levine. It is based upon the observation that wild prey animals, though threatened routinely, are rarely traumatized. Animals in the wild utilize innate mechanisms to regulate and discharge the high levels of energy arousal associated with defensive survival behaviors. These mechanisms provide animals with a built-in “immunity” to trauma that enables them to return to normal in the aftermath of highly “charged” life-threatening experiences.

Declarations of conflicts of interest are rare, and exposure of authors who routinely fail to disclose conflicts of interest is even rarer.

I blogged about this:

Sweetheart relationship between Triple P Parenting and the journal Prevention Science?

Departures from preregistered protocols are common in published reports of RCTs, and there is little checking of abstracts against the results actually obtained or against what was promised in preregistration.

Here is a notable recent example about which I blogged here, here and here.

Morrison, A. P., Turkington, D., Pyle, M., Spencer, H., Brabban, A., Dunn, G., … & Hutton, P. (2014). Cognitive therapy for people with schizophrenia spectrum disorders not taking antipsychotic drugs: a single-blind randomised controlled trial. The Lancet, 383(9926), 1395-1403.

See also

Milette, K., Roseman, M., & Thombs, B. D. (2011). Transparency of outcome reporting and trial registration of randomized controlled trials in top psychosomatic and behavioral health journals: A systematic review. Journal of Psychosomatic Research, 70(3), 205-217.

Specific journals are reluctant to publish criticism of their publishing practices.

Cook, J. M., Palmer, S., Hoffman, K., & Coyne, J. C. (2007). Evaluation of clinical trials appearing in Journal of Consulting and Clinical Psychology: CONSORT and beyond. The Scientific Review of Mental Health Practice, 5, 69-80.

This article attempted to point out shortcomings in the reporting of clinical trials in Journal of Consulting and Clinical Psychology and was first submitted there and rejected. As you can see, we in no way intended to bash the journal, but to highlight the need for adopting and enforcing CONSORT. The article was sent out for review to two former editors, who understandably took issue with its depiction of the quality of the clinical trials they had accepted for publication. Fortunately, we were able to publish the article elsewhere after JCCP rejected it.

Those of us around on the listservs in the early 2000s can recall how aggressively APA resisted adoption of CONSORT. Finally APA relented with

Guidelines seek to prevent bias in reporting of randomized trials

But it contained an escape clause: all authors had to do was fail to declare that their study was an RCT. Yet making that disclosure is itself part of adhering to CONSORT!

Authors of APA journal articles who call a clinical trial a “randomized controlled trial” (RCT) are now required to meet the basic standards and principles outlined in the Consolidated Standards of Reporting Trials (CONSORT) guidelines as part of an effort to improve clarity, accuracy and fairness in the reporting of research methodology.

We complained and the escape clause was eliminated, even if enforcement of CONSORT remained spotty.

Coyne, J. C., Cook, J. M., Palmer, S. C., & Rusiewicz, A. (2004). Clarification of clinical trial standards. Monitor on Psychology: A Publication of the American Psychological Association, 35(11), 4-8.

If a title or abstract of a paper reporting a RCT does not explicitly state “randomized clinical trial,” there is risk it will be lost in any initial search of the literature. We propose that if editors and reviewers recognize that a study reports a randomized clinical trial, they will require that authors label it as such and that they respond to the CONSORT checklist.


No more should underpowered exploratory pilot and feasibility studies be passed off as RCTs when they achieve positive results.

An excellent discussion of this issue can be found in

Kraemer, H. C., Mintz, J., Noda, A., Tinklenberg, J., & Yesavage, J. A. (2006). Caution regarding the use of pilot studies to guide power calculations for study proposals. Archives of General Psychiatry, 63(5), 484-489.

And

Leon, A. C., Davis, L. L., & Kraemer, H. C. (2011). The role and interpretation of pilot studies in clinical research. Journal of Psychiatric Research, 45(5), 626-629.

Evaluations of treatment effects should take into account the prior probabilities suggested by the larger literature comparing two active, credible treatments. The well-studied depression treatment literature suggests some parameters.

Cuijpers, P., & van Straten, A. (2011). New psychotherapies for mood and anxiety disorders: Necessary innovation or waste of resources? Canadian Journal of Psychiatry, 56(4), 251.

On the one hand, there is a clear need for better treatments, as mood and anxiety disorders constitute a considerable burden for patients and society. Further, modelling studies have shown that current treatments can reduce only one-third of the disease burden of depression and less than one-half of anxiety disorders, even in optimal conditions.2

However, there are already dozens of different types of psychotherapy for mood and anxiety disorders, and there is very little evidence that the effects of treatments differ significantly from each other. In depression, we found that interpersonal psychotherapy is somewhat more effective than other therapies,3 but differences were very small (Cohen’s d < 0.21) and the clinical relevance is not clear. In the field of anxiety disorders, there is evidence that relaxation is less effective than cognitive-behavioural therapy, but there is very little evidence for significant differences between other therapies.

We think that new therapies are only needed if the additional effect compared with existing therapies is at least d = 0.20. Larger effect sizes are not reasonable to expect as 0.20 is the largest difference between therapies found until now. Further, this effect needs to be empirically demonstrated in high-quality trials.

However, to show such an effect of 0.20 we would need huge numbers. A simple power calculation shows that this would require a trial of about 1000 participants (Stata [StataCorp, College Station, TX] sampsi command). As a comparison, the large National Institute of Mental Health Treatment of Depression Collaborative Trial examining the effects of treatments of depression included only 250 patients.

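Their arithmetic checks out. Here is a minimal sketch of the same power calculation using statsmodels rather than Stata's sampsi, assuming the conventional two-sided alpha of .05 and 90% power:

```python
# Sample size needed to detect d = 0.20 between two active treatments,
# two-sided alpha = .05, 90% power (assumed settings).
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.20,
                                          alpha=0.05, power=0.90)
print(f"{n_per_group:.0f} per group, {2 * n_per_group:.0f} total")
# Roughly 527 per group, i.e., about 1,050 in total: the "about 1000
# participants" cited above.
```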

Adverse events and harms should routinely be reported.

Vaughan, B., Goldstein, M. H., Alikakos, M., Cohen, L. J., & Serby, M. J. (2014). Frequency of reporting of adverse events in randomized controlled trials of psychotherapy vs. psychopharmacotherapy. Comprehensive Psychiatry.

Meta-analyses of psychotherapy should incorporate techniques for detecting p-hacking.

This is discussed in

Lakens, D., & Evers, E. R. (2014). Sailing from the seas of chaos into the corridor of stability: Practical recommendations to increase the informational value of studies. Perspectives on Psychological Science, 9(3), 278-292.
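One technique from this literature is p-curve analysis: under a true null, p-values that fall below .05 are uniformly distributed between 0 and .05, whereas a genuine effect piles them up near zero. A minimal sketch with purely hypothetical p-values, using the Fisher's-method version of the test:

```python
# Minimal p-curve sketch: test whether significant p-values are
# right-skewed (evidential value) by combining them with Fisher's method.
# The p-values below are hypothetical, for illustration only.
import math
from scipy import stats

significant_ps = [0.003, 0.011, 0.024, 0.041, 0.048]  # hypothetical

# Rescale each p to its position within (0, .05), then combine.
pp_values = [p / 0.05 for p in significant_ps]
chi2 = -2 * sum(math.log(pp) for pp in pp_values)
df = 2 * len(pp_values)
p_right_skew = stats.chi2.sf(chi2, df)
print(f"chi2({df}) = {chi2:.2f}, p = {p_right_skew:.2f}")
# A small p here would indicate right skew, i.e., genuine evidential value.
```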
