Bad stats, non sequitur conclusions in Lancet chronic fatigue syndrome/suicide study

Interpretive jiggery-pokery in The Lancet

A tale of a convenience sample with inconvenient serious limitations.


I would have dismissed this study in a brief screen, except that it appeared in The Lancet.

Roberts E, Wessely S, Chalder T, Chang CK, Hotopf M. Mortality of people with chronic fatigue syndrome: a retrospective cohort study in England and Wales from the South London and Maudsley NHS Foundation Trust Biomedical Research Centre (SLaM BRC) Clinical Record Interactive Search (CRIS) Register. The Lancet. 2016 Feb 10.

The study has too small a sample to warrant submission of a manuscript to a peer-reviewed journal. And the longer you look at it, the more its problems compound.

I recommend downloading this open access article and following along as I explain its flaws.

What, too small a sample? With 2147 patients, the study is probably the largest ever of mortality among patients with chronic fatigue syndrome. But the adequacy of a sample size is determined not by the total number of participants, but by the number experiencing particular events, in this case, the relatively rare events of mortality and suicide. In the seven-year follow-up, there were 17 instances of all-cause mortality, 11 among women and six among men. There were five suicides, three among women and two among men.

In order to assemble a sample of 2147 patients from an existing data set, the authors adopted relaxed diagnostic criteria:

We adopted the most inclusive criteria, and thus included all patients with a clinical diagnosis of chronic fatigue syndrome. A subsample of 755 patients had full diagnostic criteria applied prospectively of which 65% met Oxford criteria, 58% the 1994 case definition criteria, and 88% NICE criteria. All patients in this sample met at least one criterion.

Getting loose about diagnosis allowed the authors to assemble the largest possible sample, but the strategy created a host of other problems. It produced a mixed (heterogeneous) sample of patients: any generalization about the full sample might not apply to patients meeting a particular criterion. Allowing entry into the sample based on multiple criteria risks considerable dissimilarity across the patient sample, and patients might be similar in ways the authors would not have wanted, i.e., they might share confounding variables. Note that two thirds met the Oxford criteria, which are discredited in much of the Western world. These criteria allow inclusion of patients with psychiatric comorbidity who would be excluded by the other criteria for chronic fatigue syndrome used in the study.

When it comes to suicide, psychiatric disorder, including major depression, is a robust predictor and a major confound when examining other predictors, even though most patients with major depression do not die by suicide.

But here’s the rub: some patients with major depression were entered into the study because they met the Oxford criteria, whereas other potential patients were excluded from the study because the criteria applied to them did not allow psychiatric comorbidity. Hmm, we have a developing mess on our hands.

Now let’s consider how the small number of events to be explained, suicides, compounded the problem. Basically, we are sprinkling three women and two men across these different patient groups. Pure chance is at play, but if the authors misapply sophisticated statistics, they will capitalize on this chance.

Descriptive-observational samples like this one pose challenges to epidemiologists in confidently interpreting any associations that are identified. With a larger sample – i.e., larger number of suicides to explain – the authors might have used multivariate analyses with statistical control of possible confounds. For instance, it might be tempting to attempt to control for major depression. But that would involve inferences about what is going on among subgroupings of three women and two men with no possibility of deciding what is due to chance, i.e. spurious.

Ignoring these obvious problems, the authors statistically controlled for age and sex. They didn't have to control for race/ethnicity, because all of the participants who died were white. But I would not make too much of race, given the small number of deaths and the small number of suicides. The authors also tried to break down (stratify) the sample according to whether participants who died had major depression and where they ranked on a measure of social deprivation. But here again, we are getting into lala land.


But the study gets worse with a closer look.

Compared to what? The authors wanted to make statements about relative mortality and suicide among patients with chronic fatigue syndrome. To do that, they did something very expedient: they created a ratio of deaths in this sample to deaths in England and Wales in 2011.

The denominator was the expected number of deaths, estimated by 5-year age bands, and sex-specific mortality rates for the England and Wales population in 2011 multiplied by the weighting of average person-years in the at-risk period experienced by chronic fatigue syndrome patients in each age and sex category.
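In other words, this is classic indirect standardisation: expected deaths are summed over age-and-sex strata as reference-population mortality rates multiplied by the cohort's person-years in each stratum, and the SMR is observed over expected. A minimal sketch with made-up numbers (none of these rates or person-years are from the study):

```python
# Indirect standardisation, sketched with hypothetical data.
# Each stratum: (person-years in the cohort, reference-population death rate).
strata = {
    ("F", "25-29"): (2000.0, 0.0003),
    ("F", "30-34"): (1500.0, 0.0004),
    ("M", "25-29"): (800.0, 0.0005),
    ("M", "30-34"): (700.0, 0.0007),
}

# Expected deaths = sum over strata of (person-years * reference rate)
expected = sum(py * rate for py, rate in strata.values())

observed = 3  # hypothetical count of deaths actually seen in the cohort

smr = observed / expected  # standardised mortality ratio
print(f"expected = {expected:.2f}, SMR = {smr:.2f}")
```

An SMR above 1 means more deaths than the reference population would predict for a cohort of this age and sex composition; with so few observed events, though, the ratio is extremely noisy.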

The authors want to examine whether there is an excess of deaths and suicides among patients with chronic fatigue syndrome versus the general population. The obvious problem is that these patients may differ in other ways from the general population besides in chronic fatigue syndrome.

Let’s look at where these patients were recruited.

We investigated a retrospective cohort consisting of people diagnosed with chronic fatigue syndrome, using data from the national research and treatment service for chronic fatigue at the South London and Maudsley NHS Foundation Trust (SLaM) and King’s College London Hospital (KCH).

These are specialty settings and patients with chronic fatigue syndrome recruited from them may not be representative of the larger population of patients in the UK. In the discussion section, the authors concede this serious problem:

Because the referral pathway for this centre includes a full assessment including a psychiatric evaluation, an argument could be made that cases referred to the joint SLaM and KCH service may not be representative of chronic fatigue syndrome cases seen in secondary and tertiary care, and may include a referral bias, favouring patients with more severe chronic fatigue syndrome, psychiatric comorbidity, and higher socioeconomic status.

Actually, authors, I don't see how anyone could argue against there being a strong referral bias in the sample, such that it is unrepresentative of patients being seen in other settings.

Not surprisingly, with so few events to explain, the authors found no differences in all-cause or cancer-specific deaths among patients with chronic fatigue syndrome. But they claimed to have found:

There was a significant increase in suicide-specific mortality (SMR 6·85, 95% CI 2·22–15·98; p=0·002).

Bingo! The article is saved from a predictable set of all-null findings by misapplied statistics. If you believe the authors, patients with chronic fatigue syndrome are over six times more likely to die by suicide, although the confidence interval stretches from 2.2 times to 16 times.
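That yawning interval is exactly what five events buys you. As a sanity check, the observed count (5) and the reported SMR together imply roughly 0.73 expected suicides, and an exact (chi-square-based) Poisson interval on five events reproduces the published bounds almost exactly. A sketch using scipy; only the observed count and SMR come from the paper, the rest is arithmetic:

```python
# Reconstruct the 95% CI for the suicide SMR from the two published numbers.
from scipy.stats import chi2

observed = 5      # suicides reported in the cohort
smr = 6.85        # reported standardised mortality ratio
expected = observed / smr  # implied expected count, ~0.73

# Exact (Garwood) 95% limits for a Poisson count, rescaled to the SMR
lower = chi2.ppf(0.025, 2 * observed) / 2 / expected
upper = chi2.ppf(0.975, 2 * (observed + 1)) / 2 / expected
print(f"SMR 95% CI: {lower:.2f} to {upper:.2f}")  # close to the reported 2.22-15.98
```

The interval spans nearly an order of magnitude, which is the statistical fingerprint of a conclusion resting on a handful of events.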

Then the qualification:

Although the suicide-specific SMR is raised compared with the general population, it is lower than for psychiatric disorders including affective disorders, personality disorders, and alcohol dependence reported in other population-based studies.

But to what population can these findings be generalized? Certainly not to patients with chronic fatigue syndrome drawn from less specialized settings in the UK. Not to the United States, where the Oxford criteria are considered the least valid, in part because of the confounding with psychiatric disorder. The Oxford criteria even allow psychiatric disorder to be the primary diagnosis, with chronic fatigue a secondary diagnosis.

The conclusion that the authors draw:

Although completed suicide was a rare event, the findings strengthen the case for robust psychiatric assessment by mental health professionals when managing individuals with chronic fatigue syndrome.

Ah yes, the article provides more evidence that mental health professionals should oversee the management of chronic fatigue syndrome, and it hands them new screening tasks that might prevent the infrequent event of patient suicide. But the base rates found in this study don't warrant formal screening efforts, and I doubt that any evidence could be mustered that suicide would be reduced.
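The base-rate arithmetic makes the point concrete. Even granting a hypothetical screen with good sensitivity and specificity (the 80%/90% figures below are assumptions for illustration, not from any real instrument), a positive result would still almost always be a false alarm at this cohort's event rate:

```python
# Positive predictive value of a hypothetical suicide screen at the
# cohort's observed base rate (5 suicides among 2147 patients over 7 years).
base_rate = 5 / 2147        # ~0.23% over the follow-up period
sensitivity = 0.80          # assumed, for illustration only
specificity = 0.90          # assumed, for illustration only

true_pos = sensitivity * base_rate
false_pos = (1 - specificity) * (1 - base_rate)
ppv = true_pos / (true_pos + false_pos)
print(f"PPV = {ppv:.1%}")  # roughly 2%: ~98% of positives are false alarms
```

This is why rare outcomes defeat formal screening: no plausible test accuracy overcomes a base rate this low.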

How did this article get published?

Certainly the article would not have been published in The Lancet if the authors had not had the gumption to submit it there. But do readers really believe that if someone outside a tight circle of friends and family had submitted such an article to The Lancet, it would have been accepted?

Richard Horton, passing such a manuscript through to publication and attaching a commentary may demonstrate loyalty to your friends, but it does nothing for the journal's reputation, and it does the authors no favor to have such embarrassingly bad science become a matter of public record.

Unfortunately, I couldn’t find any information about the processing of this article from manuscript to final acceptance. Was it fast tracked? Certainly The Lancet found the article worthy of a brief commentary:

The risk of dying is increased in many illnesses, but the mortality associated with chronic fatigue syndrome is relatively unexplored. In The Lancet, Emmert Roberts and colleagues report results from a case register study that linked the clinical details of more than 2000 people with chronic fatigue syndrome presenting to a specialist clinic (in London and the south of England) with mortality outcomes over 7 years. This is the largest study of its type so far, and used a robust case definition.

“Robust case definition”? Give me a break!

I can’t speak for any of the other thousands of Academic Editors at PLOS One. The editors have an expressed commitment to publishing all articles that are not seriously flawed, so that post-publication peer review can establish articles’ importance and contribution to the field. But I would not even have sent this manuscript out for review. Its flaws are too obvious and unfixable, and it would be an unnecessary burden on reviewers to waste their time figuring that out. There are just too many more promising manuscripts and too few reviewers to process them all.


13 thoughts on “Bad stats, non sequitur conclusions in Lancet chronic fatigue syndrome/suicide study”

    • I certainly put more trust in Jason’s work than in Sir Simon’s. I wish we had easy access to the full text of the study you referenced. This would be a good time to compare it to the new study discussed above.

      Like

  1. “The authors wanted to make statements about relative mortality and suicides among patients with chronic fatigue syndrome.”

    Yes, this is exactly the problem. The Wessely School is running a propaganda campaign, not a research program. Their work consistently stinks of writing conclusions first then designing the study to produce the needed evidence. Even non-scientist patients can see the fraud, with a little help from troublemakers like Dr Coyne. (thank you Dr Coyne)

    It’s truly amazing to me how the pal review system and the journals can continue to allow this rubbish to be published, over and over, in spite of the developing scandal around the PACE trial.

    These publishers are so incompetent, they can’t see that publishing rubbish wrecks their reputations and hastens the end of their industry. At this point, they provide no value and are simply parasites sucking blood from researchers. They are running a straight-up protection racket with researchers depending on the “right” journal names in order to get grants.

    As a patient trying to understand my illness, I don’t read journals – I read PubMed. And whether a study has value or not, I’m not going to contact a journal. I’m going to discuss it on forums accessible to anyone, like this one.

    Liked by 3 people

  2. This kind of definitional gerrymandering goes on all the time in this field. It has utterly confounded research into this disease and created widespread misperception on the nature of the disease and how to treat it.

    For instance, in addition to using the absurdly broad Oxford criteria, PACE also said that patients were characterized by the 1994 criteria. While the 2011 paper doesn’t appear to note it, the 2013 recovery paper states that PACE modified the 1994 criteria to require symptoms for only one week instead of the six months required by the definition, and acknowledged that their stated prevalence of patients meeting that criteria “may have been inaccurate.” But that point somehow doesn’t get mentioned in the claims that the findings apply to patients diagnosed by the CDC CFS diagnosis. http://dx.doi.org/10.1017/S0033291713000020

    Another example is seen in Crawley’s recent study on the prevalence of “CFS” in 16 year olds. But as noted by disease experts in the comments on the published paper, what was actually studied and reported on was not chronic fatigue syndrome but chronic fatigue, based on parent reports and without medical evaluation. Unfortunately, Newsweek, BBC and other media missed that point when they published the story and announced “Chronic fatigue syndrome affects 1 in 50 teens,” a prevalence much higher than the most commonly accepted estimates.
    http://pediatrics.aappublications.org/content/early/2016/01/22/peds.2015-3434

    A third example is the conduct of evidence reviews. Take for instance, the recent Cochrane review of GET. The study acknowledged the range of definitions and therefore said that they would include any studies where patients met the following criteria: fatigue is prominent, is unexplained, is significantly disabling or distressing and has lasted for 6 months or more. http://dx.doi.org/10.1002/14651858.CD011040

    The recent evidence review by Health and Human Services in the U.S. took multiple definitions for ME and CFS and lumped them all together in their analysis solely on the basis of medically unexplained fatigue in spite of significant differences in inclusion and exclusion criteria. They noted the problems with Oxford and recommended that it stop being used but recommended treatments like CBT and GET, largely based on Oxford definition studies, including PACE. http://effectivehealthcare.ahrq.gov/ehc/products/586/2004/chronic-fatigue-report-141209.pdf

    Remarkably, one new U.S. clinical guideline has recommended the new U.S. Institute of Medicine (IOM) criteria for diagnosis but recommends PACE style CBT and GET, even though such treatment could cause harm to patients with the kind of systemic intolerance to even trivial activity that the IOM report said was the core feature of the disease.

    Just a few of many examples. It’s remarkable that such definitional sloppiness has been allowed to continue for so many years. It’s remarkable that “evidence-based medicine” is using the results from studies in one population of patients and applying them to patients meeting different criteria.

    Liked by 3 people

  3. I have some doubts that the sample contains patients with more severe chronic fatigue syndrome than the average.

    Patients visiting clinics are typically in the mild to moderate range. In part due to their physical limitations, in part due to them knowing the treatment approach will not be appropriate.

    Other research (by Jason, if I remember correctly) shows that a patient sample selected according to wider criteria has milder symptoms. Using the Oxford and NICE definitions results in a sample with milder symptoms.

    Liked by 1 person

    • Should read: more affected patients don’t visit these clinics, in part due to their physical limitations, in part due to them knowing the treatment approach will not be appropriate.

      Like

  4. Thank you, James, for publishing this. I had already read the article and had wondered what to make of it. It is very useful to have your critique. The more of this nonsense that you can explain, the better. Keep up the good work, please; I find it very valuable.

    Liked by 1 person

  5. Not relevant to the study’s failures already identified but it is notable that the authors write:

    “The diagnosis was ascertained from having received the prespecified clinic code for a chronic fatigue syndrome diagnosis, which was the ICD-10 code for neurasthenia (F48.0)”

    So the treating clinic does not recognise ICD-10 G93.3 (Postviral fatigue syndrome/Benign myalgic encephalomyelitis/Chronic Fatigue Syndrome), which raises the question of what the authors actually consider CDC 1994 (Fukuda) to be, given that the relevant US diagnostic code for the Fukuda criteria is ICD-10-CM R53.82 (previously ICD-9-CM 780.71), which specifically excludes neurasthenia as only a possibly “related” disorder.

    If, in addition to the study’s reliance on the loose Oxford criteria, the clinic (and presumably the authors) is shoehorning both NICE and CDC criteria into a neurasthenia box, what confidence could anyone have that the study cohort represents anything that would be generally understood as a cohort of ME/CFS patients?

    Liked by 1 person

  6. Thanks James for this demolition job. I heartily agree with all your analysis.
    I wonder if it occurred to any of the ‘researchers’ that if the patients concerned were being put through PACE style GET and/or CBT, the treatment itself, the relapses it caused, and the disrespect it implies for the reality of patients’ physical illness may have driven them to suicide….

    Like

  7. Poor sample choice, poor criteria, small numbers of actual points of interest, and over-emphasis of stretched statistical methods. So what’s new? I remember the study back in 2007 by Harvey, Wadsworth, Wessely and Hotopf, looking at 5362 patients from an NHS survey. From that they extracted 34 who “had” CFS/ME. Neither the number in this sample who had had no previous psychological problems, nor the number who had had minor psychological problems previously were unusual, but instead of there being 3 who had major psychiatric problems, there were 6. It was statistically significant. This led to a discussion and to the suggestion that “Those who report a diagnosis of CFS/ME have increased levels of psychiatric disorder prior to the onset of their fatigue symptoms.” They clearly felt that “The strengths of this study include its large size and prospective data collection.” They failed to understand that the size of their study sample was a mere 34 patients. But what is worse, oh so much worse, is that the diagnosis of CFS/ME was simply stated by the patient to be so: no attempt had been made to confirm or deny patients’ claims. So the whole assumption that prior psychiatric disorders played a significant part in the onset of CFS/ME rested on the word of 3 patients who had suffered previously from major psychiatric disorders, and who now reported that they had CFS/ME. The authors even said “Although the use of self-reported CFS/ME is the main limitation of this study, it may also provide some benefits.” and “Clinical experience suggests that it is uncommon for a patient to complain of CFS or ME and to not have sufficiently severe symptoms to warrant the diagnosis.”

    Of course the paper is lengthy, contains a wealth of references, terminology, statistical methods and elaborations, bringing in discussions about stress, and elements of personality. It looks impressive. To anyone lacking confidence in basic statistics, it would sound convincing. But, sadly, it is just typical of so many similar studies on CFS/ME.

    And these are the people who question my ability to analyse the data generated from the PACE study!

    Liked by 3 people

  8. I’m frankly amazed that anyone could write a commentary claiming the researchers used a “robust case definition” when the researchers themselves freely admit to using the loosest possible criteria in order to get the largest possible number of subjects!

    Like
