Checking graphs in articles: Binge drinking in women dramatically increasing, while binge drinking in men decreasing.

From 2002 to 2013, binge drinking decreased by 2% in men while it increased by 13% in women.

Huge differences in alcohol consumption by most people versus the upper 10%.

I wanted to blog about a recent article in The New York Times about binge drinking, but with a different focus than that of the author. Gabrielle Glaser is the author of the recent book Her best-kept Secret: why women drink – how they can regain control. She’s more interested in getting into issues of treatment, which is the focus of the book. I want to focus on a provocative chart in the article that probably underscores some of her points more than she emphasized.

A reminder to check tables and graphs in articles. Don’t just gloss over the valuable information they may display.

But before I could write my blog, I discovered that my Facebook friend Mike Miller was already discussing it on his personal page. So I just borrowed from that with attribution. Then I found that he had previously touched on this topic and so I reproduced it as well. I recommend becoming Mike Miller’s friend on Facebook. I found it worth it to have his posts displayed at the top when I go to my news feed.

The article

The New York Times article is

The graph

[Graph: who is drinking]

What Mike Miller has to say about the graph:

The headline doesn’t say it, but the drawing suggests it: the “people consuming more alcohol” are women. From 2002 to 2013, binge drinking decreased by 2% in men while it increased by 13% in women. I’ve been noticing a few alcohol marketing and advertising schemes targeting women recently. You might also have noticed that the booze industry has broken its decades-old promise not to advertise hard liquor on television. They started doing it a few years ago and they’ve been ramping up since then. They’ve been using product placement and movies like “Bad Moms” and sneaky ads like the one shown in my first comment. The usual message is that parenting is very stressful and alcohol is great for stress and good clean fun, so let loose and have a drink. I also see this way of thinking promoted on Facebook.

Mike’s current Facebook page directs us to an earlier comment that he made about a graph in another article

Mike Miller, September 26, 2014

The article in the Washington Post

Think you drink a lot? This chart will tell you

The graph

[Graph: drink a lot chart]

Mike’s comment

This is nuts! Friends of mine in Epidemiology who studied alcohol consumption told me about this years ago — that most consumption is by alcoholics and the alcohol industry is very aware of it. So the chart shows that half of American adults are drinking an average of about 1.7 drinks per *year* while the upper 10% is drinking 7 times that much per *day*. It looks like about 75% of consumption is by alcoholics.
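Taking the round figures in Mike’s comment at face value, a quick calculation puts both groups on a per-year basis. This is only a sketch: the 1.7 drinks per year and the factor of 7 are the comment’s approximations, not values I have read off the chart, and the variable names are mine.

```python
# Round figures taken from the comment above; they are approximations,
# not values read directly from the Washington Post chart.
BOTTOM_HALF_DRINKS_PER_YEAR = 1.7
TOP_DECILE_MULTIPLE_PER_DAY = 7  # "7 times that much per day"
DAYS_PER_YEAR = 365

# Convert the top decile's daily consumption to a yearly figure.
top_decile_drinks_per_day = BOTTOM_HALF_DRINKS_PER_YEAR * TOP_DECILE_MULTIPLE_PER_DAY
top_decile_drinks_per_year = top_decile_drinks_per_day * DAYS_PER_YEAR

# How many times more the top decile drinks than the bottom half, per person.
ratio = top_decile_drinks_per_year / BOTTOM_HALF_DRINKS_PER_YEAR

print(f"Top decile: about {top_decile_drinks_per_day:.1f} drinks per day")
print(f"Top decile: about {top_decile_drinks_per_year:.0f} drinks per year")
print(f"Ratio to the bottom half's average: about {ratio:.0f} to 1")
```

On these figures, the gap between the two groups is a factor in the thousands, which is easy to miss when one number is quoted per year and the other per day.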

An appeal: Please don’t send your old statistics books to the landfill

I so admire the folks who throw so much energy into the Supercourse, passing out candles to light the world and maybe save it from some bad epidemiology and statistics.


Learn more about it here.

I am reproducing a recent appeal I received from them below

Statistics books are not fake news or alternate facts

However, they need to be used.

Friends, we now have 40 people who have agreed to share their statistics books with the Library of Alexandria and developing countries. We would especially like to encourage you to tell your friends who are on the road to retirement, or who have retired.

Do you want your books to end up in a landfill like this?


As I have aged I have collected over 2000 novels. I randomly selected 10 that I had not read in five years. I could recall very little, and I suspect that this might be the same with your statistical books. There is a fabulous poem called “Forgetfulness” by Billy Collins. Please look at it and consider that most books we read more than 10 years ago have fallen into the magic oblivion called “forgetfulness”. If we have forgotten them and will not use them again, why not help others to learn?

Yes, as we age, we have forgotten much. But we as teachers and researchers have not forgotten how to give back to our future generations of researchers. Now with the BA Serageldin Euclid Library, we can “give back” to the future.

Boxing your statistics books is simple: you can hire your niece, and you will have to do this anyway when you retire. The cardboard boxes are inexpensive, and shipping only costs ~$0.50 per book and is probably tax deductible.

A day will come when your chairperson knocks on your door.

Dear Dr. LaPorte, we have just hired 2 assistant professors, and they will need your 4 x 5 meter office. You can sit in Room 287 with another retired professor in your 2 x 3 foot office. What will you do with your books? Will it be the landfill above? Or will your books reside next to Euclid in the Library of Alexandria, below?

“No one has ever become poor by giving”  (A. Frank)

More information –

The procedure is simple, but we are only taking research methods (statistics, big data, epidemiology) books for the Serageldin Library of Alexandria. Math-stat books are fine, but not pure math, biochemistry, etc. Also, we are not collecting any journals.
1. Identify books you would like to donate (both new books and books of the 20th century)
2. Box them
3. Include in each box the names of the people donating, and a dedication if you would like
4. Please send the books via media/book rate to the warehouse
5. The window will be about 7 days for the warehouse. We will inform you by newsletter.

Contact Us if you would like to be in our donors list.


Third Short Course on Strengthening Causal Inference in Behavioral Obesity Research

On-line Registration:
Held On: Mon 7/24/2017 – Fri 7/28/2017
Location: The University of Alabama at Birmingham

 “Health is a state of complete physical, mental, and social well-being and not merely the absence of disease or infirmity”. (WHO)

I blog at a number of different sites including Quick Thoughts, PLOS blog Mind the Brain, and occasionally Science-Based Medicine. To keep up on my writing and speaking engagements and to get advance notice of e-books and web-based courses, please sign up at

A study of adolescent depression in Lancet Psychiatry that simply doesn’t make sense


  • Serious discrepancy between sample size and numbers appearing in tables.
  • Claim that teens who did not access services were 7x more likely to be depressed three years later is incorrect and apparently depends on misinterpreting an odds ratio as a relative risk.
  • Complex statistical analyses were inappropriate for the modest sample size.
  • Where was the study statistician when this paper was submitted?
  • How were these problems missed by reviewers?

I’m uploading this post on PubPeer and I expect I will have some comments to add later.

Update: This post is getting some attention on social media. Someone observed on Facebook that editors and reviewers mostly don’t bother to read tables. It was certainly true in this instance, but I suspect it is more generally true.

On PubPeer, someone made a similar observation and noticed more irregularities in the tables. I have appended their comment at the end. These irregularities further underscore the need for the journal to make some corrections.

Reduction in adolescent depression after contact with mental health services: a longitudinal cohort study in the UK. Lancet Psychiatry.

I saw problems right away when I examined the abstract, and these were confirmed when I looked at the tables.

What I found was a cohort of 1238 14-year-old adolescents in which it was intended to track reduction in depression following contact with mental health services.

The findings were improbable and even astonishing.


14-year-old adolescents who had contact with mental health services in the past year had a greater decrease in depressive symptoms than those without contact (adjusted coefficient −1·68, 95% CI −3·22 to −0·14; p=0·033). By age 17 years, the odds of reporting clinical depression were more than seven times higher in individuals without contact than in service users who had been similarly depressed at baseline (adjusted odds ratio 7·38, 1·73–31·50; p=0·0069).

“Seven times higher”?!!

I can certainly understand the authors’ advocacy of “improvement of access to adolescent mental health services” but it does not follow from appropriate analyses of data from such a small cohort.

The first and obvious concern is that there is unlikely to be a large enough number of 14-year-old adolescents who are depressed and either contacting or not contacting mental health services to generate reliable multivariate statistics.

The abstract claims that those who were depressed and did not have contact with services were seven times more likely to be depressed at 17 than those who were depressed and had contact. This is both improbable and a misinterpretation of an odds ratio (OR) as though it were a more easily interpretable relative risk (RR).
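To see why an odds ratio of 7.38 need not mean “seven times more likely,” here is a minimal sketch using a widely cited conversion, RR = OR / (1 − p0 + p0 × OR), where p0 is the risk of the outcome in the reference group. The baseline risks below are hypothetical values I chose for illustration; the excerpts quoted here do not give the actual rate of depression at 17 among service users. The conversion is also only approximate when applied to an adjusted odds ratio.

```python
def odds_ratio_to_relative_risk(odds_ratio: float, baseline_risk: float) -> float:
    """Approximate a relative risk from an odds ratio.

    baseline_risk is the outcome risk in the reference group (here, service users).
    Formula: RR = OR / (1 - p0 + p0 * OR).
    """
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)


REPORTED_OR = 7.38  # adjusted odds ratio quoted from the abstract

# Hypothetical baseline risks, for illustration only (not taken from the paper).
ASSUMED_BASELINE_RISKS = [0.05, 0.10, 0.20, 0.30, 0.50]

for p0 in ASSUMED_BASELINE_RISKS:
    rr = odds_ratio_to_relative_risk(REPORTED_OR, p0)
    print(f"assumed baseline risk {p0:.0%}: OR {REPORTED_OR} corresponds to RR of about {rr:.2f}")
```

The odds ratio approximates the relative risk only when the outcome is rare in the reference group. The more common depression at follow-up is among service users, the further the implied relative risk falls below 7.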

I start with the assumptions that

(1) Few depressed adolescents in the community have contact with formal mental health services;

(2) Much of the contact with mental health services is of inadequate quality and intensity; and

(3) To the extent to which their “depression” represents a recurrent episodic disorder, as major depression is, an episode of treatment is followed by relapse and recurrence of one or more episodes, for which services may not be obtained.

There was a gross discrepancy between the number of adolescents (1238) in the cohort and the numbers reported in the tables. For instance, in table 1, the number of adolescents is variable, ranging up to 3302.

A sample size of 1238 may seem like a lot of teens. It’s not large enough to generate and test explanations for effects of contact with mental health services. Because the sample is drawn from the general community, the number of such contacts is likely to be too low for multivariate analyses that use contact as a predictor while controlling for confounds.

We need to recognize that in terms of statistical power, the number of events to explain, not the overall size of the sample, is most important.

In the results section we can readily see the expected reduction from the overall sample size to the number of adolescents with a disorder and service contact:


Of the 1238 participants recruited, 1190 adolescents had data for T1 current mental disorder and past-year mental health service contact (appendix p 6). The number of respondents with complete data for all outcomes and covariates at all timepoints was 995 (84%) for T1, 778 (65%) for T2, and 806 (68%) for T3. 64 (5%) adolescents made past-year contact with mental health services; 126 (11%) had a current mental disorder. Among individuals with a disorder, 48 (38%) reported past-year service contact and 46 (96%) of these contacts were based on T1 past-year recall; 36 (84%) of 43 of these adolescents attended five or more sessions (n=5 had missing data for treatment length). In the disorder-and-services group (n=48), disorders were affective (n=16 [33%]), anxiety (n=10 [21%]), behavioural (n=25 [52%]), and other (n=5 [10%]); 14 (29%) of these participants had a comorbid K-SADS diagnosis (appendix p 9).

Statistical controls. The article reports a full range of statistical controls that might be appropriate for a larger sample, but not for predicting such a small number of events. Overfitting of multivariate equations is likely to produce spurious and inflated results.

Data were adjusted as follows: gender, sociodemographics (ethnic origin, Index of Multiple Deprivation, adolescent living with biological parents), environmental factors (number of stressful life events in the past year, current family dysfunction and friendships, any family-focused adversities by T1), and mental health factors (any past Schedule for Affective Disorders and Schizophrenia for School-Age Children diagnosis, any mental health services after T1, any emotional problems in a family member [past 3 years or present], current antisocial traits). Variables not included were any mental health service referral age 0–13 years (p=0·19 in base model) and pubertal status (not a true confounder as p>0·10 and ρ<0·10 with predictor). MFQ=Mood and Feelings Questionnaire. T1=timepoint 1 (age 14·5 years). T2=timepoint 2 (age 16 years). T3=timepoint 3 (age 17·5 years).

The article later states:

We included nine baseline covariates in the propensity score weighting (table 2). Propensity score weighted GLM revealed that among adolescents with a mental disorder, those without contact with mental health services at T1 had nearly four times the odds of being depressed by T3 compared with those in the disorder-and-services group (table 2). Inclusion of post-baseline confounding variables increased odds by more than five times, and in the common support sample, to more than seven times (table 2).

Again, there is a gross mismatch between heavy-duty statistical analyses and the modest sample to which they are applied. I don’t understand how a statistician would agree to provide these analyses. Perhaps that explains why odds ratios were misinterpreted as relative risks, a common problem when statisticians are not involved in interpreting data.
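The mismatch can be made concrete with a rough events-per-variable (EPV) calculation. This is a sketch under stated assumptions, not a reanalysis. I assume the propensity score model predicts service contact among the 126 adolescents with a disorder, using the nine baseline covariates the article mentions. The counts of adolescents depressed at T3 are hypothetical, because the excerpts quoted here do not report them. A commonly cited rule of thumb asks for roughly 10 events per variable, though that threshold is itself only a heuristic.

```python
def events_per_variable(n_events: int, n_non_events: int, n_predictors: int) -> float:
    """Events per variable: size of the rarer outcome class divided by the number of predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return min(n_events, n_non_events) / n_predictors


# Figures quoted from the article's results section.
N_WITH_DISORDER = 126
N_WITH_CONTACT = 48
N_BASELINE_COVARIATES = 9

# Assumption: the propensity model predicts contact (48) versus no contact (78)
# among the 126 adolescents with a disorder.
n_without_contact = N_WITH_DISORDER - N_WITH_CONTACT
propensity_epv = events_per_variable(N_WITH_CONTACT, n_without_contact, N_BASELINE_COVARIATES)
print(f"Propensity model: {N_WITH_CONTACT} contacts / {N_BASELINE_COVARIATES} covariates "
      f"= {propensity_epv:.1f} events per variable")

# Hypothetical counts of adolescents depressed at T3 (not reported in the excerpts).
# The outcome model is assumed to include the contact indicator plus the covariates.
for assumed_events in (15, 30, 45):
    epv = events_per_variable(
        assumed_events, N_WITH_DISORDER - assumed_events, N_BASELINE_COVARIATES + 1
    )
    print(f"Outcome model, assuming {assumed_events} depressed at T3: {epv:.1f} events per variable")
```

Even before attrition between T1 and T3 is taken into account, these ratios sit well below 10.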

Overall, this study involved a wild mismatch between statistical analyses and sample size. Such a sample size does not even allow testing of the appropriateness of the assumptions that are being made.

In the present study, we considered mental health services from all sectors irrespective of treatment length, we multiply imputed missing data, used propensity score weighting to adjust for participants’ initial likelihood to access services, and data yielded clinically relevant results robust to a wide range of confounds. Contact with mental health services appeared to be of such value that after 3 years the levels of depressive symptoms of service users with a mental disorder were similar to those of unaffected individuals.

I conclude this article is quite misleading and should not remain uncorrected.

From PubPeer:

>>How were these problems missed by reviewers?
Because reviewers don’t read tables or statistics.

At first I thought that the sample sizes like the “imputed” 3302 might be due to counting each of three time points as a case. But then 3302 is too small (1238*3=3714) and 2469 is too big (778*3=2334).

Also, the breakdown of the numbers in the middle of Table 1 (“Categorical analysis of age”) doesn’t work. It’s OK for the right-hand side (2257+140+52=2469) but not the left (2965+202+126=3293, not 3302).

I don’t think the numbers in this study have any credibility at all until this table is sorted out.
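The arithmetic in this comment is easy to recheck. The sketch below uses only the figures as quoted in the comment, which may differ from what is printed in the published table. The labels “left” and “right” are just my names for the two sides of Table 1.

```python
# Figures as quoted in the PubPeer comment; they may differ from the published table.
COHORT_SIZE = 1238
COMPLETE_AT_T2 = 778
N_TIMEPOINTS = 3

reported_totals = {"left": 3302, "right": 2469}
breakdowns = {"left": [2965, 202, 126], "right": [2257, 140, 52]}

# Would counting each participant once per time point explain the totals?
print(f"{COHORT_SIZE} x {N_TIMEPOINTS} = {COHORT_SIZE * N_TIMEPOINTS}")
print(f"{COMPLETE_AT_T2} x {N_TIMEPOINTS} = {COMPLETE_AT_T2 * N_TIMEPOINTS}")

# Do the category counts add up to the reported totals?
for side, counts in breakdowns.items():
    total = sum(counts)
    status = "matches" if total == reported_totals[side] else "does not match"
    print(f"{side}: {' + '.join(map(str, counts))} = {total}, "
          f"which {status} the reported {reported_totals[side]}")
```

Neither multiple reproduces 3302 or 2469, so counting each time point as a case does not explain the totals. With the figures as quoted, the left-hand categories sum to 3293 rather than 3302, as the commenter says. The right-hand categories as quoted sum to 2449 rather than 2469, so either one of those figures was mistyped in the comment or that side of the table does not add up either.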

I will soon be offering e-books providing skeptical looks at mindfulness and positive psychology, as well as scientific writing courses on the web, as I have been doing face-to-face for almost a decade.

Sign up at my new website to get advance notice of the forthcoming e-books and web courses, as well as upcoming blog posts at this and other blog sites. Lots to see at