Sample means, how do they work?

You know how people make public health decisions about food fortification, and medical decisions about taking supplements, based on things like the Recommended Dietary Allowance?

Well, there's an article in Nutrients titled "A Statistical Error in the Estimation of the Recommended Dietary Allowance for Vitamin D." This paper says the following about the analysis used to establish the US Recommended Dietary Allowance (RDA) for vitamin D:

The correct interpretation of the lower prediction limit is that 97.5% of study averages are predicted to have values exceeding this limit. This is essentially different from the IOM’s conclusion that 97.5% of individuals will have values exceeding the lower prediction limit.

The whole point of looking at averages is that individuals vary a lot due to a bunch of random stuff, but if you take an average of a lot of individuals, that cancels out most of the noise, so the average varies hardly at all. How much variation there is from individual to individual determines the population variance. How much variation you'd expect in your average due to statistical noise from sample to sample determines what we call the variance of the sample mean.

When you look at frequentist statistical confidence intervals, they are generally expressing how big the ordinary range of variation is for your average. For instance, with a 90% confidence interval, 90% of the time your average will be no farther from the "true" average than the boundaries of the interval are from your average. This is relevant for answering questions like, "does this trend look a lot bigger than you'd expect from random chance?" The whole point of looking at large samples is that the errors have a chance to cancel out, leading to a very small random variation in the mean, relative to the variation in the population. This allows us to be confident that even fairly small differences in the mean are unlikely to be due to random noise.
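
Here's a minimal simulation sketch of that cancellation, in Python. All of the numbers (a population mean of 50, a standard deviation of 20, samples of 100 people) are made up for illustration and have nothing to do with the vitamin D data. It just compares how much individuals vary with how much sample averages vary.

```python
import random
import statistics


def simulate(pop_mean=50.0, pop_sd=20.0, n=100, n_studies=2000, seed=0):
    """Return (spread of individuals, spread of sample means).

    All parameters are hypothetical; the population is assumed normal.
    """
    rng = random.Random(seed)

    # Run many "studies", each averaging n individuals.
    sample_means = []
    for _ in range(n_studies):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        sample_means.append(statistics.fmean(sample))

    # For comparison, draw the same number of single individuals.
    individuals = [rng.gauss(pop_mean, pop_sd) for _ in range(n_studies)]

    return statistics.stdev(individuals), statistics.stdev(sample_means)


sd_individuals, sd_means = simulate()
print(f"spread of individuals:  {sd_individuals:.2f}")
print(f"spread of sample means: {sd_means:.2f}")
```

With samples of 100, the averages should come out roughly ten times less spread out than the individuals, since the standard deviation of a sample mean is the population standard deviation divided by the square root of the sample size.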

The error here was taking the statistical properties of the mean and assuming that they applied to the population. In particular, the IOM looked at the dose-response curve for vitamin D, and came up with a distribution for the average response to vitamin D dosage. Based on their data, if you did another study like theirs on new data, it ought to predict that 600 IU of vitamin D is enough for the average person 97.5% of the time.

They concluded from this that 97.5% of people get enough vitamin D from 600 IU.
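
To see how far apart those two claims can be, here's a rough sketch in Python. It assumes normally distributed serum levels and uses invented numbers (a mean of 60 nmol/L, an individual standard deviation of 18, studies of 100 people). The paper's actual analysis is a regression across studies, which this doesn't try to reproduce.

```python
from statistics import NormalDist

# Hypothetical numbers, chosen only for illustration.
mean_level = 60.0      # average serum 25(OH)D at some fixed dose, nmol/L
sd_individual = 18.0   # person-to-person standard deviation
n_per_study = 100      # people averaged in each study

individuals = NormalDist(mean_level, sd_individual)
study_means = NormalDist(mean_level, sd_individual / n_per_study ** 0.5)

# Lower limit that 97.5% of *study averages* are expected to exceed.
lower_limit = study_means.inv_cdf(0.025)

# Fraction of study averages vs. fraction of individuals above that limit.
frac_means_above = 1 - study_means.cdf(lower_limit)
frac_individuals_above = 1 - individuals.cdf(lower_limit)

# The limit that 97.5% of *individuals* would actually exceed.
individual_limit = individuals.inv_cdf(0.025)

print(f"limit for study averages: {lower_limit:.1f} nmol/L")
print(f"study averages above it:  {frac_means_above:.1%}")
print(f"individuals above it:     {frac_individuals_above:.1%}")
print(f"limit for individuals:    {individual_limit:.1f} nmol/L")
```

Under these made-up assumptions, only a bit more than half of individuals clear the limit that 97.5% of study averages clear, and the limit that actually covers 97.5% of individuals sits far lower.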

This is not an arcane detail. This is confusing the attributes of a population with the attributes of an average. This is bad. This is real, real bad. In any sane world, this is mathematical statistics 101 stuff. I can imagine that someone who's heard about a margin of error a lot doesn't understand this stuff, but anyone who has to actually use the term should understand this.

Political polling is a simple example. Let's say that a poll shows 48% of Americans voting for the Republican and 52% for the Democrat, with a 5% margin of error. This means that 95% of polls like this one are expected to have an average within 5 percentage points of the true average. This does not mean that 95% of individual Americans have somewhere between a 43% and 53% chance of voting for the Republican. Most of them are almost definitely decided on one candidate or the other. The average does not behave the same as the population. That's how fundamental this error is – it's like saying that all voters are undecided because the population is split.
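
The polling case is easy to simulate. The sketch below is Python with assumed numbers: every voter is fully decided, 48% of them for the Republican, and each poll asks 384 people (about the sample size that gives a 5-point margin of error at 95% confidence). It counts how often a poll's average lands within the margin.

```python
import random


def run_polls(true_share=0.48, poll_size=384, n_polls=2000, seed=1):
    """Simulate polls of fully decided voters; return each poll's R share.

    true_share, poll_size, and n_polls are hypothetical illustration values.
    """
    rng = random.Random(seed)
    shares = []
    for _ in range(n_polls):
        # Each respondent is a 0 or a 1: nobody is "48% Republican".
        votes = sum(1 for _ in range(poll_size) if rng.random() < true_share)
        shares.append(votes / poll_size)
    return shares


shares = run_polls()
margin = 0.05
within = sum(abs(s - 0.48) <= margin for s in shares) / len(shares)
print(f"polls within 5 points of the true share: {within:.1%}")
```

Roughly 95% of the simulated polls should land within the margin, even though no individual voter in the simulation has anything other than a 0% or 100% chance of voting Republican.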

Remember the famous joke about how the average family has two and a half kids? It's a joke because no one actually has two and a half kids. That's how fundamental this error is – it's like saying that there are people who have an extra half child hopping around.

And this error caused actual harm:

The public health and clinical implications of the miscalculated RDA for vitamin D are serious. With the current recommendation of 600 IU, bone health objectives and disease and injury prevention targets will not be met. This became apparent in two studies conducted in Canada where, because of the Northern latitude, cutaneous vitamin D synthesis is limited and where diets contribute an estimated 232 IU of vitamin D per day. One study estimated that despite Vitamin D supplementation with 400 IU or more (including dietary intake that is a total intake of 632 IU or more) 10% of participants had values of less than 50 nmol/L. The second study reported serum 25(OH)D levels of less than 50 nmol/L for 15% of participants who reported supplementation with vitamin D. If the RDA had been adequate, these percentages should not have exceeded 2.5%. Herewith these studies show that the current public health target is not being met.

Actual people probably got hurt because of this. Some likely died.

This is also an example of scientific journals serving their intended purpose of pointing out errors, but it should never have gotten this far. This is a "send a coal-burning engine under the control of a drunk engineer into the Taggart tunnel when the ventilation and signals are broken" level of negligence. I think of the people using numbers as the reliable ones, but that's not actually enough – you have to think with them, you have to be trying to get the right answer, you have to understand what the numbers mean.

I can imagine making this mistake in school, when it's low stakes. I can imagine making this mistake on my blog. I can imagine making this mistake at work if I'm far behind on sleep and on a very tight deadline. But if I were setting public health policy? If I were setting the official RDA? I'd try to make sure I was right. And I'd ask the best quantitative thinkers I know to check my numbers.

The article was published in 2014, and as far as I can tell, as of the publication of this blog post, the RDA is unchanged.

And I don't wanna talk to a scientist
Y'all motherfuckers lying, and getting me pissed

-Insane Clown Posse

(Cross-posted to LessWrong.)

10 thoughts on “Sample means, how do they work?”

Romeo Stevens November 19, 2016 at 3:36 pm

When the RDA was updated for sodium the IoM produced a report on the evidence behind current recommendations, and noted that the FDA asked them to write the bottom line first (find evidence that a reduction would be good). I wish I was joking, and I also wish people were as horrified by this as they should be. Despite this, the IoM was unable to find evidence that a reduction would be good, and a look at the data very strongly suggests that the current recommendations (1500mg) are causing tons of hospitalizations in seniors put on a low salt diet for high blood pressure (controlling for the reduction in hospitalizations for BP from sodium restriction). Many of them through passing out and falling, which is a big mortality correlate (it gets trickier than actual causation since so many hospitalizations of the elderly have proximate causes followed by complications leading to death.)

Nick T November 20, 2016 at 2:56 pm

Do you have any idea what the FDA's motivation was for that bottom line?

1. jimrandomh November 20, 2016 at 4:07 pm
  
  Do you have a link to the report and comment? That would be a very good thing to be able to show people.
  
  1. vaniver November 21, 2016 at 1:24 pm
    
    The relevant report is probably this one from 2005, but I didn't see that comment in a quick skim of the chapter. https://www.nap.edu/read/10925/chapter/8
    
    Might have been pushed to an appendix or something.
    
  2. Romeo Stevens November 21, 2016 at 7:45 pm
    
    This is a summary, having a hard time finding the original systematic review the IoM generated. Will keep looking.
    https://www.nationalacademies.org/hmd/~/media/Files/Report%20Files/2013/Sodium-Intake-Populations/SodiumIntakeinPopulations_RB.pdf
    
    1. Romeo Stevens November 21, 2016 at 7:52 pm
      
      I'm actually finding this weird. I distinctly remember a report from the IoM that stated that they were unable to find the evidence for <2400mg/day being helpful and that after back and forth with the FDA on the issue the IoM said fine and lowered the UL to 2300mg but reiterated that there was no evidence that 1500mg was good. And now I can't find it.
      The link Vaniver posts seems to be an updated version that looks significantly different from what I originally read.
    2. Romeo Stevens November 21, 2016 at 7:56 pm
      
      and all references to the old recommended upper limit of 2400mg are missing or seem to have been replaced with 2300.

Julia K. November 20, 2016 at 5:10 pm

Like a significant minority of the population, I have genetic defects in my vitamin D receptors.

I work outdoors in Texas (albeit with sunscreen and a hat), and I have light skin, so I get plenty of sun exposure. Yet I still have to take 5,000 IU of Vitamin D3 daily in order to get up to the normal range on serum 25(OH)D tests, which I get checked once or twice a year (since D is fat-soluble, it is possible to be harmed by too much, unlike water-soluble vitamins like C).

Individuals differ from the average. Individuals differ from the average. Individuals differ from the average. Often, it's for genetic reasons that are now easy to test for and compensate for with supplements.

Gary November 20, 2016 at 9:30 pm

Scary and sad. I'm surprised no one has written some NLP algos to crawl papers and look for these sorts of errors

Benquo (post author) November 21, 2016 at 9:43 am

You're somebody!

Compass Rose

The territory is a map of the map.
