In the early days of the pandemic, there wasn't great data available, and it wasn't easy to do better than trusting the standard epidemiological estimate that around 2% of people who got COVID-19 would die. My back-of-the-envelope estimate at the time was way higher, but no one else I knew seemed to think that number made sense, so I let the matter drop. But now we have enough data to check.
Recently, my sister reached out to me to check her own thinking on the matter. She used the same method I initially did - simply dividing the number of deaths by the number of resolved cases (deaths + recoveries) - to estimate that in the US, COVID-19 kills around 1 in 6 people who get it.
The problem with using only resolved cases, in a country with an ongoing pandemic, is that if people die faster than they're marked recovered, death rates can be inflated - and if they recover faster, deflated. Ideally, you'd want to wait until all cases have been resolved one way or the other. Fortunately, there are now countries where that situation nearly holds.
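As a minimal sketch of the calculation described above, here is the resolved-case death rate in Python, along with the share of cases still active, which indicates how much room there is for the timing bias. The function names and the counts are hypothetical, chosen only so the result matches the "1 in 6" figure; they are not real data.

```python
def resolved_case_fatality_rate(deaths: int, recoveries: int) -> float:
    """Deaths as a fraction of resolved cases (deaths + recoveries)."""
    resolved = deaths + recoveries
    if resolved <= 0:
        raise ValueError("need at least one resolved case")
    return deaths / resolved


def active_share(confirmed: int, deaths: int, recoveries: int) -> float:
    """Fraction of confirmed cases that are still unresolved."""
    if confirmed <= 0:
        raise ValueError("need at least one confirmed case")
    return (confirmed - deaths - recoveries) / confirmed


# Hypothetical counts for illustration only, not real data.
confirmed, deaths, recoveries = 2000, 100, 500

print(f"{resolved_case_fatality_rate(deaths, recoveries):.1%}")  # 16.7%
print(f"{active_share(confirmed, deaths, recoveries):.1%}")      # 70.0%
```

When the active share is large, as in this made-up example, the resolved-case rate says little about how the remaining cases will turn out. It only becomes trustworthy as the active share approaches zero.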
ETA: The other problem is that cases aren't infections. But if, as I do, you want to use publicly reported case data to make informed personal decisions, you might also be interested in the easier-to-calculate expected deaths per reported case.
I looked at the 25 countries with the lowest active case counts, using the 91-DIVOC visualization tool, to see which ones seemed to be mostly done with the pandemic, at least for now.
I copied the current numbers (as of 7 Jun 2020) into a spreadsheet, to see what the effective death rate is in countries where the vast majority of cases are resolved. For the People's Republic of China, I only looked at numbers from Hubei.
In the countries and regions I looked at, active cases are around 1% of all confirmed cases, and around 5.8% of resolved COVID-19 cases end in death. But two thirds of cases came from Hubei. Excluding China, around 4% of confirmed cases are active, and around 3.5% of resolved cases ended in death. If I only look at other countries where fewer than 1% of confirmed cases are active, the average COVID-19 death rate is 2.6%, though in individual countries it ranges from 0.55% for Iceland to 4.67% for Croatia.
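One reason the figures above shift so much when Hubei is excluded is that a pooled rate (total deaths over total resolved cases) is dominated by the largest region, while an unweighted average of per-region rates is not. The sketch below illustrates the difference. The region names and counts are invented for illustration and are not the spreadsheet's numbers.

```python
from statistics import mean

# (deaths, recoveries) per region. Hypothetical values, not real data.
regions = {
    "Region A": (450, 5550),  # one large region, two thirds of all cases
    "Region B": (10, 990),
    "Region C": (40, 1960),
}


def pooled_rate(data: dict[str, tuple[int, int]]) -> float:
    """Total deaths divided by total resolved cases across all regions."""
    total_deaths = sum(d for d, _ in data.values())
    total_resolved = sum(d + r for d, r in data.values())
    return total_deaths / total_resolved


def unweighted_mean_rate(data: dict[str, tuple[int, int]]) -> float:
    """Simple average of each region's own death rate."""
    return mean(d / (d + r) for d, r in data.values())


without_a = {name: v for name, v in regions.items() if name != "Region A"}

print(f"pooled, all regions:   {pooled_rate(regions):.2%}")
print(f"unweighted mean:       {unweighted_mean_rate(regions):.2%}")
print(f"pooled, excluding A:   {pooled_rate(without_a):.2%}")
```

Which of the two is more useful depends on the question: the pooled rate describes the average case in the sample, while the unweighted mean describes the average region.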
I'm adjusting my freak-out-and-hole-up threshold accordingly.
UPDATE: In the comments, Anna Salamon shared this estimate of the actual infection fatality rate for the state of New York.
Infection fatality rate (IFR) is clearly the superior metric if you're trying to forecast something like the spread of the virus or total death counts, because it corresponds more directly to a statement about underlying reality than the case fatality rate (CFR) does; the denominator of CFR is determined in part by who gets tested.
But if you're trying to figure out what rough-and-ready multiplier to apply to the daily numbers reported in your area, then to use IFR estimates, you need to remember that reported cases are not the same as actual infections, and adjust accordingly.
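A rough sketch of that adjustment: if only some fraction of infections ever show up as reported cases, then the expected deaths per reported case is the IFR divided by that fraction. The function names and the example values below (an IFR of 1% and one in four infections reported) are assumptions for illustration, not estimates from any source, and the sketch ignores the lag between a case being reported and a death.

```python
def deaths_per_reported_case(ifr: float, reported_fraction: float) -> float:
    """Expected deaths per reported case.

    ifr: deaths per infection, between 0 and 1.
    reported_fraction: share of infections that become reported cases,
        greater than 0 and at most 1 (an assumed input, hard to measure).
    """
    if not 0 <= ifr <= 1:
        raise ValueError("ifr must be between 0 and 1")
    if not 0 < reported_fraction <= 1:
        raise ValueError("reported_fraction must be in (0, 1]")
    return ifr / reported_fraction


def expected_deaths(reported_cases: int, ifr: float, reported_fraction: float) -> float:
    """Multiplier applied to a reported case count (ignores reporting lag)."""
    return reported_cases * deaths_per_reported_case(ifr, reported_fraction)


# Assumed inputs for illustration only.
multiplier = deaths_per_reported_case(ifr=0.01, reported_fraction=0.25)
print(f"{multiplier:.1%} of reported cases expected to end in death")
print(f"{expected_deaths(200, 0.01, 0.25):.0f} expected deaths per 200 reported cases")
```

The result is only as good as the guess at the reported fraction, which is why a multiplier derived directly from resolved reported cases can be easier to use day to day.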
Do we know the definition of a "case"? Is it the same across countries? Do we know what fraction of infections result in cases? This last should be determinable by looking only at infections discovered via trace-and-test, but I don't know where to find that data, and it'll be vulnerable to varying definitions of case.
I agree: if many (most?) infections are relatively asymptomatic and undiagnosed, that's going to skew the reported cases toward the severe ones and inflate the case fatality rate.
Perhaps it would be helpful to look at serological immunity testing for an area and compare that to deaths, adjusting for some of the same timing issues.
Do you know of reported data for that?
This is a big problem, and one I don't know how to solve. My best guess is that the US isn't a strong outlier here, so that we should expect broadly similar fatalities per case, even if it's unclear how to interpret cases per se. Hawaii, which seems to be in a similar position to the countries I looked at, has a case fatality rate of around 2.5%, which is about in line.
Thanks for this data. One estimate that combines antibody testing with total excess mortality in NYC arrives at an IFR of 1.4%. I'm curious what your thoughts on it are. https://www.worldometers.info/coronavirus/coronavirus-death-rate/ I'm currently inclined to trust that more than the above case fatality rates, due to expecting undercounting of lower-severity cases in official case tracking, but I found your analysis helpful anyhow, and would be interested to know if you think I'm missing something important here.
Thanks for the link! It seems like a good first shot at estimating the NY IFR. In the absence of widespread frequent serological testing on random samples I'm not sure how to use the IFR to make decisions (whereas case rates are reported daily).
Some ways that model could be improved would be with some model of how representative the sample is; if you go find people at grocery stores, you're eliminating both anyone who's too ill to be out and about, and anyone who's holed up at home or outside the city. My guess is that the latter factor matters more, but doesn't cause the figure to be off by worse than a factor of 2. I have other little quibbles, but again, nothing that would cause this to be off by worse than a factor of 2, and not all pointing in the same direction.
If I were trying to do something fine-grained, what I'd do is:
(1) Calibrate daily reported data against these #s to get a more precise estimate of risk per reported daily case.
(2) Track and reconcile anomalies in daily case and death reports more broadly to get a handle on whether I should be using heterogeneous multipliers.
I think Zvi has done some of that, not sure how much made it into blog posts but I think a fair amount of it did.
Early estimates were an IFR of 1%, and subsequent estimates were basically the same. The scientists got this one right.
I have no idea why you would attempt to calculate your own estimates based on local data to decide whether or not to freak out.
The death rate for people who are hospitalised is only relevant if and when you become hospitalised. If you're currently not hospitalised, freaking out over the possibility of becoming hospitalised seems - silly. The IFR is the important one.