Estimating the infection and case fatality ratio for coronavirus disease (COVID-19) using age-adjusted data from the outbreak on the Diamond Princess cruise ship, February 2020

Adjusting for delay from confirmation to death, we estimated case and infection fatality ratios (CFR, IFR) for coronavirus disease (COVID-19) on the Diamond Princess ship as 2.6% (95% confidence interval (CI): 0.89–6.7) and 1.3% (95% CI: 0.38–3.6), respectively. Comparing deaths on board with expected deaths based on naive CFR estimates from China, we estimated CFR and IFR in China to be 1.2% (95% CI: 0.3–2.7) and 0.6% (95% CI: 0.2–1.3), respectively.

After performing the sensitivity analysis, we find that the confidence intervals originally determined by a 95% exact binomial test (calculated using the number of cases and the "known outcomes" quantity derived in the correction) widen slightly at the top end only. We therefore report the wider CI and use it in any subsequent calculation.
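As an illustration of the kind of interval referred to above, the following is a minimal sketch of a 95% exact (Clopper-Pearson) binomial interval in pure Python. The function names and the counts in the usage line are illustrative assumptions, not the values used in the analysis. The sketch also assumes an integer denominator, whereas the "known outcomes" quantity need not be an integer.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def bisect_root(f, lo=0.0, hi=1.0, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_binomial_ci(deaths, n, level=0.95):
    """Clopper-Pearson interval for a proportion deaths / n (n an integer)."""
    alpha = 1 - level
    # Lower limit: the p at which P(X >= deaths) equals alpha / 2.
    if deaths == 0:
        lower = 0.0
    else:
        lower = bisect_root(lambda p: (1 - binom_cdf(deaths - 1, n, p)) - alpha / 2)
    # Upper limit: the p at which P(X <= deaths) equals alpha / 2.
    if deaths == n:
        upper = 1.0
    else:
        upper = bisect_root(lambda p: binom_cdf(deaths, n, p) - alpha / 2)
    return lower, upper

# Illustrative counts only (assumed, not taken from the data).
lower, upper = exact_binomial_ci(deaths=5, n=100)
print(f"point estimate 5.0%, 95% CI: {100 * lower:.2f}%-{100 * upper:.2f}%")
```

The interval is asymmetric around the point estimate when the number of deaths is small, which is one reason the upper end is the more sensitive one.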

Non-truncated distribution
When fitting the hospitalisation-to-death distribution to data, Linton et al. accounted for right-truncation of the data [1]. This truncated distribution is most likely a more accurate estimate of the true distribution, which is why it was used in the analysis reported in the main text. However, for completeness, we present the difference between the two distributions here [Figure S1] and the effect this difference has on the cCFR and cIFR calculated on the cruise ship [Table S1]. As was to be expected, a shorter mean delay leaves fewer outcomes still unknown, meaning that the correction does not increase the naïve estimate by as much. Therefore, using the truncated distribution (with a higher mean and standard deviation) in the calculation results in higher values for the cCFR and cIFR.
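The direction of this effect can be sketched in a few lines of Python. The sketch assumes a lognormal delay distribution parameterised by its mean and standard deviation, and it uses placeholder delay parameters and a made-up daily case series; none of these are the fitted values from [1] or the cruise ship data. It shows only that a longer delay yields fewer "known outcomes" and hence a larger corrected estimate.

```python
from math import erf, log, sqrt

def lognormal_cdf(x, mean, sd):
    """CDF of a lognormal distribution given its mean and standard deviation."""
    if x <= 0:
        return 0.0
    sigma2 = log(1 + (sd / mean) ** 2)
    mu = log(mean) - sigma2 / 2
    return 0.5 * (1 + erf((log(x) - mu) / sqrt(2 * sigma2)))

def known_outcomes(daily_cases, mean, sd):
    """Expected number of cases whose outcome is known by the last day."""
    last_day = len(daily_cases) - 1
    return sum(
        cases * lognormal_cdf(last_day - day, mean, sd)
        for day, cases in enumerate(daily_cases)
    )

def corrected_cfr(deaths, daily_cases, mean, sd):
    """Deaths divided by known outcomes rather than by all cases."""
    return deaths / known_outcomes(daily_cases, mean, sd)

# Hypothetical inputs, for illustration only.
daily_cases = [5, 10, 20, 40, 60, 80, 60, 40, 20, 10, 5, 5, 5, 5]
deaths = 5

naive = deaths / sum(daily_cases)
shorter = corrected_cfr(deaths, daily_cases, mean=8.0, sd=6.0)    # placeholder "non-truncated"
longer = corrected_cfr(deaths, daily_cases, mean=13.0, sd=12.0)   # placeholder "truncated"
print(f"naive {naive:.3f}, shorter delay {shorter:.3f}, longer delay {longer:.3f}")
```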

Indirect standardisation
We standardise the age-stratified estimates for the CFR in China using what we believe to be an estimate of the effect of the many biases present in such a value when it is estimated during an ongoing outbreak. Arguably the largest such bias is the underreporting of cases, which is inevitable in a country with an overwhelmed healthcare system. To this end, we treat the ratio between the CFR calculated from the observed number of deaths on the Diamond Princess cruise ship and the CFR implied by the expected number of deaths (if the nCFR in China had been true on the cruise ship) as the scaling factor by which we adjust the China data. In doing so, we are able to use all of the information in the age-stratified China data, which have a large sample size, along with the information in our CFR estimates calculated in a setting with no underreporting bias.
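The scaling step described above can be sketched as follows. The age bands, case counts and nCFR values are hypothetical placeholders chosen to make the arithmetic easy to follow, and the function names are our own; this is an outline of the calculation under those assumptions, not the code used for the analysis.

```python
def scaling_factor(ship_cfr, ship_cases_by_age, china_ncfr_by_age):
    """Ratio of the CFR observed on the ship to the CFR expected under China's nCFR."""
    total_cases = sum(ship_cases_by_age.values())
    # Deaths expected on the ship if China's age-specific nCFR had applied there.
    expected_deaths = sum(
        cases * china_ncfr_by_age[age] for age, cases in ship_cases_by_age.items()
    )
    expected_cfr = expected_deaths / total_cases
    return ship_cfr / expected_cfr

def adjust_china_cfr(china_ncfr_by_age, factor):
    """Apply the same scaling factor to every age-specific nCFR."""
    return {age: ncfr * factor for age, ncfr in china_ncfr_by_age.items()}

# Hypothetical placeholder inputs (not the real data).
ship_cases_by_age = {"0-59": 200, "60-69": 200, "70+": 200}
china_ncfr_by_age = {"0-59": 0.01, "60-69": 0.04, "70+": 0.10}
ship_cfr = 0.025  # assumed corrected CFR on the ship

factor = scaling_factor(ship_cfr, ship_cases_by_age, china_ncfr_by_age)
adjusted = adjust_china_cfr(china_ncfr_by_age, factor)
print(factor, adjusted)
```

With these placeholder inputs the expected CFR on the ship is 30/600 = 5%, so an observed CFR of 2.5% gives a scaling factor of 0.5, and every age-specific value is halved.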

Limitations
We assumed that the hospitalisation-to-death delay is equivalent to the confirmation-to-death delay because the data were reported by date of confirmation [1]. We implicitly test how sensitive our estimates are to this assumption by bootstrapping over the uncertainty range given in Linton et al. [1] and by calculating the estimates using both the truncated and non-truncated distributions. Although this is an inexact and indirect way to test the sensitivity of the estimates to this assumption, it makes very little difference to the estimates at the two-significant-figure level of precision reported.

More recent data
We performed the analysis based on the data available on 5 March, using primarily the data in [2,3], as we required symptomatic/asymptomatic-level data to estimate the IFR as well as the CFR. At that point, there had been a total of 7 deaths and 634 cases, of which we have the breakdown of symptomatic to asymptomatic cases for 619 [2,3]. Since then, there have been a further two deaths and 62 more cases. We are entering the phase where the correction has little left to correct for, given that most outcomes are now known. This gives us a unique opportunity to test our corrected value against the naïve calculation, since the two should converge after enough time. Using the currently available data, we find an nIFR of 1.6% (95% CI: 0.79–2.8%), which is consistent with our more accurate estimate calculated using the truncated distribution for all ages.