Predicting COVID-19 related death using the OpenSAFELY platform
The OpenSAFELY Collaborative, Williamson E., Tazare J., Bhaskaran K., McDonald H., Walker A., Tomlinson L., Wing K., Bacon S., Bates C., Curtis H., Forbes H., Minassian C., Morton C., Nightingale E., Mehrkar A., Evans D., Nicholson B., Leon D., Inglesby P., MacKenna B., Davies N., DeVito N., Drysdale H., Cockburn J., Hulme W., Morley J., Douglas I., Rentsch C., Mathur R., Wong A., Schultze A., Croker R., Parry J., Hester F., Harper S., Grieve R., Harrison D., Steyerberg E., Eggo R., Diaz-Ordaz K., Keogh R., Evans SJW., Smeeth L., Goldacre B.
Objectives: To compare approaches for obtaining relative and absolute estimates of the risk of 28-day COVID-19 mortality for adults in the general population of England, in the context of changing levels of circulating infection.
Design: Three designs were compared: (A) a case-cohort design, which does not explicitly account for the time-changing prevalence of COVID-19 infection; (B) 28-day landmarking, a series of sequential overlapping sub-studies incorporating time-updating proxy measures of the prevalence of infection; and (C) daily landmarking. Regression models were fitted to predict 28-day COVID-19 mortality.
Setting: Working on behalf of NHS England, we used clinical data from adult patients from all regions of England held in the TPP SystmOne electronic health record system, linked to Office for National Statistics (ONS) mortality data, using the OpenSAFELY platform.
Participants: Eligible participants were adults aged 18 or over, registered on 1 March 2020 at a general practice using TPP software, with recorded sex, postcode and ethnicity. In total, 11,972,947 individuals were included, of whom 7,999 experienced a COVID-19 related death. The study period lasted 100 days, ending on 8 June 2020.
Predictors: A range of demographic characteristics and comorbidities were used as potential predictors. Local infection prevalence was estimated with three proxies: a modelled estimate based on local prevalence and other key factors; the rate of A&E COVID-19 related attendances; and the rate of suspected COVID-19 cases in primary care.
Main outcome measures: COVID-19 related death.
Results: All models discriminated well between patients who did and did not experience COVID-19 related death, with C-statistics ranging from 0.92 to 0.94. Accurate estimates of absolute risk required data on local infection prevalence, with modelled estimates providing the best performance.
Conclusions: Reliable estimates of absolute risk need to incorporate changing local prevalence of infection. Simple models can provide very good discrimination and may simplify implementation of risk prediction tools in practice.
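To make the landmarking idea concrete, the following is a minimal sketch in Python and not the study's actual code. It stacks overlapping sub-studies, each starting on a landmark day, keeps only people still at risk on that day, attaches a time-updated prevalence proxy, and records death within the 28-day horizon. It then computes a C-statistic as the proportion of concordant case/non-case pairs, which for a binary outcome equals the area under the ROC curve. All names (`build_landmark_rows`, `c_statistic`), the toy data, the landmark spacing, and the toy risk score are assumptions made for illustration; the sketch also ignores censoring for non-COVID death or deregistration.

```python
def build_landmark_rows(people, prevalence, landmark_days, horizon=28):
    """Stack one row per person per landmark day at which they are still at risk.

    people: list of dicts with hypothetical keys id, age, region, death_day
            (death_day is None if no COVID-19 related death was observed).
    prevalence: dict mapping (region, day) to an infection prevalence proxy.
    """
    rows = []
    for start in landmark_days:
        for person in people:
            death_day = person["death_day"]
            # Anyone who died on or before the landmark day is no longer at risk.
            if death_day is not None and death_day <= start:
                continue
            died = death_day is not None and death_day <= start + horizon
            rows.append({
                "id": person["id"],
                "landmark_day": start,
                "age": person["age"],
                # Time-updated covariate: the proxy as of this landmark day.
                "prevalence": prevalence[(person["region"], start)],
                "died_within_horizon": int(died),
            })
    return rows


def c_statistic(scores, outcomes):
    """Proportion of case/non-case pairs in which the case has the higher score.

    Tied scores count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                concordant += 1.0
            elif case_score == other_score:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Toy data, invented purely for illustration.
people = [
    {"id": 1, "age": 82, "region": "A", "death_day": 40},
    {"id": 2, "age": 45, "region": "A", "death_day": None},
    {"id": 3, "age": 70, "region": "B", "death_day": 10},
]
prevalence = {
    ("A", 0): 0.001, ("A", 14): 0.005, ("A", 28): 0.01,
    ("B", 0): 0.002, ("B", 14): 0.01, ("B", 28): 0.02,
}

# Landmark days 14 days apart with a 28-day horizon, so sub-studies overlap.
rows = build_landmark_rows(people, prevalence, landmark_days=[0, 14, 28])

# Toy risk score (age times prevalence); a real analysis would fit a regression model.
scores = [r["age"] * r["prevalence"] for r in rows]
outcomes = [r["died_within_horizon"] for r in rows]
c = c_statistic(scores, outcomes)
print(len(rows), round(c, 3))
```

In this toy example person 1 appears in all three sub-studies, with a different prevalence value and a different outcome each time. That is the sense in which landmarking lets the predicted risk track changing levels of circulating infection.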