Protocol for the development and evaluation of a tool for predicting risk of short-term adverse outcomes due to COVID-19 in the general UK population
Hippisley-Cox J., Clift AK., Coupland C., Keogh R., Diaz-Ordaz K., Williamson E., Harrison EM., Hayward A., Hemingway H., Horby P., Mehta N., Benger J., Khunti K., Speigelhalter D., Sheikh A., Valabhji J., Lyons RA., Robson J., Semple C., Kee F., Johnson P., Jebb S., Williams T., Coggon D.
<jats:title>Abstract</jats:title><jats:sec><jats:title>Introduction</jats:title><jats:p>Novel coronavirus disease 2019 (COVID-19) has caused a global pandemic with significant health, economic and social costs. Emerging evidence suggests that several factors may be associated with increased risk of severe outcomes or death from COVID-19. Clinical risk prediction tools have significant potential to generate individualised assessments of risk and may be useful for population stratification and other use cases.</jats:p></jats:sec><jats:sec><jats:title>Methods and analysis</jats:title><jats:p>We will use a prospective open cohort study of routinely collected data from 1205 general practices in England in the QResearch database. The primary outcome in adults is COVID-19 mortality (in or out of hospital), defined as confirmed or suspected COVID-19 mentioned on the death certificate, or death occurring in a person with SARS-CoV-2 infection, between 24<jats:sup>th</jats:sup> January and 30<jats:sup>th</jats:sup> April 2020. We will also examine COVID-19 hospitalisation in children. Time-to-event models will be developed in the training data to derive separate risk equations for adult males and females (19-100 years), estimating the risk of each outcome within the 3-month follow-up period (24<jats:sup>th</jats:sup> January to 30<jats:sup>th</jats:sup> April 2020) while accounting for competing risks. Predictors considered will include age, sex, ethnicity, deprivation, smoking status, alcohol intake, body mass index, pre-existing medical co-morbidities, and concurrent medication. Measures of performance (prediction errors, calibration and discrimination) will be determined in the test data for men and women separately and by ten-year age group. 
For children, descriptive statistics will be reported if there are currently too few serious events to allow development of a risk model. The final model will be externally evaluated in (a) geographically separate practices and (b) other relevant datasets as they become available.</jats:p></jats:sec><jats:sec><jats:title>Ethics and dissemination</jats:title><jats:p>The project has ethical approval and the results will be submitted for publication in a peer-reviewed journal.</jats:p></jats:sec><jats:sec><jats:title>Strengths and limitations of the study</jats:title><jats:list list-type="bullet"><jats:list-item><jats:p>The individual-level linkage of general practice, Public Health England testing, Hospital Episode Statistics and Office for National Statistics death register datasets enables robust and accurate ascertainment of outcomes</jats:p></jats:list-item><jats:list-item><jats:p>The models will be trained and evaluated in population-representative datasets of millions of individuals</jats:p></jats:list-item><jats:list-item><jats:p>Shielding for clinically extremely vulnerable people was advised and in place during the study period, therefore risk predictions influenced by the presence of some ‘shielding’ conditions may require careful consideration</jats:p></jats:list-item></jats:list></jats:sec>
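The methods above refer to estimating outcome risk over follow-up while accounting for competing risks. As an illustration only, the following minimal sketch computes a nonparametric (Aalen-Johansen) cumulative incidence curve for one cause in the presence of a competing cause. It is not the protocol's modelling approach, which derives covariate-based risk equations; the function name `cumulative_incidence`, the coding of causes (0 = censored, 1 = event of interest, any other value = competing event) and the toy data are assumptions made for this example.

```python
from collections import Counter


def cumulative_incidence(times, causes, cause_of_interest=1):
    """Aalen-Johansen cumulative incidence for one cause (illustrative sketch).

    times  : follow-up time for each person (e.g. days since cohort entry)
    causes : 0 = censored, cause_of_interest = event of interest,
             any other non-zero value = competing event (assumed coding)
    Returns a list of (time, cumulative incidence) pairs, one per time at
    which any event (of interest or competing) occurred.
    """
    if len(times) != len(causes):
        raise ValueError("times and causes must have the same length")

    # Tally how many people leave follow-up at each time, by cause.
    counts_by_time = {}
    for t, c in zip(times, causes):
        counts_by_time.setdefault(t, Counter())[c] += 1

    at_risk = len(times)
    event_free = 1.0  # probability of no event of any cause before time t
    cif = 0.0
    curve = []
    for t in sorted(counts_by_time):
        counts = counts_by_time[t]
        n_interest = counts[cause_of_interest]
        n_any = sum(n for c, n in counts.items() if c != 0)
        if n_any > 0:
            # Only people still event-free can experience the event of interest.
            cif += event_free * n_interest / at_risk
            event_free *= 1.0 - n_any / at_risk
            curve.append((t, cif))
        # Everyone observed at time t (events and censorings) leaves the risk set.
        at_risk -= sum(counts.values())
    return curve


# Hypothetical toy data: five people, two events of interest, one competing event.
toy_times = [1, 2, 3, 4, 5]
toy_causes = [1, 2, 0, 1, 0]
toy_curve = cumulative_incidence(toy_times, toy_causes)
```

Treating a competing event as censoring and using one minus the Kaplan-Meier estimate would overstate the risk of the event of interest, which is one reason competing risks are handled explicitly.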