Identification of patients undergoing chronic kidney replacement therapy in primary and secondary care data: validation study based on OpenSAFELY and UK Renal Registry.
Santhakumaran S., Fisher L., Zheng B., Mahalingasivam V., Plumb L., Parker EP., Steenkamp R., Morton C., Mehrkar A., Bacon S., Lyon S., Konstant-Hambling R., Goldacre B., MacKenna B., Tomlinson LA., Nitsch D.
OBJECTIVE: To validate primary and secondary care codes in electronic health records to identify people receiving chronic kidney replacement therapy based on gold standard registry data.

DESIGN: Validation study using data from OpenSAFELY and the UK Renal Registry, with the approval of NHS England.

SETTING: Primary and secondary care electronic health records from people registered at 45% of general practices in England on 1 January 2020, linked to data from the UK Renal Registry (UKRR) within the OpenSAFELY-TPP platform, part of the NHS England OpenSAFELY covid-19 service.

PARTICIPANTS: 38 745 prevalent patients (recorded as receiving kidney replacement therapy on 1 January 2020 in UKRR data, or primary or secondary care data) and 10 730 incident patients (starting kidney replacement therapy during 2020), from a population of 19 million people alive and registered with a general practice in England on 1 January 2020.

MAIN OUTCOME MEASURES: Sensitivity and positive predictive values of primary and secondary care code lists for identifying prevalent and incident kidney replacement therapy cohorts compared with the gold standard UKRR data on chronic kidney replacement therapy. Agreement across the data sources overall, and by treatment modality (transplantation or dialysis) and personal characteristics.

RESULTS: Primary and secondary care code lists were sensitive for identifying the UKRR prevalent cohort (91.2% (95% confidence interval (CI) 90.8% to 91.6%) and 92.0% (91.6% to 92.4%), respectively), but not the incident cohort (52.3% (50.3% to 54.3%) and 67.9% (66.1% to 69.7%)). Positive predictive values were low (77.7% (77.2% to 78.2%) for primary care data and 64.7% (64.1% to 65.3%) for secondary care data), particularly for chronic dialysis (53.7% (52.9% to 54.5%) for primary care data and 49.1% (48.0% to 50.2%) for secondary care data). Sensitivity decreased with age and index of multiple deprivation in primary care data, but the opposite was true in secondary care data. Agreement was lower in children, with 30% (295/980) featuring in all three datasets. Half (1165/2315) of the incident patients receiving dialysis in UKRR data had a kidney replacement therapy code in the primary care data within three months of the start date of the kidney replacement therapy. No codes existed whose exclusion would substantially improve the positive predictive value without a decrease in sensitivity.

CONCLUSIONS: Codes used in primary and secondary care data failed to identify a small proportion of prevalent patients receiving kidney replacement therapy. Codes also identified many patients who were not recipients of chronic kidney replacement therapy in UKRR data, particularly dialysis codes. Linkage with UKRR kidney replacement therapy data facilitated more accurate identification of incident and prevalent kidney replacement therapy cohorts for research into this vulnerable population. Poor coding has implications for any patient care (including eligibility for vaccination, resourcing, and health policy responses in future pandemics) that relies on accurate reporting of kidney replacement therapy in primary and secondary care data.
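
The outcome measures reported above are proportions with 95% confidence intervals. As a rough illustration of how such figures can be computed from counts, the following Python sketch defines sensitivity as true positives over all gold standard (UKRR) positives, and positive predictive value as true positives over all code-list positives. The abstract does not state which interval method the authors used; the Wilson score interval here is an assumption, and the function names (`wilson_ci`, `sensitivity`, `ppv`) are hypothetical helpers, not part of OpenSAFELY. The only study figures used are the two count pairs quoted in the abstract (295/980 and 1165/2315).

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval.

    Assumption: the study's interval method is not given in the abstract;
    Wilson is used here only as a common choice for proportions.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half


def sensitivity(true_pos, false_neg):
    """Share of gold standard (registry) patients also found by the code list."""
    return wilson_ci(true_pos, true_pos + false_neg)


def ppv(true_pos, false_pos):
    """Share of code-list patients confirmed by the gold standard (registry)."""
    return wilson_ci(true_pos, true_pos + false_pos)


# Count pairs quoted in the abstract.
children_all_three = wilson_ci(295, 980)      # children present in all three datasets
dialysis_coded_3m = wilson_ci(1165, 2315)     # incident dialysis coded within three months

for label, (p, lo, hi) in [
    ("Children in all three datasets", children_all_three),
    ("Incident dialysis coded within 3 months", dialysis_coded_3m),
]:
    print(f"{label}: {100 * p:.1f}% ({100 * lo:.1f}% to {100 * hi:.1f}%)")
```

Given a linked dataset, `sensitivity` and `ppv` would take the counts of patients flagged by both sources, by the registry only, and by the code list only; those counts are not reported in the abstract, so none are assumed here.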