Healthcare AI has a trust problem – and patients with complex conditions are paying the price
Artificial intelligence could transform care for the millions of people in England living with multiple long-term conditions. Large NHS datasets hold decades of patient records. Sophisticated algorithms can spot patterns no clinician could find alone. But a stubborn gap persists between what AI can do in a research lab and what it actually delivers in a GP surgery.
The problem is not a shortage of clever models. It is a shortage of infrastructure – standardised ways to prepare messy real-world data, rigorously compare competing approaches, and deploy tools that clinicians and patients actually trust. Without that infrastructure, promising AI projects stall as pilots, and the people who most need better care see no benefit.
Our approach and partners
Rather than build another standalone AI tool, this project created an end-to-end ecosystem for clinical AI development – from raw data to implementation-ready system.
The work began with data. A high-performance encoding pipeline transformed 6.8 million patient records from the Clinical Practice Research Datalink (CPRD) into analysis-ready formats, harmonising primary care records with linked NHS England hospital, A&E, and mortality data. Eight machine learning model families – from logistic regression to deep neural networks – were then systematically trained and evaluated on this data. A reinforcement learning framework benchmarked these models against each other, replacing ad hoc selection with a principled, automated approach.
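The record-linkage step described above can be sketched in outline as a small Python routine. This is a minimal illustration only: the column names (patient_id, code, admission_date, death_date) and the one-row-per-patient output shape are assumptions for the sketch, not the project's actual CPRD schema or pipeline.

```python
import pandas as pd

def harmonise(primary_care: pd.DataFrame,
              hospital: pd.DataFrame,
              mortality: pd.DataFrame) -> pd.DataFrame:
    """Link event-level primary care records with hospital and
    mortality data on a shared patient identifier, producing one
    analysis-ready row per patient."""
    # Collapse event-level tables to per-patient summaries
    gp = primary_care.groupby("patient_id").agg(
        n_gp_events=("code", "size")).reset_index()
    hosp = hospital.groupby("patient_id").agg(
        n_admissions=("admission_date", "size")).reset_index()
    # Left-join so patients with no admissions or deaths are retained
    out = gp.merge(hosp, on="patient_id", how="left")
    out = out.merge(mortality[["patient_id", "death_date"]],
                    on="patient_id", how="left")
    out["n_admissions"] = out["n_admissions"].fillna(0).astype(int)
    out["died"] = out["death_date"].notna()
    return out

# Tiny synthetic example (three invented patients)
gp = pd.DataFrame({"patient_id": [1, 1, 2], "code": ["C10", "H33", "C10"]})
hosp = pd.DataFrame({"patient_id": [1], "admission_date": ["2020-01-05"]})
mort = pd.DataFrame({"patient_id": [2], "death_date": ["2021-03-01"]})
table = harmonise(gp, hosp, mort)
```

At CPRD scale the real pipeline must also handle code-dictionary mapping, date cleaning, and chunked processing, which this sketch deliberately omits.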
But technical infrastructure alone does not build trust. A four-day Human-Centred AI (HcAI) Design Bootcamp, developed with imec-SMIT (Vrije Universiteit Brussel), Stanford University's Product Realization Lab, and The Alan Turing Institute, brought clinicians, patients, researchers, and designers together to co-create AI solutions grounded in real clinical needs.
Connecting these strands is the Health xAI Implementation Framework – published in iScience – which sets out how to deploy AI across multiple healthcare sites while managing privacy, explainability, and ongoing monitoring.
What we found – and why it matters
- Simpler models often outperformed complex ones. When tested on 2.8 million patient records, classical machine learning – logistic regression for mortality prediction, gradient boosting (XGBoost) for hospitalisation – frequently matched or beat deep learning on tabular clinical data. This has significant implications for where the NHS invests its AI resources.
- Automated model selection works. The reinforcement learning framework reached 75% accuracy in identifying the best-performing model within just 2,000 evaluation rounds. All three bandit algorithms converged on the same model rankings, validating the robustness of the approach.
- Co-design with diverse stakeholders produces usable tools. The HcAI Bootcamp attracted over 260 registrants. Twenty-eight selected participants from 15 different professional backgrounds developed three working prototypes: an LLM-powered health literacy tool providing personalised plain-language explanations; PILLAR, an AI system for precision medication dosing based on kidney function; and an automated patient visit summary tool using speech recognition. Teams that included patient representatives showed greater attention to practical usability.
- Open access lowers the barrier to entry. All resources – the CPRD data encoder, benchmarking code, trained model configurations, and bootcamp design templates – are freely available through the Open Science Framework, enabling other research teams to build on this work without starting from scratch.
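The bandit-based model selection described in the findings can be illustrated with a minimal epsilon-greedy sketch. The candidate "models" and their reward probabilities below are synthetic stand-ins for per-round validation performance, not the project's actual algorithms or results:

```python
import random

def select_model(reward_prob, rounds=2000, epsilon=0.1, seed=0):
    """Treat each candidate model as a bandit arm. Each round, either
    explore a random arm or exploit the current best; the arm with the
    highest empirical mean reward after `rounds` pulls is selected."""
    rng = random.Random(seed)
    arms = list(reward_prob)
    pulls = {a: 0 for a in arms}
    wins = {a: 0 for a in arms}
    for _ in range(rounds):
        if rng.random() < epsilon or not any(pulls.values()):
            arm = rng.choice(arms)  # explore
        else:
            # exploit: arm with best empirical success rate so far
            arm = max(arms, key=lambda a: wins[a] / max(pulls[a], 1))
        pulls[arm] += 1
        # stochastic reward, e.g. "model won this evaluation round"
        wins[arm] += rng.random() < reward_prob[arm]
    return max(arms, key=lambda a: wins[a] / max(pulls[a], 1))

# Hypothetical per-round win probabilities for three model families
models = {"logistic_regression": 0.40, "xgboost": 0.75, "deep_net": 0.55}
best = select_model(models)
```

The project's framework compared three bandit algorithms rather than this single strategy, but the core idea is the same: replace one-off manual comparison with a sequential allocation of evaluation budget that converges on the strongest model.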
What this means
This project provides a reproducible blueprint for moving the NHS from isolated AI experiments to systematic digital transformation. For patients with multiple long-term conditions, it means AI tools designed around their actual needs rather than technical convenience. For the health system, it means an evidence-based approach to selecting, deploying, and governing clinical AI that prioritises safety and equity alongside performance.
The key predictors identified across models were clinically coherent – age, haemoglobin, cancer incidence, inflammation markers, and multimorbidity burden – supporting integration into existing care pathways and building the clinical trust essential for adoption.
What needs to happen next
Three things need to change. First, the tools must move from offline simulation to real-time validation within NHS trusts – the gap between retrospective accuracy and prospective utility remains significant. Second, lighter-weight versions of the pipeline and benchmarking tools are needed for research teams without access to high-performance computing. Third, regulators – particularly MHRA and NICE – need to work with the research community to establish reinforcement learning and human-centred design methods as recognised standards for clinical AI governance.
Without regulatory engagement, even well-validated tools will struggle to move from research output to approved clinical system.
Lead researcher:
Sami Adnan, Nuffield Department of Primary Care Health Sciences, University of Oxford
Contact: sami.adnan@phc.ox.ac.uk
ARC OxTV theme: Novel Methods
Alignment with the 10 Year Health Plan for England:
This work directly supports the analogue to digital shift by creating standardised infrastructure for electronic health record transformation and systematic AI deployment. It also enables the sickness to prevention shift through scalable predictive tools that identify deterioration risk in patients with multiple long-term conditions.
NIHR narrative themes:
- Innovation – A novel end-to-end methodology for clinical AI, from data encoding through reinforcement learning-based model selection to a published implementation framework for multi-site deployment.
- Impact – Capacity building across 15 professional disciplines, with open-access tools enabling reproducible research on multiple long-term conditions using 6.8 million patient records.
- Investment – Open-access infrastructure reduces duplication of effort across NHS AI programmes, and evidence that simpler models can match complex ones offers a more cost-effective path to clinical AI.
Partners:
imec-SMIT (Vrije Universiteit Brussel); Stanford University Product Realization Lab; The Alan Turing Institute
Key resources:
- Health xAI Implementation Framework (iScience, 2025)
- OSF Repository – HcAI Bootcamp methodology and design templates
- OSF Repository – CPRD encoding pipeline
- OSF Repository – model training and RL benchmarking
- OSF Repository – implementation framework
- Pre-print – HcAI Bootcamp study
- HcAI Design Bootcamp website
- CPRD approved study protocol
What continues beyond ARC funding:
All tools, frameworks, and trained models are open-access, enabling continued use and adaptation. The bootcamp model is designed for replication with future cohorts, and the established international partnerships will continue to support collaborative development.