Background

The relevance of covert cerebrovascular disease (CCD) in practice is uncertain, partly because estimation of risk in whole clinical populations is difficult. Studies have had success extracting CCD from clinical text using natural language processing (NLP), though they have been limited to specific CCD phenotypes. Here, we used NLP to measure multiple clinically reported CCD phenotypes in a large clinical cohort and estimated subsequent disease risk in health record data.

Methods

From all people with brain imaging in Scotland (2010–2018), we selected people with no prior hospitalisation for neurological disease (n=367 988). NLP of imaging reports identified: white matter hypoattenuation or hyperintensities (WMH), lacunes, cortical infarcts and cerebral atrophy. Adjusted HRs (aHRs) were estimated between each phenotype and stroke, dementia and Parkinson’s disease (conditions previously associated with CCD), epilepsy and colorectal cancer (control conditions).

Results

For each phenotype, the aHR of stroke was WMH 1.4 (95% CI 1.3–1.4), lacunes 1.6 (1.5–1.6), cortical infarct 1.8 (1.7–1.9) and cerebral atrophy 1.1 (1.0–1.1). The aHR of dementia was WMH 1.3 (1.3–1.3), lacunes 1.0 (0.9–1.0), cortical infarct 1.1 (1.1–1.2) and cerebral atrophy 1.7 (1.7–1.8). The aHR of Parkinson’s disease was WMH 1.1 (1.0–1.2), lacunes 1.1 (0.9–1.2), cortical infarct 0.7 (0.6–0.9) and cerebral atrophy 1.4 (1.3–1.5). The aHRs between CCD phenotypes and epilepsy and colorectal cancer were around the null.

Conclusion

CCD and atrophy have implications for future disease risk and can be identified at scale using NLP of clinical reports. Prevention of neurological disease in people with CCD should be a priority for healthcare policy makers.

More information Original publication

DOI

10.1136/jnnp-2025-337689

Type

Journal article

Publisher

BMJ

Publication Date

2026-03-13