Andres Tamm
Health data scientist
I am a health data scientist and researcher applying machine learning methods to electronic health records to improve the understanding of cancer and cancer care. I am supervised by Dr Brian Nicholson as part of the cancer group and by Professor Eva Morris (Nuffield Department of Population Health, Big Data Institute). I am particularly interested in optimising the faecal immunochemical test (FIT) for colorectal cancer referrals from primary to secondary care, understanding variations in patient pathways, and conducting reproducible and scalable research with electronic health records.
I recently submitted my thesis in the EPSRC Centre for Doctoral Training in Health Data Science. I explored whether FIT can be combined with routinely collected data to increase its precision, developed lightweight text processing tools to extract cancer staging scores from free-text clinical reports to facilitate cancer research, and explored methods for automatically clustering patient pathways to study variations in treatment. I also have a highly interdisciplinary background, having previously studied gene technology (BSc) and psychology (BA) in Estonia, and psychological research with a substantial statistics component (MSc) at the University of Edinburgh. I have worked with a variety of datasets, including single-cell RNA sequencing, heart rate time series, and functional magnetic resonance imaging.
When I am not engaged in research, I enjoy dancing, mindfulness, and yoga.
Recent publications
External validation of the COLOFIT colorectal cancer risk prediction model in the Oxford-FIT dataset: the importance of population characteristics and clinically relevant evaluation metrics
Journal article
Tamm A. et al, (2025), BMC Medicine, 23
Supporting cancer research on real-world data: extracting colorectal cancer status and explicitly written TNM stages from free-text imaging and histopathology reports
Journal article
Tamm A. et al, (2025), BMJ Health Care Informatics, 32
Improving the understanding of cancer and cancer care by applying data science and machine learning methods to electronic patient records
Thesis / Dissertation
Tamm A., (2025)
BLOod Test Trend for cancEr Detection (BLOTTED): protocol for an observational and prediction model development study using English primary care electronic health record data
Journal article
Virdee PS. et al, (2023), Diagn Progn Res, 7