Posts tagged: Data
Multiple Imputation and Perturbation: Why They're Not Built for Synthetic Data
This blog explores why multiple imputation and perturbation are not suitable for generating synthetic data.
Clinic to Code to Care
This blog came out of a talk Steph Jones and I gave at Women in Data and AI in October 2025. It follows information from a patient in clinic, through how it is coded for research, to the statistical and machine learning models it ultimately informs to help improve patient care.
What is Synthetic Data and Why Does it Matter?
This blog is the first in a series exploring synthetic data, its benefits, and its applications in various fields.
Finding Similarity with Vector Search: A Beginner's Guide
This blog comes out of an interactive workshop I gave using SurrealDB. It's a beginner's guide to vector search, a modern way to find matches based on multiple preferences at once.
An Introduction to Electronic Health Records
A quick primer on what an electronic health record is and how it is used in clinical practice and research. This post is UK-focussed, but the principles are the same in many other countries.