Caroline Morton

Posts tagged: Health Data

Is your Synthetic Data actually private?

A practical guide to the three privacy risks in synthetic data, the metrics that quantify them, and why no single number tells you whether your data is safe.

Representativeness in Synthetic Data: What It Means and How to Measure It

Understanding the concept of representativeness in synthetic data and the methods used to measure it.

Your Errors Are Data Too

How Rust's error handling patterns let you treat errors as structured observations about your data - capturing context, categorising failures, and producing data quality reports as first-class pipeline outputs.

Why Use Newtypes? Encoding Domain Knowledge in the Type System

How Rust's newtype pattern lets you encode domain knowledge - valid ranges, clinical thresholds, meaningful operations - directly into the type system, so the compiler enforces what you already know to be true about your data.

How Synthetic Data Is Used in Healthcare, Research and Beyond

Explore real-world use cases for synthetic data in healthcare, clinical trials, finance and more.

How to Create a Codelist

A practical guide to creating codelists from scratch for health data research.

What is a Codelist?

What are codelists and why do they matter? An accessible introduction to coding systems, clinical nuance, and research reproducibility.

What are GANs and how can they generate synthetic data?

This post explores Generative Adversarial Networks (GANs) and how they can be used to generate synthetic healthcare data.

Clinic to Code to Care

This post came out of a talk Steph Jones and I gave at Women in Data and AI in October 2025. It follows information from a patient in clinic, through how that information is coded for research, to the statistical and machine learning models it ultimately informs to help improve patient care.