15–17 Oct 2025
Poblenou Campus Auditorium
Europe/Zurich timezone

Is it really private if you can’t explain it? A practical framework for productionalising legally-compliant synthetic data in government

17 Oct 2025, 10:20
14m
In-Person
Poblenou Campus Auditorium, Barcelona, Spain

Roc Boronat, 138, 08018 Barcelona

Speaker

Owen Daniel (Office for National Statistics)

Description

Synthetic data is often hailed as the future of safe data access, but in practice it is not enough for a method to be mathematically private and analytically useful: if legal and privacy teams do not understand the guarantees, they cannot confidently approve its use. This creates a critical but underexplored tension between cutting-edge privacy techniques and a real-world operational requirement: explainability.

The UK’s Office for National Statistics (ONS) have overcome this challenge and present a generalisable framework for productionalising high-fidelity, privacy-preserving synthetic data, designed to meet the UK’s legal and regulatory standards – including UK GDPR and the Statistics and Registration Service Act 2007 – while remaining explainable to non-technical stakeholders.

The framework is built around an adapted version of the MST (Maximum Spanning Tree) method, a state-of-the-art approach to differentially private data synthesis. We demonstrate how we made technical adaptations to MST to allow stakeholder involvement from the outset, and how we reframed key parameter choices such as $(\epsilon,\, \delta)$ in terms of familiar disclosure controls such as cell suppression.
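For readers unfamiliar with the parameters, the standard definition of $(\epsilon,\, \delta)$-differential privacy is restated here. The symbols $M$ (a randomised mechanism), $D$ and $D'$ (datasets differing in one record) and $S$ (a set of possible outputs) are notation introduced for this note and do not come from the abstract.

```latex
% M is (\epsilon, \delta)-differentially private if, for all neighbouring
% datasets D, D' and all measurable output sets S:
$$
\Pr[M(D) \in S] \;\le\; e^{\epsilon}\, \Pr[M(D') \in S] + \delta .
$$
```

Smaller $\epsilon$ forces the output distributions on $D$ and $D'$ closer together, and $\delta$ bounds the probability mass on which that multiplicative guarantee may fail; setting $\delta = 0$ gives pure $\epsilon$-differential privacy.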

We illustrate the framework by generating an $\epsilon = 1$ differentially private synthetic linked census and death register data set. We provide robust measures of the data's utility, alongside insights into how the ONS are now using this data to enable cross-government data sharing.
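To give a rough feel for what a budget of $\epsilon = 1$ means, the sketch below applies the textbook Laplace mechanism to a single count. It is not the adapted MST method from the talk, which spends its budget across many measurements of the data; the function names and the example count of 120 are hypothetical and chosen only for illustration.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    # Assumes sensitivity 1: adding or removing one record changes
    # the count by at most 1, so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def noise_bound(epsilon, beta):
    # For Laplace noise with scale b, P(|noise| > t) = exp(-t / b),
    # so the noise exceeds t = ln(1 / beta) / epsilon with probability beta.
    return math.log(1.0 / beta) / epsilon


rng = random.Random(0)
epsilon = 1.0  # the budget quoted in the abstract, here spent on one count only

released = noisy_count(120, epsilon, rng)  # 120 is a made-up cell count
bound_95 = noise_bound(epsilon, 0.05)

print(f"released count: {released:.2f}")
print(f"95% of noise draws fall within +/- {bound_95:.2f}")
```

Expressing the noise as a magnitude that a count is unlikely to exceed is one possible way to relate $\epsilon$ to threshold-style disclosure rules; whether this matches the reframing presented in the talk is not stated in the abstract.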

Author

Owen Daniel (Office for National Statistics)