Expert Meeting on Statistical Data Confidentiality

Name: Expert Meeting on Statistical Data Confidentiality
Start: 2025-10-15T08:45:00+02:00
End: 2025-10-17T16:00:00+02:00
Location: Poblenou Campus Auditorium

15–17 Oct 2025

Europe/Zurich timezone

Chris Jones

jonesc@un.org

Machine learning methods to detect Correct Perturbation

16 Oct 2025, 14:35

14m

In-Person

Poblenou Campus Auditorium, Barcelona, Spain

Roc Boronat, 138 08018 Barcelona

Machine Learning and Artificial Intelligence versus Disclosure Control

Iain Dove (Office for National Statistics)

Trusted research environments have historically used rounding and thresholding as the recommended disclosure control methods for exports of population data. However, within ONS Trusted Research Environments, perturbation is allowed in combination with thresholding for some datasets. Code has been made available so that researchers can create perturbed outputs using a specific level of noise.
This creates a problem: how can export checkers tell if an output has been correctly perturbed? Even with supporting information showing the raw counts, it is not obvious that a researcher has used the right method and parameters to create the perturbed counts for export.
To this end, machine learning methods were trialled on a set of synthetic training data (n=5000). Training data was created using perturbation code so that the datasets would resemble genuine exports. Five different types were produced; 50% were generated with the ‘correct’ method and parameters. Logistic Regression, XGBoost, Random Forest, K Nearest Neighbours, Naive Bayes and Support Vector Machine models were trained and evaluated.
This paper explores the results and how these models could be applied in the Trusted Research Environment context.
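
The general idea in the abstract can be illustrated with a minimal sketch. It is not the ONS perturbation code or the models from the paper. Everything specific here is an assumption made for illustration: the Gaussian integer noise, the two noise scales (`CORRECT_SCALE`, `WRONG_SCALE`), the two summary features, and the small hand-written logistic regression used in place of a library model. The sketch generates synthetic exports, half with the hypothetical "correct" noise level, and trains a classifier to tell them apart from the raw and perturbed counts.

```python
import math
import random

# Hypothetical noise levels; the real approved parameters are not given in the abstract.
CORRECT_SCALE = 2.0
WRONG_SCALE = 0.5


def make_export(rng, correct, n_cells=40):
    """Return (raw, perturbed) counts for one synthetic export (assumed Gaussian noise)."""
    scale = CORRECT_SCALE if correct else WRONG_SCALE
    raw = [rng.randint(10, 500) for _ in range(n_cells)]
    perturbed = [max(0, r + round(rng.gauss(0, scale))) for r in raw]
    return raw, perturbed


def features(raw, perturbed):
    """Summarise the noise: mean absolute difference and share of unchanged cells."""
    diffs = [p - r for r, p in zip(raw, perturbed)]
    mean_abs = sum(abs(d) for d in diffs) / len(diffs)
    unchanged = sum(1 for d in diffs if d == 0) / len(diffs)
    return [mean_abs, unchanged]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train(X, y, epochs=200, lr=0.1):
    """Fit a logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def make_dataset(rng, n):
    """Build n labelled exports; every second one uses the 'correct' noise (50%)."""
    X, y = [], []
    for i in range(n):
        correct = i % 2 == 0
        X.append(features(*make_export(rng, correct)))
        y.append(1 if correct else 0)
    return X, y


def run_demo(seed=0):
    """Train on synthetic exports and return accuracy on a held-out synthetic set."""
    rng = random.Random(seed)
    X_train, y_train = make_dataset(rng, 400)
    X_test, y_test = make_dataset(rng, 200)
    w, b = train(X_train, y_train)
    hits = sum((predict_proba(w, b, x) >= 0.5) == bool(t) for x, t in zip(X_test, y_test))
    return hits / len(y_test)


if __name__ == "__main__":
    print(f"held-out accuracy: {run_demo():.3f}")
```

The two assumed noise levels differ widely, so this toy problem is far easier than a realistic one; distinguishing near-miss parameters or a different perturbation method would likely need richer features and the kinds of models listed in the abstract.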

Samantha Trace (United Kingdom of Great Britain and Northern Ireland)

UNECE_machine_learning_template_ST.pdf
