15–17 Oct 2025
Poblenou Campus Auditorium
Europe/Zurich timezone

Does the Synthesis Model Influence a Subsequent Prediction of the Same Model Type?

15 Oct 2025, 10:00
14m
In-Person
Poblenou Campus Auditorium, Barcelona, Spain

Poblenou Campus Auditorium

Roc Boronat, 138 08018 Barcelona

Speaker

Emma Fössing (Institute for Employment Research, Nuremberg, Germany)

Description

When access to sensitive data is restricted, disseminating synthetic data gives users easy access to data that remain statistically similar to the original data. However, the model used to generate the synthetic data may shape the structure of the data and thereby affect subsequent predictive analyses. This paper empirically investigates whether the choice of synthesis model affects the performance of predictive models trained on synthetic data.
Each synthesis model is used to generate synthetic data, which are then analyzed using predictive models of the same type. We evaluate whether matching the synthesis model and the prediction model in this way influences predictive performance. For example, CART prediction models might perform systematically better on synthetic data generated using CART models than they do on the original data. We evaluate this hypothesis using extensive simulations and real data applications.
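
The comparison described above can be illustrated with a minimal sketch. This is not the authors' code. It is a toy setup under stated assumptions: a one-split decision stump stands in for CART, the data have a single simulated feature, synthetic outcomes are drawn from the observed outcomes in the matching leaf, and both predictors are scored on held-out original data. All function names and parameter values are hypothetical.

```python
import random


def make_data(n, rng):
    """Toy data (assumption): x in [0, 1]; y = 1 with prob. 0.8 if x > 0.5, else 0.2."""
    data = []
    for _ in range(n):
        x = rng.random()
        p = 0.8 if x > 0.5 else 0.2
        data.append((x, 1 if rng.random() < p else 0))
    return data


def fit_stump(data):
    """Fit a one-split tree: pick the threshold on x with the fewest training errors."""
    xs = sorted({x for x, _ in data})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        # Outcomes falling in the left (x <= t) and right (x > t) leaf
        leaves = ([y for x, y in data if x <= t], [y for x, y in data if x > t])
        # Majority label per leaf
        labels = tuple(int(2 * sum(leaf) >= len(leaf)) for leaf in leaves)
        errors = sum(y != lab for leaf, lab in zip(leaves, labels) for y in leaf)
        if best is None or errors < best[0]:
            best = (errors, t, leaves, labels)
    _, t, leaves, labels = best
    return {"threshold": t, "donors": leaves, "labels": labels}


def predict(stump, x):
    """Return the majority label of the leaf that x falls into."""
    return stump["labels"][1 if x > stump["threshold"] else 0]


def synthesize(stump, data, n, rng):
    """Tree-based synthesis: bootstrap x, then draw y from the donors in x's leaf."""
    synthetic = []
    for _ in range(n):
        x = rng.choice(data)[0]
        leaf = 1 if x > stump["threshold"] else 0
        synthetic.append((x, rng.choice(stump["donors"][leaf])))
    return synthetic


def accuracy(stump, data):
    """Share of records whose outcome the stump predicts correctly."""
    return sum(predict(stump, x) == y for x, y in data) / len(data)


rng = random.Random(0)
train = make_data(300, rng)   # "original" data available to the synthesizer
test = make_data(300, rng)    # held-out original data used for scoring

# Synthesis model and prediction model are of the same type (stump and stump).
synthetic = synthesize(fit_stump(train), train, len(train), rng)
acc_orig = accuracy(fit_stump(train), test)
acc_synth = accuracy(fit_stump(synthetic), test)
print(f"trained on original:  {acc_orig:.3f}")
print(f"trained on synthetic: {acc_synth:.3f}")
```

Repeating the run over many seeds, and adding a second synthesizer of a different type, would show whether a gap between the two accuracies is systematic or just simulation noise.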

Authors

Emma Fössing (Institute for Employment Research, Nuremberg, Germany)
Prof. Jörg Drechsler (Institute for Employment Research, Nuremberg, Germany; Ludwig-Maximilians-Universität, Munich, Germany; University of Maryland, College Park, USA)
