Description
This contribution addresses the intersection of statistical disclosure control and the special requirements of psychological research. Using examples, we illustrate the particular sensitivity and complexity of empirical data from psychological research, as well as the problems and possibilities involved in anonymizing them.
The replication crisis in psychology (Open Science Collaboration, 2015; Camerer et al., 2018) has highlighted the need for open science practices that prioritize transparency and reproducibility. However, both open and restricted sharing of psychological research data pose significant challenges because the data are inherently sensitive. Psychological datasets frequently contain personal and confidential information, such as assessments of personality traits, personal interest inventories, IQ tests, ability and aptitude tests, attitude scales, and clinical data related to mental health conditions. Preserving the privacy of study participants and ensuring the confidential handling of data are therefore critical aspects of psychological research. Compared with privacy-preserving computation methods, statistical disclosure control methods are among the most cost-effective and efficient ways of protecting the privacy of individuals (Templ & Sariyar, 2022). However, awareness and use of anonymization methods within the field of psychology remain limited.
Furthermore, psychological research frequently employs complex data structures, such as longitudinal datasets that capture changes over time (Arseneault et al., 2023), hierarchical data that encompass nested relationships, and scales that comprise multiple sets of sensitive items. These characteristics complicate effective anonymization, which must balance the utility of the data for scientific inquiry against the privacy of participants. To address these challenges and publish data securely, researchers need methodological expertise in advanced anonymization techniques. The procedures and methods are illustrated using the R package sdcMicro (Templ, Kowarik, & Meindl, 2015).
Keywords: Psychology, Open Science, Anonymization, Statistical Disclosure Control
References
Arseneault, L., Bolivar, M., Bryan, B., Canning, T., Evans, E., Gatera, G., ... Yu, D. (2023). Landscaping international longitudinal datasets: Full report. Retrieved from https://www.landscaping-longitudinal-research.com/what-we-found (Accessed: 26 March 2025)
Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T.-H., Huber, J., Johannesson, M., ... Wu, H. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2(9), 637–644. doi: 10.1038/s41562-018-0399-z
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251). doi: 10.1126/science.aac4716
Templ, M., Kowarik, A., & Meindl, B. (2015). Statistical disclosure control for micro-data using the R package sdcMicro. Journal of Statistical Software, 67(4), 1–36. doi: 10.18637/jss.v067.i04
Templ, M., & Sariyar, M. (2022). A systematic overview on methods to protect sensitive data provided for various analyses. International Journal of Information Security, 21(6), 1233–1246. doi: 10.1007/s10207-022-00607-5