Description
Building Semantic SDMX for AI-ready Statistics and Interoperability: Challenges, Achievements, Prospects
A major obstacle to interoperability is the insufficient development of standards for the exchange of Linked Open Statistical Data (LOSD). Current data exchange standards, including SDMX, SIMS, DDI and others, were developed within an object-oriented paradigm, which is inherently incompatible with the principles of the Semantic Web.
The open non-profit Interoperability Basis initiative focuses on addressing this challenge through the semantic transformation of existing standards, starting with SDMX. The SDMX 2.1 Glossary and SDMX code lists are already available both through the interfaces of the Interoperability Basis Platform (IoBP) and as semantic models. In addition, the SDMX standard documentation is enriched with hypertext markup that links document content to the corresponding glossary terms. Persistent URIs provide access to these resources in both machine-readable and human-readable formats. This enables the application of FAIR principles to SDMX resources and thereby supports the formation of a sustainable SDMX semantic layer.
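As an illustration of how a single persistent URI can serve both human and machine readers, the following minimal sketch prepares HTTP requests that differ only in their Accept header (content negotiation). The URI is a hypothetical placeholder, and the media types listed are assumptions; this abstract does not state which URI scheme or formats the IoBP actually uses.

```python
from urllib.request import Request

# Hypothetical persistent URI for a glossary term (placeholder, not a real IoBP address).
TERM_URI = "https://example.org/sdmx/glossary/code-list"

# Media types a client might ask for. Which of these a given platform serves
# is an assumption made for this sketch.
FORMATS = {
    "human": "text/html",
    "turtle": "text/turtle",
    "jsonld": "application/ld+json",
}


def build_request(uri: str, fmt: str) -> Request:
    """Prepare (but do not send) a GET request asking for one representation."""
    return Request(uri, headers={"Accept": FORMATS[fmt]})


if __name__ == "__main__":
    # Same URI every time; only the requested representation changes.
    for fmt in FORMATS:
        req = build_request(TERM_URI, fmt)
        print(fmt, req.full_url, req.get_header("Accept"))
```

The requests are only constructed, not sent, so the sketch runs without network access. Passing one of them to `urllib.request.urlopen` would perform the actual lookup against a real server.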
The IoBP provides AI-based tools to support terminology localization while preserving expert review and validation. The Semantic R&D Group, in collaboration with the Statistical Committee of the Commonwealth of Independent States (CISStat) and the Statistical Office of the Republic of Serbia (SORS), is carrying out the localization of SDMX documentation into Russian and Serbian.
A semantic analysis of SDMX documentation revealed a number of gaps and inconsistencies, which can be addressed through extended international collaboration.
One key priority identified is updating the SDMX Glossary to version 3.1, together with preparing a new release of the SDMX User Guide 3.1.
These activities aim to improve the dissemination of LOSD and rich statistical metadata that can be consistently interpreted and processed by AI-based systems in a transparent and reliable manner.