Workshop on Supporting Standards 2026

Europe/Zurich
In-Person
Wiesbaden, Germany
Description

The role of standards and architectures in ensuring successful business transformations in times of disruptive change

Workshop on Supporting Standards 2026
  • Saturday 14 February
    • 1
      Advancing Statistical Systems Through Standardisation: The Role of Concept Harmonisation

      Building on core pillars such as centralised data repositories, integrated data-acquisition and statistical compilation processes, and robust governance of reference data and metadata catalogues, Banco de Portugal has been advancing the standardisation and interoperability of its statistical systems.

      This work places particular emphasis on the harmonisation of statistical concepts as a key enabler of this transformation. By establishing a comprehensive and well-governed dictionary of statistical concepts, this approach aims to ensure semantic consistency across domains, support end-to-end statistical processes, and address the diverse requirements of statistical outputs. In doing so, it demonstrates that concept harmonisation is fundamental to achieving a unified, scalable, and sustainable statistical-compilation system.

      Speaker: Ana Raquel Gonçalves (Banco de Portugal)
    • 2
      Architecture for multi-source statistics

      A modernised statistical production will by nature be multi-sourced. Administrative registers and other digital data sources will be combined with traditional or complementary sample surveys to create high-quality statistical output while minimising response burden. Furthermore, many new data sources will be useful for several statistical outputs. This situation calls for an architecture that gives the statistical office control over its data and data quality, and over how data flows both inside the office and to and from other agents in various data ecosystems. Standards such as the Generic Statistical Business Process Model (GSBPM), the Generic Statistical Information Model (GSIM), the Data Documentation Initiative (DDI), Statistical Data and Metadata eXchange (SDMX) and Linked Open Data (LOD) are essential in establishing such an architecture.

      Over the last few years, Statistics Sweden has started implementing such an architecture. This paper describes the fundamental concepts in that architecture and how we have used various standards in building it. The paper describes our data architecture, which uses “steady states” of data throughout the production process; how data management, data access and data sharing are set up to allow for maximum re-use and combination of data sources while preserving security in data management; and how metadata and quality indicators can facilitate both the re-design of statistics to make use of new, emerging data sources and the preparation of data to be “AI ready”.

      Speaker: Johan Erikson (Statistics Sweden)
    • 3
      Building Resilient Statistical FOSS Architectures in Times of Disruptive Change

      The evolving data landscape and the changing requirements of modern information ecosystems demand that National Statistical Organizations (NSOs) transition from rigid "stove-pipe" systems toward standardized, flexible and modular architectures. This business transformation is increasingly powered by Free and Open Source Software (FOSS), which provides the technical and cultural foundation for transparency, efficiency, and international collaboration. The recent adoption of seven fundamental statistical open source software guiding principles by the UNECE Conference of European Statisticians (CES) provides an excellent starting point for further growth in maturity.

      Applied to the field of Generative AI (GenAI) and Machine Learning (ML), the creation of FOSS building blocks that externalize domain knowledge through standardized configuration parameters is still a challenge. Statistical standards can play an important role, but broader collaboration is needed that goes beyond the community of official statistics. Innovation in this field is moving fast, and re-use of the principles in this relatively new application area benefits all. By utilizing truly independent software modules that support core statistical process steps, and by leveraging semantic interoperability through open standards, NSOs can reuse and share data across various contexts. Moreover, statistical quality profits from standardized building blocks built by many together.
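
      As a purely illustrative sketch (all names, parameters and the pandas-based example data are invented here, and no existing FOSS package is implied), a configurable, chainable building block might look like this:

      ```python
      # Illustrative sketch only: a generic, chainable building block whose domain
      # knowledge (which column to check, which plausibility bounds to apply) lives
      # in a configuration object rather than in code. Requires pandas.
      from dataclasses import dataclass
      from typing import Callable
      import pandas as pd

      @dataclass
      class EditRuleConfig:
          column: str    # variable to check
          lower: float   # lower plausibility bound
          upper: float   # upper plausibility bound

      def make_range_edit(cfg: EditRuleConfig) -> Callable[[pd.DataFrame], pd.DataFrame]:
          """Return a reusable step that flags values outside the configured range."""
          def step(df: pd.DataFrame) -> pd.DataFrame:
              out = df.copy()
              out[f"{cfg.column}_flag"] = ~out[cfg.column].between(cfg.lower, cfg.upper)
              return out
          return step

      def chain(*steps):
          """Compose independent building blocks into a small pipeline."""
          def pipeline(df: pd.DataFrame) -> pd.DataFrame:
              for s in steps:
                  df = s(df)
              return df
          return pipeline

      run = chain(make_range_edit(EditRuleConfig("turnover", 0.0, 1_000_000.0)))
      print(run(pd.DataFrame({"turnover": [120.0, -5.0, 3_000_000.0]})))
      ```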

      In this presentation, we explore a number of examples in this field, both existing solutions as well as potential additions to the landscape, with criteria for success in mind, such as favoring generic chainable building blocks and working in the open. We conclude with recommendations for leveraging FOSS architectures, open standards and the power of communities to ensure long-term organizational agility and methodological transparency.

      Speaker: Olav ten Bosch (Statistics Netherlands)
    • 4
      Building Semantic SDMX for AI-ready Statistics and Interoperability: Challenges, Achievements, Prospects

      A major challenge for achieving interoperability lies in the insufficient development of standards for the exchange of Linked Open Statistical Data (LOSD). Current data exchange standards, including SDMX, SIMS, DDI and others, have been developed within an object-oriented paradigm, which is inherently incompatible with the principles of the Semantic Web.

      The open non-profit Interoperability Basis initiative focuses on addressing this challenge through the semantic transformation of existing standards, starting with SDMX. The SDMX 2.1 Glossary and SDMX code lists are already available both through the interfaces of the Interoperability Basis Platform (IoBP) and as semantic models. In addition, the SDMX standard documentation is enriched with hypertext markup that links document content to the corresponding glossary terms. Persistent URIs provide access to these resources in both machine-readable and human-readable formats. This enables the application of FAIR principles to SDMX resources and thereby supports the formation of a sustainable SDMX semantic layer.

      The IoBP provides AI-based tools to support terminology localization while preserving expert review and validation. The Semantic R&D Group, in collaboration with the Statistical Committee of the Commonwealth of Independent States (CISStat) and the Statistical Office of the Republic of Serbia (SORS), is carrying out the localization of SDMX documentation into Russian and Serbian.

      A semantic analysis of SDMX documentation revealed a number of gaps and inconsistencies, which can be addressed through extended international collaboration. One of the key areas identified is to focus efforts on updating the SDMX Glossary to version 3.1, together with the preparation of a new release of the SDMX User Guide 3.1. These activities aim to improve the dissemination of LOSD and rich statistical metadata that can be consistently interpreted and processed by AI-based systems in a transparent and reliable manner.
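
      As an illustration of what such a semantic layer could look like in practice, the following sketch publishes a small SDMX-style code list as SKOS concepts with persistent URIs using the rdflib library; the URIs and labels are placeholders and do not correspond to the IoBP's actual resources:

      ```python
      # Minimal sketch, not the IoBP implementation: an SDMX-style code list
      # expressed as SKOS concepts with persistent (placeholder) URIs so it can be
      # served as Linked Open Statistical Data. Requires the rdflib package.
      from rdflib import Graph, Literal, Namespace, URIRef
      from rdflib.namespace import RDF, SKOS

      BASE = Namespace("https://example.org/codelist/CL_FREQ/")  # placeholder URIs
      scheme = URIRef("https://example.org/codelist/CL_FREQ")

      g = Graph()
      g.add((scheme, RDF.type, SKOS.ConceptScheme))
      g.add((scheme, SKOS.prefLabel, Literal("Frequency", lang="en")))

      for code, label in [("A", "Annual"), ("Q", "Quarterly"), ("M", "Monthly")]:
          concept = BASE[code]
          g.add((concept, RDF.type, SKOS.Concept))
          g.add((concept, SKOS.notation, Literal(code)))
          g.add((concept, SKOS.prefLabel, Literal(label, lang="en")))
          g.add((concept, SKOS.inScheme, scheme))

      print(g.serialize(format="turtle"))  # machine-readable view of the code list
      ```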

      Speaker: Konstantin Laykam (CIS-STAT, YMA Group d.o.o., Slovenia)
    • 5
      DMSS as a Multi-Source Statistics Use Case

      The Decision-Making Support System at Local and National Levels (DMSS LL/NL) is a practical use case demonstrating how multi-source statistics can be effectively leveraged to support evidence-based policymaking and governance across different territorial levels. Developed within the framework of the Statistical Office of the Republic of Serbia (SORS), DMSS combines data from various providers (administrative sources, statistical surveys, and institutional systems) in a harmonized, reusable, and scalable production environment designed to support local and national decision-making.
      This contribution presents key lessons learned from the design and implementation of DMSS, with a particular focus on promoting the use of administrative and multiple data sources, as well as on the role of standardized business and IT architectures in improving data management, interoperability, data governance, and information security. The adoption of common standards, unified metadata frameworks, clearly defined governance models, and security by design principles has enabled controlled data access, ensured data confidentiality, and strengthened trust in multi-source statistical production.
      A central component of DMSS is the development of user-oriented analytical and visualization solutions that transform complex, multi-source datasets into clear, accessible, and actionable insights for decision-makers. These solutions support consistent interpretation of indicators across territorial levels, enabling monitoring, comparison, and evaluation while respecting governance rules and security constraints.
      The paper also discusses how DMSS supports greater organizational flexibility and responsiveness to evolving policy needs and information demands. Flexible and modular system architectures, combined with standardized workflows and reusable components, enable the statistical system to efficiently accommodate new data requirements without compromising quality, security, or governance principles. Practical examples illustrate how methodological standardization and integrated production processes contribute to greater efficiency, continuous improvement, and sustainable modernization of statistical operations, reinforcing the role of official statistics as a reliable foundation for informed policymaking at both national and local levels.

      Speaker: Nebojsa Tolic (Statistical Office of the Republic of Serbia (SORS))
    • 6
      Designing an integrated platform for statistical and geospatial information production

      Faced with rapid technological shifts, a surge in data sources, and rising demands for high-quality information, INEGI has developed a strategic proposal to modernize its statistical and geographical production. This initiative centers on the integration and operationalization of international standards to create a more cohesive production environment.

      At the heart of this strategy is the Information Production Platform (IPP), which takes advantage of GSBPM as its foundational process management framework, complemented by many other key standards such as GSIM, DDI, GSGF, and SDMX. The IPP’s primary goal is to bridge the gap between process design and execution within a shared digital ecosystem, ensuring a seamless flow between data and activities. By adapting Business Process Management (BPM) principles, the platform will systematize workflows, providing clear visibility into roles, inputs, and information outputs.
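
      As a hypothetical illustration (the step names, roles and information objects below are invented and do not describe the actual IPP), a BPM-style, GSBPM-aligned workflow definition that makes roles, inputs and outputs explicit might look like this:

      ```python
      # Illustrative sketch only: a declarative, GSBPM-aligned workflow in which each
      # step names its sub-process, responsible role, inputs and outputs. The
      # identifiers are invented and do not describe the actual IPP.
      from dataclasses import dataclass

      @dataclass(frozen=True)
      class WorkflowStep:
          gsbpm_subprocess: str   # e.g. "5.3 Review and validate"
          role: str               # unit responsible for the step
          inputs: tuple           # information objects consumed
          outputs: tuple          # information objects produced

      WORKFLOW = [
          WorkflowStep("4.3 Run collection", "Collection unit",
                       ("questionnaire",), ("raw responses",)),
          WorkflowStep("5.3 Review and validate", "Subject-matter unit",
                       ("raw responses", "edit rules"), ("validated microdata",)),
          WorkflowStep("6.1 Prepare draft outputs", "Methodology unit",
                       ("validated microdata",), ("draft tables",)),
      ]

      # Simple consistency check: every input must be produced upstream or be an
      # external source, making the flow between activities explicit and verifiable.
      produced = {"questionnaire", "edit rules"}      # declared external inputs
      for step in WORKFLOW:
          missing = set(step.inputs) - produced
          assert not missing, f"{step.gsbpm_subprocess}: undefined inputs {missing}"
          produced |= set(step.outputs)
      print("Workflow inputs and outputs are consistent.")
      ```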

      By standardizing information objects and their metadata, the platform seeks to ensure methodological consistency, foster interoperability, and maximize the reuse of technological assets. Ultimately, the IPP aims to transform international standards into tangible operational capabilities, resulting in a resilient production model that is ready to navigate a fast-changing global landscape.

      Speaker: Raúl Mejía (National Institute of Statistics and Geography (INEGI, México))
    • 7
      From Search to “Talk to Statistics”: SORS’s Trust‑First GenAI Navigator

      Users increasingly expect to “talk to statistics” in plain language, yet official statistics must remain authoritative, confidential, and fully verifiable. The Statistical Office of the Republic of Serbia (SORS) is addressing this challenge by developing and rolling out an AI dissemination chatbot on its official channels (www.stat.gov.rs and data.stat.gov.rs) with a clear trust-first design: public, SDC-protected content only, transparent referencing, and no tolerance for invented indicators or sources.

      Our contribution shares a practical blueprint for aligning GenAI with statistical standards and enterprise architecture, and for deploying it in phases with explicit evaluation gates. The initial release focuses on what brings immediate user value with minimal risk: an “intelligent navigator” that understands questions in Serbian (both scripts) and English, and directs users to the most relevant official tables, indicators and publications, keeping the dissemination system as the single source of truth.

      Subsequent phases, subject to successful evaluation and management approval, extend capabilities toward controlled cross-dimensional exploration and carefully governed numeric summaries that remain fully traceable to published outputs.

      We end by summarizing what it takes to turn GenAI into a high-value user service: reliable quality, secure deployment, and controlled cost and usage. We also identify where standards, metadata discipline, and concept management must evolve to support large-scale adoption.
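
      As a simplified illustration of the trust-first principle (the topics and URLs below are placeholders, not the actual SORS service), a navigator constrained to an allowlist of published resources could be sketched as follows:

      ```python
      # Simplified illustration, not the SORS system: the navigator only ever returns
      # links from an allowlist of published, SDC-protected resources and never
      # produces figures of its own. Topics and URLs are placeholders.
      ALLOWED_SOURCES = {
          "population estimates": "https://data.stat.gov.rs/placeholder-population",
          "consumer price index": "https://data.stat.gov.rs/placeholder-cpi",
      }

      def navigate(question: str) -> dict:
          """Point the user to official tables; never invent indicators or sources."""
          q = question.lower()
          hits = [url for topic, url in ALLOWED_SOURCES.items() if topic in q]
          if not hits:
              return {"answer": "No matching official table found; please rephrase.",
                      "sources": []}
          return {"answer": "The most relevant official tables are listed under 'sources'.",
                  "sources": hits}

      print(navigate("Where can I find the consumer price index?"))
      ```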

      Speaker: Adil Kolaković (Statistical Office of the Republic of Serbia)
    • 8
      Leveraging Generative AI and Metadata Standards to Enhance Researcher Support

      We will present preliminary outcomes of an ongoing project at Statistics Netherlands (CBS), to be carried out between February and June 2026. It is an extension of prior efforts and builds upon an existing first proof of concept. The project explores the potential of Large Language Model (LLM)-based Retrieval-Augmented Generation (RAG) to support researchers in efficiently identifying relevant variables and automatically generating analysis datasets from CBS microdata.

      CBS provides access to detailed microdata (under strict conditions) to researchers from academia and public agencies, enabling new insights in social and economic sciences. However, due to the large and complex nature of the microdata, researchers often face significant challenges in locating relevant variables and assembling datasets. This project aims to address these challenges by exploring how AI and (metadata) standards – such as the Generic Statistical Information Model (GSIM) – can streamline and enhance the data discovery and dataset creation process.

      We plan to present an initial prototype system that demonstrates how LLMs and RAG can be used to support researchers in variable discovery and dataset creation. This system will build on the existing proof of concept and will be designed to leverage CBS metadata standards to improve the structure and organization of metadata for more effective use by LLMs. The prototype will be tested through interactive interfaces, such as chatbots or research assistants, to evaluate the potential of RAG-based systems in supporting researchers throughout the data discovery and dataset assembly process.

      In addition to presenting the prototype, we will share preliminary recommendations for future development and integration of AI tools within CBS's statistical production workflows. We will also provide a GSIM mapping of the relevant metadata elements used in the project, offering a structured reference for future enhancements and standardization efforts.
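
      As a minimal sketch of the retrieval half of such a RAG pipeline (the variable names and descriptions are invented, and TF-IDF stands in for the embedding model an actual prototype would use), variable discovery over metadata could look like this:

      ```python
      # Minimal sketch, not the CBS prototype: the retrieval half of a RAG setup over
      # variable-level metadata. TF-IDF stands in for an LLM embedding model; the
      # retrieved descriptions would then be passed to an LLM as grounding context.
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.metrics.pairwise import cosine_similarity

      VARIABLES = {  # invented variable names and descriptions, not real CBS metadata
          "VAR_INCOME": "Personal income from employment before taxes, annual amount in euros",
          "VAR_VEHICLE": "Indicator of whether the person owns a registered passenger vehicle",
          "VAR_DWELLING": "Assessed value of the dwelling occupied by the household",
      }

      def find_variables(query: str, top_k: int = 2) -> list:
          names = list(VARIABLES)
          matrix = TfidfVectorizer().fit_transform(list(VARIABLES.values()) + [query]).toarray()
          scores = cosine_similarity(matrix[-1:], matrix[:-1]).ravel()
          ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
          return [name for name, _ in ranked[:top_k]]

      print(find_variables("household income and earnings"))
      ```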

      Speaker: Harold Kroeze (Statistics Netherlands (CBS))
    • 9
      Modernising Establishment Statistics at the Central Statistics Office, Ireland: A Three-Pillar Approach

      National Statistical Offices (NSOs) globally are under pressure to deliver more comprehensive and timely statistics, whilst operating in an increasingly complex survey environment. Declining response rates, rising costs, increasing respondent burden and expanding operational workloads are forcing NSOs to re-evaluate traditional approaches and to pursue innovative alternatives.

      In response to these challenges, the Enterprise Statistics Division at the Central Statistics Office (CSO) Ireland is undertaking a holistic reform of business data collection and processing. Under the three pillars of Data Utility, Infrastructure and Processes, the Division is delivering new insights, better technologies and streamlined operational workflows.

      Within this framework, the structure of data and the technological layer of the organisation are treated as interdependent, requiring cohesive development to deliver new processes that can address these challenges.

      This paper will outline how the ‘Data Utility’ pillar delivered streamlined questionnaire designs, developed metadata, and is driving the Division’s transition to multi-source statistical production, significantly reducing operational workload and respondent burden. It presents the critical importance of data utility in the context of the ‘Infrastructure’ pillar, which is now delivering a metadata-driven systems approach to data capture and processing. This places businesses at the centre of the new model and shifts the focus from a siloed ‘survey first’ approach to ‘data first’ production, bolstered by Ireland’s National Data Infrastructure and the CSO’s role as National Data Steward.

      The paper will illustrate how, through coherent development, new processes have already come online. These include the development of data ‘spines’ with a secondary-data-first emphasis, and a new ‘active collection management’ approach to surveying.

      Speaker: Ewan Mullane (Central Statistics Office, Ireland)
    • 10
      Reengineering Data Access Governance in Central Bank Statistics: A Bank Indonesia Standards-Based Architecture for Requests, Auditability, and Monitoring

      Bank Indonesia’s digital transformation increasingly depends on integrated, secure, and policy-responsive data and information management. In this setting, internal data access governance is a pivotal control for ensuring effective data use by Bank Indonesia staff while safeguarding confidentiality, ensuring accountability, and protecting public trust. Yet digitalisation initiatives often fail to deliver meaningful governance improvements when they merely automate legacy procedures. This paper argues that effective digitalisation of access management must be preceded by thorough business process reengineering (BPR), built on a comprehensive understanding of “as-is” processes, their control weaknesses, and operational pain points.

      This paper presents Bank Indonesia’s experience in reengineering and digitalising statistical data access management through the development of the Integrated Digital Data Access Governance System (DATRA). The initiative began with end-to-end process mapping of the existing access lifecycle (request, approval, provisioning, modification, and revocation), followed by a structured gap analysis. Key deficiencies identified included fragmented workflows, limited standardisation of requests, weak audit trails, and constrained monitoring and reporting. These findings were translated into a redesigned “to-be” process that explicitly embeds governance requirements directly into operational steps and control points, including compliance-by-design principles, role-based access control (RBAC), and need-to-know rules.

      DATRA operationalises the reengineered process as an interim, self-service solution aligned with the Bank’s broader data and information governance framework. Leveraging a taxonomy-based data catalogue, entitlement packages, and workflow automation, DATRA standardises access requests, performs rule-based checks of user authority, accelerates provisioning through traceable approvals, and enables timely updates to access-rights records and logs.
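
      As a hypothetical illustration of such rule-based authority checks (the roles, entitlement packages and thresholds are invented and are not DATRA's actual rules), the combination of RBAC and need-to-know validation could be sketched as follows:

      ```python
      # Illustrative sketch only: the kind of rule-based authority check a DATRA-style
      # workflow automates, combining role-based access control with a need-to-know
      # justification. Roles, packages and rules are invented for illustration.
      ENTITLEMENT_PACKAGES = {
          "monetary_statistics_read": {"allowed_roles": {"analyst", "statistician"},
                                       "confidential": False},
          "bank_level_supervisory":   {"allowed_roles": {"supervisor"},
                                       "confidential": True},
      }

      def check_request(role: str, package: str, justification: str):
          rules = ENTITLEMENT_PACKAGES.get(package)
          if rules is None:
              return False, "Unknown entitlement package."
          if role not in rules["allowed_roles"]:
              return False, "Role not authorised for this package (RBAC check failed)."
          if rules["confidential"] and len(justification.strip()) < 20:
              return False, "Confidential data requires a documented need-to-know justification."
          return True, "Request can be routed for approval and provisioning."

      print(check_request("analyst", "bank_level_supervisory", "ad hoc"))
      ```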

      The paper demonstrates that the combined BPR-and-digitalisation approach delivers improvements beyond administrative efficiency. It strengthens internal statistical data governance by enhancing transparency, accountability, and consistency in access control, while establishing a scalable foundation for information security and sustainable digital transformation.

      Speaker: She Asa Handarzeni (Bank Indonesia)
    • 11
      Reusing, sharing, linking and standardising semantically rich metadata at Insee

      Insee has for many years been implementing an ecosystem of repositories dealing with standardised metadata for statistical purposes. The finest level is currently the instance variable and its representation (numeric, text, code list, etc.). These objects serve multiple purposes and are consumed by several internal and external stakeholders:
      - An internal platform to centralise ready-for-use datasets
      - Archival system of data files
      - Generation of codebooks
      - Data structure and documentation of research files made available in a restricted and secured manner
      These objects are currently expressed as DDI 3.3 fragments and managed with the Colectica tool suite.
      In addition, a substantial effort is made to enrich the content of a core set of shared resources and to expand their reuse across instance and represented variables. These are mainly geographic information (lists of regions, municipalities, etc.) and statistical classifications (activities, products, occupations, etc.). All these items follow strict structural conventions (hierarchy between objects) and naming conventions.
      In parallel, work is underway to harmonise and detect similar variables or code lists on the one hand, and to associate variables with the concepts they measure using similarity metrics and embedding techniques on the other. These efforts aim to significantly enhance metadata quality, reduce the number of items feeding search engines and indexers, and streamline their management.

      Next steps are expected to:
      - Extend the documentation of the statistical dataset to all internal data producers
      - Enrich the semantics and the scope of the information made available:
      o associate concepts with variables by completing the variable cascade (conceptual variable, represented variable, instance variable)
      o add harmonised unit types at different granularity levels (physical instances, logical records, variables, etc.)
      o add harmonised sentinel values to the variable representations
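
      As a minimal sketch of the similarity-based detection mentioned above (the code lists are invented; a Jaccard measure over code values is used here, whereas variable-to-concept matching would rely on text embeddings), candidate duplicates could be flagged as follows:

      ```python
      # Minimal sketch, assuming the goal is to flag near-duplicate code lists before
      # harmonisation. The code lists are invented; the embedding-based matching of
      # variables to concepts would apply an analogous similarity over descriptions.
      CODE_LISTS = {
          "CL_REGION_2023": {"11", "24", "27", "28", "32"},
          "CL_REGION_SURVEY_A": {"11", "24", "27", "28"},
          "CL_ACTIVITY": {"A", "B", "C", "F"},
      }

      def jaccard(a: set, b: set) -> float:
          """Share of common codes between two code lists."""
          return len(a & b) / len(a | b) if a | b else 0.0

      names = list(CODE_LISTS)
      for i, x in enumerate(names):
          for y in names[i + 1:]:
              score = jaccard(CODE_LISTS[x], CODE_LISTS[y])
              if score > 0.7:
                  print(f"{x} and {y} are candidates for harmonisation (Jaccard = {score:.2f})")
      ```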

      Speaker: Guillaume Duffes (Insee)
    • 12
      Semantic Croissant, CDIF and AI-Driven Data Annotations

      This talk will introduce the Semantic Croissant ecosystem created around the Croissant for Machine Learning standard, with a focus on ontology alignment with ML and the linkage of metadata to external controlled vocabularies through the Cross-Domain Interoperability Framework (CDIF). It will also highlight how these components support semantic consistency and interoperability across research domains.

      Participants will also be introduced to Nectar Publisher, a human-in-the-loop platform integrated into the Dataverse data repository. The platform enables researchers to combine Large Language Models (LLMs) with knowledge graphs to support AI-assisted data annotation. It facilitates Generative-AI-driven extraction and detailed description of variables, including units of measurement, classes, attributes, and properties, while maintaining human oversight and control.

      The presentation is relevant for stakeholders interested in AI-enabled data management, semantic integration, and the development of interoperable, next-generation research infrastructures.

      Speaker: Slava Tykhonov (CODATA)
    • 13
      Supporting statistical process optimisation through standards integration: The case of the Pacific Community

      The Pacific Community’s Statistics for Development Division (SDD) is implementing a division‑wide business process optimisation programme to modernise statistical production through standards‑aligned process re‑engineering. Core activities, such as mapping the organisational structure to the GSBPM, defining governed data steady states, and redesigning end‑to‑end workflows using BPMN, provide a structured foundation for a consistent, transparent, and service‑oriented production architecture.
      A central enabler of this transformation is the combined use of DDI Codebook and SDMX across the statistical lifecycle. The Pacific Data Hub Microdata Library serves as the authoritative source for DDI‑documented microdata, which is processed by the Pacific Community on behalf of the National Statistical Offices of the region. Statistics are disseminated using the SDMX standard on a .Stat Suite data portal that is also part of the Pacific Data Hub, or on similar portals deployed by member countries and territories. The paper will present approaches for using DDI and SDMX together to establish a robust bridge between microdata and aggregated statistical outputs, covering both statistical data and associated descriptive metadata.
      By embedding standards such as DDI, SDMX and BPMN within redesigned GSBPM‑aligned processes, the Pacific Community is creating an integrated data and metadata ecosystem linking microdata, aggregated statistics, and their explanatory metadata. The Pacific experience illustrates how standards‑based process optimisation enhances efficiency, transparency, and repeatability in times of rapid technological and organisational change and ever‑more‑volatile demand for statistics. Enhancing the AI‑readiness of Pacific informational assets through standard, machine‑interpretable metadata is a key objective in a region with limited analytical capacity.
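
      As a simplified sketch of the microdata-to-aggregate bridge (the column names, codes and weights are invented, assuming a DSD with dimensions REF_AREA, SEX and TIME_PERIOD), the step from DDI-documented unit records to an SDMX-ready dataset could look like this:

      ```python
      # Minimal sketch, assuming survey microdata documented in a DDI codebook and a
      # target DSD with dimensions REF_AREA, SEX, TIME_PERIOD and measure OBS_VALUE.
      # Column names, codes and weights are invented. Requires pandas.
      import pandas as pd

      microdata = pd.DataFrame({        # unit records, as described in the DDI codebook
          "REF_AREA": ["FJ", "FJ", "FJ", "TO"],
          "SEX": ["M", "F", "F", "M"],
          "EMPLOYED": [1, 0, 1, 1],
          "WEIGHT": [120.5, 98.0, 101.2, 87.3],
      })

      # Aggregate weighted counts into a dataset keyed by the DSD dimensions,
      # ready to be written out as SDMX-CSV for dissemination.
      aggregate = (
          microdata.assign(OBS_VALUE=microdata["EMPLOYED"] * microdata["WEIGHT"])
          .groupby(["REF_AREA", "SEX"], as_index=False)["OBS_VALUE"].sum()
          .assign(TIME_PERIOD="2024")
      )
      print(aggregate)
      ```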

      Speaker: Denis Grofils (Pacific Community)
    • 14
      The role of standards in the new Istat metadata system “METAStat”

      Istat is releasing the first functionalities of a new metadata system whose aim is to manage structural and reference metadata together, as well as the semantic component that assigns a clear and unambiguous meaning to statistical metadata. These three assets, i.e. the system modules on statistical process description, data description and semantics, are closely related and guarantee that data are well understood, reliable and easy to access. They are also based on a core of harmonised concepts that only need to be managed once in the system for the whole Institute, thus supporting the reuse of information and interoperability across other internal systems.
      In fact, each module is based on at least one international standard: GSBPM for process description, with elements of the Process group in GSIM as a possible complement and SDMX as a typical output requested by Eurostat; GSIM for concepts and data structure description, together with the need to manage data and metadata dissemination in SDMX; and ISO standards for semantics and terminology, where each term can play a statistical role in data description (as a variable, a classification category, etc.) and hence should interact at least with GSIM.
      Standards compliance has been the foundation for achieving the interoperability goals set by METAstat. In this regard, the system releases metadata to the Italian National Data Catalog, a public administration tool for the interoperability of public administration data. To this end, the relationship between the model in METAstat and ontologies will be sketched.
      The system is also intended to become a tool to monitor and evaluate statistical processes and their quality, producing quality reports according to the ESS standard SIMS and collecting standard quality indicators, such as those defined by Istat to monitor multi-source statistical registers.

      Speaker: Emanuela Recchini (Italian National Institute of Statistics (Istat))
    • 15
      Towards StatOps: Operationalising Official Statistics Production by Reusing and Extending MLOps Practices

      Official statistics is rapidly adopting cloud-based and open-source self-service production platforms, modern programming languages, and more machine learning in production. In parallel, generative AI is becoming a natural part of development and maintenance work. These shifts are often aimed at improving speed, reproducibility, and reuse, but they also increase the need for systematic operational practices that protect quality, trust, and stability.

      Promising developments already point in this direction, including Reproducible Analytical Pipelines (ONS), Principles for Statistical Production (BLS), work on Implementation Frameworks within the ESS AIML4OS programme, and examples of quality requirements for production systems developed at several NSIs, including Statistics Sweden. However, the community still lacks a shared operational-level framing that connects such efforts and supports reuse across organisations.

      This paper proposes starting a coordinated UNECE effort towards “StatOps”, using an MLOps-inspired structure across Principles, Components, Roles, and Architecture, but grounded in end-to-end statistical production. The aim is to capture practical patterns such as version control, automated testing, CI/CD, controlled execution environments, and monitoring, and to make them reusable within the roles and competencies typical of statistical organisations. A further open question is how StatOps should incorporate lessons on where generative AI can augment specialist expertise in statistical production, without weakening methodological responsibility or institutional control.
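
      As an illustration of the automated-testing pattern (the derivation function and test below are invented examples, not an existing NSI implementation), a check that could run in CI before a production step is promoted might look like this:

      ```python
      # Illustrative sketch only: an automated test of the kind a StatOps pipeline
      # would run in CI before promoting a production step. The derivation function
      # and tolerance are invented examples. Requires pandas.
      import pandas as pd

      def derive_regional_totals(microdata: pd.DataFrame) -> pd.DataFrame:
          """Example production step: weighted totals per region."""
          return (microdata.assign(total=microdata["value"] * microdata["weight"])
                  .groupby("region", as_index=False)["total"].sum())

      def test_totals_are_preserved():
          micro = pd.DataFrame({"region": ["N", "N", "S"],
                                "value": [10.0, 20.0, 5.0],
                                "weight": [2.0, 1.0, 4.0]})
          out = derive_regional_totals(micro)
          # The sum over regions must equal the weighted sum over all input records.
          assert abs(out["total"].sum() - (micro["value"] * micro["weight"]).sum()) < 1e-9

      test_totals_are_preserved()
      print("Production step test passed.")
      ```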

      Key words: StatOps, statistical production, MLOps, generative AI, implementation frameworks, GSBPM

      Speaker: Jakob Engdahl (Statistics Sweden and possibly more NSIs contributing)
    • 16
      Towards a Dynamic Quality Framework: Aligning Innovation, AI, and New Data Sources in European Statistics

      In the context of the ESS Innovation Agenda, quality assurance must evolve to meet the challenges posed by new technologies and data sources. This contribution reflects on three challenges shaping that evolution: the integration of Artificial Intelligence (AI) and Machine Learning in statistical processes, the operationalisation of Statistics under Development (SuD), and the growing reliance on non-traditional sources such as privately held data (PHD). These challenges call for a shift towards more dynamic and context-dependent approaches to quality. Rather than applying a fixed model, emerging practices focus on tailoring quality requirements to specific use cases and development stages. Across these areas, we see frameworks that allow for gradual alignment with official quality standards, incorporate quality-by-design principles, and support innovation while maintaining trust.

      Within this transformation, the European Statistics Code of Practice (CoP) and the more operational Quality Assurance Framework (QAF) play an important role. While the CoP structure continues to provide a stable high-level compass, its application increasingly calls for interpretation, flexible implementation, and targeted extensions, particularly when working with AI or with complex processes involving external actors such as private data holders.

      The presentation will highlight common threads across these initiatives and illustrate how standards, principles and architectures can support innovation without compromising the fundamental values of trust, transparency, and accountability. Examples will be drawn from recent ESS work on AI strategy and governance, quality profiling for SuD, and reproducible workflows for one particular type of PHD, namely Mobile Network Operator data.

      Speaker: Jean-Marc Museux (Eurostat Unit 02: Innovation and digital transformation)
    • 17
      Update on UNECE's Statistical Architecture Framework

      Work is underway on a new "Statistical Architecture Framework" under UNECE's Supporting Standards Group. This presentation will give an update on work to date.

      Speaker: Daan Swinkels
    • 18
      Using a Standards-Based Architecture to Power StatGPT and Data Governance at the IMF

      The IMF is modernizing how statistical data and metadata are produced and managed in order to support both AI-enabled access and enterprise data governance. This work is built around two closely related use cases. First, a standardized metadata model mapped to Dublin Core, DCAT, and SDMX Global Data Structure Definitions (DSDs) is used to generate consistent semantic descriptions for the IMF’s SDMX APIs. This enables StatGPT and other AI tools to interpret datasets, indicators, dimensions, sources, and usage conditions directly from metadata rather than from documents or hard-coded rules. Second, the IMF is developing a knowledge-graph-based Data Catalog to manage datasets, lineage, approvals, lifecycle states, and access controls as structured, interlinked objects. This allows StatGPT to understand not only what a dataset contains, but also its provenance, position in the statistical lifecycle, approval status, and release conditions. The presentation shows how a standards-based architecture that brings together GSBPM-based lifecycle management, SDMX artefacts, metadata standards, and knowledge-graph-based governance enables scalable, AI-enabled access to official statistics while strengthening enterprise data governance.
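
      As a minimal sketch of such machine-readable dataset descriptions (the URIs and property values are placeholders and do not reflect the IMF's actual metadata model), a DCAT/Dublin Core record built with rdflib could look like this:

      ```python
      # Minimal sketch, not the IMF's metadata model: a dataset described with DCAT
      # and Dublin Core terms so that tools such as an AI assistant can read its
      # title, publisher and licence from metadata alone. URIs are placeholders.
      from rdflib import Graph, Literal, URIRef
      from rdflib.namespace import DCAT, DCTERMS, RDF

      dataset = URIRef("https://example.org/dataset/external-sector-statistics")

      g = Graph()
      g.add((dataset, RDF.type, DCAT.Dataset))
      g.add((dataset, DCTERMS.title, Literal("External sector statistics (illustrative)", lang="en")))
      g.add((dataset, DCTERMS.publisher, Literal("Example statistical agency")))
      g.add((dataset, DCTERMS.license, URIRef("https://example.org/terms-of-use")))
      g.add((dataset, DCAT.keyword, Literal("balance of payments")))

      print(g.serialize(format="turtle"))  # what an AI tool would read instead of documents
      ```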

      Speaker: Denisa Popescu (IMF)