Description
Theme:
Generative AI (GenAI) and Machine Learning (ML) in statistical production.
Abstract:
This study outlines a practical initiative by the Department of Statistics, Malaysia (DOSM) to develop a responsible, low-risk pathway for exploring Artificial Intelligence (AI) in official statistics. Our goal was to understand how AI, when firmly guided by established statistical standards, could be integrated as a support tool to enhance productivity, internal analysis, and data coherence without compromising the quality of our official outputs.
Our methods centered on a dual-track, standards-first approach. On the first track, we ran controlled exercises in a secure environment, training officers in prompt engineering so they could use Generative AI to draft standard metadata from existing templates. On the second track, we explored basic Machine Learning models for internal analytical support, such as generating forecast scenarios and nowcasts to inform our production planning and quality assessment processes. Importantly, these ML outputs are treated as supplementary analytical aids and not as official statistical releases. Alongside both tracks, we began foundational work to systematically review, clean, and document our core data processes and structures, aligning them with SDMX-like standards to create a reliable basis for future automation.
The exploration produced four key findings. First, AI use significantly reduced the time spent on initial drafting for routine tasks, establishing its role as an assistant. Second, an officer's ability to craft precise, standards-specific prompts is a critical new skill. Third, the pilot use of ML for internal forecasting proved valuable for cross-checking trends and identifying anomalies in ongoing data collection, thereby enhancing our validation processes. Finally, the quality and reliability of all AI and ML outputs were directly tied to the rigor of our input standards and data governance.
In conclusion, this initiative confirms that the immediate value of AI for a National Statistical Office lies in two areas: responsibly accelerating rule-based administrative tasks and providing sophisticated analytical support for internal quality assurance and planning. The initiative underscores that statistical and metadata standards are the non-negotiable foundation for any successful AI integration, ensuring these tools support rather than subvert official processes. DOSM’s experience demonstrates that investing in prompt engineering skills, exploring ML for auxiliary analysis, and strengthening core data governance are essential first steps, providing a replicable model for building the trust and readiness required for the future of statistical production.
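Illustrative sketch:
The abstract does not specify which models were used for the internal forecasts, so the following is only a minimal sketch of the general idea of cross-checking newly collected data against a forecast. It assumes a monthly series and swaps in a simple seasonal-naive forecast (the value from the same month one year earlier); the function name `cross_check`, the `season_length` default, and the `threshold` of 3 standard deviations are all hypothetical choices for illustration, not DOSM's actual method.

```python
from statistics import mean, stdev


def cross_check(history, observed, season_length=12, threshold=3.0):
    """Compare a newly collected value with a seasonal-naive forecast.

    Hypothetical helper: the forecast is the value one season earlier,
    and the new value is flagged when its forecast error is unusually
    far from the errors this forecast made in the past.
    Returns (forecast, score, is_anomaly).
    """
    if len(history) < 2 * season_length:
        raise ValueError("need at least two full seasons of history")

    # Seasonal-naive forecast: same period, one season ago.
    forecast = history[-season_length]

    # Past forecast errors of the same method (year-on-year differences).
    past_errors = [
        history[i] - history[i - season_length]
        for i in range(season_length, len(history))
    ]
    typical_error = mean(past_errors)
    spread = stdev(past_errors)

    error = observed - forecast
    if spread == 0:
        # All past errors identical: any different error is unexpected.
        score = 0.0 if error == typical_error else float("inf")
    else:
        score = (error - typical_error) / spread

    return forecast, score, abs(score) > threshold
```

A flagged value in this sketch is only a prompt for an officer to review the incoming data, in keeping with the treatment of ML outputs as supplementary analytical aids rather than official releases.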