ESCAP has organized a series of training webinars on web scraping for consumer price indices from May to September 2024. The webinars have covered the sharing of experiences, the challenges of web scraping, ethical issues in web scraping, and the development and application of ICT tools for web scraping in Python. The training has also included a set of homework exercises, and participants have benefited from the expertise of mentors to enhance their knowledge.
This in-person workshop, held in Bangkok, Thailand on 16-20 September 2024, is a follow-up to the webinars and is the focus of this concept note. The overall aim of the workshop is to strengthen the web scraping skills acquired through the series of webinars undertaken since May 2024.
Objectives and Expected Outcomes
The in-person workshop's overall aim is to embed the knowledge and skills delivered to participants via the webinars held from May through August 2024. By the end of the workshop, participants will:
• Have embedded Python coding knowledge and skills for web scraping price data and applied these to their chosen website.
• Understand how to process and clean web-scraped data, ready for inclusion into the CPI calculation process.
• Have further discussed and explored coding to automate a pipeline for incorporating web-scraped price data into the NSO’s processes.
• Have a deeper appreciation of the methodological challenges of incorporating web-scraped prices into CPI calculations, and how to overcome these.
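As an illustration of the processing and cleaning step named in the outcomes above, the following is a minimal Python sketch and not the workshop's actual material. The record fields (product_id, date, price), the function names parse_price and clean_records, and the treatment of commas as thousands separators are all assumptions made for this example; real scraped data will need rules suited to the website and currency concerned.

```python
import re

def parse_price(raw):
    """Extract a numeric price from a scraped string, or return None.

    Assumption: commas are thousands separators and '.' is the decimal mark.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw or "")
    if match is None:
        return None
    return float(match.group(0).replace(",", ""))

def clean_records(records):
    """Keep one valid, positive price per (product_id, date) pair.

    The field names are hypothetical; adapt them to the scraper's output.
    """
    seen = set()
    cleaned = []
    for rec in records:
        price = parse_price(rec.get("price"))
        key = (rec.get("product_id"), rec.get("date"))
        # Skip unparseable or non-positive prices and repeated observations.
        if price is None or price <= 0 or key in seen:
            continue
        seen.add(key)
        cleaned.append({**rec, "price": price})
    return cleaned

if __name__ == "__main__":
    sample = [
        {"product_id": "rice-1kg", "date": "2024-09-16", "price": "฿1,250.00"},
        {"product_id": "rice-1kg", "date": "2024-09-16", "price": "฿1,250.00"},
        {"product_id": "milk-1l", "date": "2024-09-16", "price": "N/A"},
    ]
    print(clean_records(sample))
```

A cleaning step of this kind would typically sit between the scraper and the CPI calculation, so that duplicated or unparseable observations are filtered out before any index is computed.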