nso-stats-fetcher

Comparing inflation data publishing across countries

Disclaimer: this is a draft worknote

We looked at how countries publish statistical data, and how that impacts people who need to work across multiple countries’ datasets. To highlight the issue, in this document we look at inflation data from different countries.

Below is a graph of inflation over time in Argentina, Ireland, Japan, Mexico, Nigeria, the Philippines, the UK and South Africa (also as an interactive plot). It shows the year-on-year change in the monthly consumer price index for each of these countries. The underlying data for this graph is available in standardised, simple CSVs for each country.

Inflation in different countries over time

However, the original data we collected from national statistical offices was stored in many styles and formats. It was often hard to find the data we wanted, and it then took more time to clean the data and get it into a standardised, machine-readable format. We encountered hidden JSON files, screenshots of Excel tables, tables inside PDFs and other formats. Every national statistics website seemed to have its own specific approach to publishing open data. We also tried other countries but couldn’t find the specific inflation data we wanted.

This is a good example of how the lack of consistency and interoperability between similar datasets makes the jobs of people like data journalists, researchers and fact checkers unnecessarily hard. Each individual site does a fairly good job of publishing the data, but learning how each national statistics office publishes, and then collating and comparing the results, remains very time consuming.

Two reasons we collected this data

Full Fact are developing a robo-checking tool, which automatically fact-checks certain claims. One of the topics it fact-checks is inflation. To say whether a claim is true or not, it needs a clean, reliable set of inflation data. So we created code which fetches and standardises this data from multiple countries and puts it all in one location within a GitHub repo.
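To give a feel for what "standardising" involves, here is a minimal Python sketch. It is not the code in this repo: the raw rows, the column names `month` and `observation`, and the list of date formats are all assumptions made for illustration, and the numbers are invented.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw rows, imitating the mix of date and number styles
# found across different sources. These are not real published figures.
RAW_ROWS = [
    ("2021-03", "3.1"),
    ("Apr 2021", "3,4"),      # decimal comma
    ("05/2021", " 3.6 % "),   # stray whitespace and percent sign
]

# Month formats to try, in order. Extend this list per source.
MONTH_FORMATS = ("%Y-%m", "%b %Y", "%B %Y", "%m/%Y")


def parse_month(text):
    """Return the month as ISO 'YYYY-MM', or raise ValueError."""
    cleaned = text.strip()
    for fmt in MONTH_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y-%m")
        except ValueError:
            continue
    raise ValueError(f"unrecognised month: {text!r}")


def parse_rate(text):
    """Return a percentage as a float, tolerating '%' and decimal commas."""
    return float(text.replace("%", "").replace(",", ".").strip())


def standardise(rows):
    """Return CSV text with a 'month,observation' header, sorted by month."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["month", "observation"])
    for month, rate in sorted((parse_month(m), parse_rate(r)) for m, r in rows):
        writer.writerow([month, rate])
    return out.getvalue()


print(standardise(RAW_ROWS))
```

The month-name formats rely on English month names, and the decimal-comma handling would break on values that use commas as thousands separators, so a real fetcher needs per-source rules rather than one shared parser.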

More broadly, at the Open Data Institute, we want a world where data works for everyone. National statistics are extremely important open data. When national statistics are published better, more people can use them for better insights and decisions. We hope this piece adds to that discussion.

Defining national statistics

In all countries, there are many organisations that publish national statistics data. These include government departments, research institutes, health services, survey companies and international groups. Together, the statistics these organisations publish make up the national statistical system.

One organisation usually operates as the main hub for national statistical data in a country. This organisation is known as the national statistical office (NSO).

NSOs publish statistical data on topics like health, the economy, education and housing. People in the public and private sectors use this data to observe what is happening in the country and to plan ahead. There are NSOs in almost every country on earth. Nearly every country has one main NSO, but in some, such as the USA, the role is split across multiple organisations.

Defining inflation data

There are other places much more qualified than here to define inflation. But, in short, there are a few types of measure. The Consumer Price Index (CPI) tracks the weighted average price of a typical basket of consumer goods and services. CPIH is another, used by the UK’s ONS, which also includes housing costs. There is also the Producer Price Index, which measures the prices domestic producers receive for their output. And there’s the Retail Price Index, which measures the price of retail goods and services.
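As a rough illustration of the "weighted average of a basket" idea, the Python sketch below computes a simple fixed-weight index. The basket, weights and prices are invented, and real NSO methodologies are considerably more involved, so treat this as a toy example rather than how any country calculates its CPI.

```python
# Hypothetical basket: category -> (expenditure weight, base price, current price).
# All numbers are invented for illustration.
BASKET = {
    "food":      (0.5, 100.0, 110.0),
    "transport": (0.3, 50.0, 50.0),
    "clothing":  (0.2, 20.0, 25.0),
}


def price_index(basket):
    """Weighted average of price relatives, scaled so the base period is 100."""
    total_weight = sum(w for w, _, _ in basket.values())
    weighted = sum(w * (current / base) for w, base, current in basket.values())
    return 100.0 * weighted / total_weight


# Food rose 10%, transport was flat and clothing rose 25%,
# so the index moves from 100 to about 110.
print(round(price_index(BASKET), 1))
```

Because food carries half the weight, its 10% rise contributes as much to the index as the 25% rise in clothing, which carries a fifth of the weight.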

When you see “inflation” in the news, it usually refers to CPI. This makes CPI the most important measure for fact checking, so we focus on how we got the CPI for each country.
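The headline figure is usually the year-on-year percentage change in the index, which some sources publish directly and others leave you to derive. A minimal Python sketch of that calculation follows. The index values are invented and the `YYYY-MM` key format is our assumption, not something every NSO uses.

```python
def year_on_year(index_by_month):
    """Map 'YYYY-MM' -> percentage change against the same month a year earlier."""
    rates = {}
    for month, value in index_by_month.items():
        year, mm = month.split("-")
        previous = index_by_month.get(f"{int(year) - 1}-{mm}")
        if previous is not None:  # skip months with no comparison point
            rates[month] = round(100.0 * (value / previous - 1.0), 1)
    return rates


# Hypothetical index values, not real published data.
INDEX = {"2020-01": 100.0, "2020-02": 100.5, "2021-01": 103.0, "2021-02": 104.52}

print(year_on_year(INDEX))
```

Months in the first year of a series have no earlier value to compare against, so they are left out of the result rather than reported as zero.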

We should mention that we are not judging how these inflation measures are calculated or which country’s measure is the best. We are focused on how these numbers appear on NSO websites, and whether they are easy to access and use.

Inflation data for each country

These are the steps we undertook to get the data from each country’s NSO website. It’s quite detailed but our aim here is to emphasise how varied and sometimes complex it can be to get this data from NSOs.

Argentina

Ireland

Japan

Mexico

Nigeria

Philippines

South Africa

United Kingdom

Countries we couldn’t find the data for

India

Indonesia

Pakistan

Improving national statistics publishing

All NSOs publish statistical data about their country. But the quantity and quality of data varies greatly between them. This is very understandable as every country has different finances, resources and society.

However, there exist good practices and standards in open data publishing that every NSO, no matter the size of budget, can try to achieve. We’re not saying every NSO needs to build large data platforms, but simple, easy-to-use data formats exist which can really help data users.

Open principles for data publishing are partly about following open standards and partly about thinking how the data can best be designed for other people to reuse. The Office for National Statistics in the UK produced a set of principles for what this can practically mean. In these they outline a need to consider publishing information so that it performs well on other sites and services. This is similar in many ways to the use of the ClaimReview format in the fact checking world. For statistics this might be about how well it displays in the Google dataset explorer or in search results. This wider theme of making data part of the web is a key component of making data available in ways that support the processes of fact checking: making the data easier to access, always presenting it in context and designing systems with reuse at the core.
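One common way to help a dataset page perform well on other sites and services is to embed machine-readable metadata using the schema.org `Dataset` vocabulary. We are not saying any particular NSO does this. The Python sketch below only shows what such markup might look like, and every name and URL in it is a placeholder.

```python
import json

# All names and URLs below are placeholders, not a real NSO's metadata.
DATASET = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Consumer price index, monthly, year-on-year change",
    "description": "Monthly CPI inflation rate from an example statistics office.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "temporalCoverage": "2015-01/2021-12",
    "distribution": [
        {
            "@type": "DataDownload",
            "encodingFormat": "text/csv",
            "contentUrl": "https://example.org/data/cpi.csv",
        }
    ],
}


def to_jsonld_script(metadata):
    """Wrap metadata in the script tag used to embed JSON-LD in an HTML page."""
    body = json.dumps(metadata, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_jsonld_script(DATASET))
```

The `distribution` entry is the part that matters most to people reusing the data, since it points straight at a downloadable CSV instead of leaving them to hunt for it on the page.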

For more on this topic, please see our guide on open data publishing, which contains links to lots of excellent tools and advice. Also, see our guide on National Statistics for Fact Checking.