How does the catalogue work?

Published

2025-04-28

Modified

2025-12-04

Researchers often struggle to use open data. They may have difficulty locating open datasets for their research projects. When they do find datasets, it can be hard to understand them well enough to judge which one best suits their research. Finally, they may not know how to work with a dataset, or even how to cite it.

The Price Statistics Open Data Catalogue helps researchers find, assess, and understand how to access open datasets.

How the Price Statistics Open Data Catalogue works in a nutshell

The idea is simple. The catalogue lists open datasets relevant to the discipline and is searchable according to standard tags (more on tags below). It also provides basic information about each dataset that allows researchers to assess its relevance and know how to access it. Visually this is shown in Figure 1.

Figure 1: Basic idea of how we see the data catalogue working.

What are the tags used in the catalogue?

The catalogue uses the following tag structure to help categorize the datasets.

Note: Tags may change

Tags will evolve and change over time, especially as new datasets are registered.

Data type

The first set of tags categorizes each dataset by the common data types used in price statistics.

  • scanner
  • web-scraped
  • administrative
  • field-or-sample

Dataset Topic

The second set of tags focuses on how the data is used within price statistics. These tags correspond most closely to the category of the area being measured, since each category implies the use of specific features in the data and a specific set of methods.

  • electronics-and-appliances
  • housing
  • groceries-and-food
  • fuels

Note: Labels to be expanded over time

As new datasets are catalogued, this list will be expanded.
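To illustrate how searching by tags could work, here is a minimal Python sketch. It is not the catalogue's actual implementation: the `Record` class, the `search` function, and the example records are all hypothetical, and only the tag values are taken from the lists above.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical catalogue record: a title plus its tags."""
    title: str
    data_type: str           # one data type tag, e.g. "scanner"
    topics: FrozenSet[str]   # one or more topic tags, e.g. {"housing"}


def search(records: List[Record],
           data_type: Optional[str] = None,
           topic: Optional[str] = None) -> List[Record]:
    """Return the records that match every tag filter that is given."""
    return [
        r for r in records
        if (data_type is None or r.data_type == data_type)
        and (topic is None or topic in r.topics)
    ]


# Made-up example records, for illustration only.
RECORDS = [
    Record("Example grocery scanner data", "scanner",
           frozenset({"groceries-and-food"})),
    Record("Example rental listings", "web-scraped",
           frozenset({"housing"})),
    Record("Example fuel price survey", "field-or-sample",
           frozenset({"fuels"})),
]

matches = search(RECORDS, data_type="web-scraped", topic="housing")
print([r.title for r in matches])
```

Omitting a filter leaves that tag group unconstrained, so `search(RECORDS, topic="fuels")` returns every fuel-related record regardless of data type.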

Where is the standalone data catalogue?

The standalone data catalogue can be found here. It is a static site hosted on a separate GitHub repository.

What does the catalogue not do?

The catalogue does not store the dataset itself or make decisions on key aspects of the dataset; it simply describes the dataset in detail. In other words, this catalogue is not a data repository. Catalogue records point to wherever the dataset lives. If more than one version is available, the record references the version that is easiest for researchers to use.

Note: This is an interim catalogue only!

This will likely not be the long-term stable data catalogue used in the discipline. The idea, however, is to start with this interim (and very simple, open-source) catalogue while the project investigates a more viable longer-term solution.
