Dataset

polish-rice-sugar-milk
Open Data Contract Standard v3.1.0
scanner groceries-and-food

Fundamentals

Basic information about the data contract

Name
Polish scanner rice, sugar and milk products
Version
1.0
Purpose
This is a collection of scanner data on sales of rice, sugar and milk products in one of the Polish supermarkets, covering the period from December 2024 to January 2026. The data were collected by Statistics Poland.

Note: When opening the dataset, skip the index column and use ";" as the separator. For example, in Python: pd.read_csv("https://zenodo.org/records/18342253/files/dataRSM.csv?download=1", sep=";", index_col=False)
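
The call in the note above leaves every column as text, including prices written with a decimal comma (for example '3,05'). A minimal sketch of a typed load follows. It assumes pandas is installed and uses a one-row in-memory sample built from the example values in the schema. The helper name load_scanner_data and the sample layout are illustrative assumptions, not part of the dataset.

```python
import io

import pandas as pd

# Hypothetical one-row sample; the values are the examples listed in the schema.
SAMPLE = (
    "time;prices;quantities;retID;description;retailer_code;EAN_code;category;subcategory\n"
    "2024-12-21;3,05;243;02-936;ryż biały 4x100g pp;246359;5904215121565;RICE;white rice\n"
)


def load_scanner_data(source):
    """Read the semicolon-separated scanner file with typed columns."""
    return pd.read_csv(
        source,
        sep=";",
        index_col=False,
        decimal=",",  # prices use a decimal comma, e.g. '3,05'
        parse_dates=["time"],  # Year-Month-Day strings become timestamps
        # Keep identifiers as text so codes such as '02-936' stay intact.
        dtype={"retID": str, "retailer_code": str, "EAN_code": str},
    )


df = load_scanner_data(io.StringIO(SAMPLE))
print(df.dtypes)
```

Passing the Zenodo URL from the note instead of the in-memory sample should work the same way, provided the published file matches the column layout assumed here.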
Usage
Open dataset, shared under "Creative Commons Attribution 4.0 International" licence on Zenodo.

How to cite
APA style:
Białek, J. (2026). A real scanner data set on sold rice, sugar and milk products (Version 1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.18342253

Chicago style:
Białek, Jacek. "A Real Scanner Data Set on Sold Rice, Sugar and Milk Products". Zenodo, January 21, 2026. https://doi.org/10.5281/zenodo.18342253.


Entity Relationship Diagram

Visual representation of data model relationships

erDiagram
    "**Dataset**" {
        time object
        prices object
        quantities object
        retID object
        description object
        retailer_code object
        EAN_code object
        category object
        subcategory object
    }

Schema

The data schema and structure

Dataset

| Property | Business Name | Type | Required | Description |
| --- | --- | --- | --- | --- |
| time | Transaction date | object | No | Dates of transactions (Year-Month-Day). Example: '2024-12-21' |
| prices | Unit price | object | No | Prices (unit values) of sold products [PLN]. Example: '3,05' |
| quantities | Quantity sold | object | No | Quantities of sold products [items]. Example: '243' |
| retID | Retailer ID | object | No | Unique codes identifying outlets/retailer sale points (data set contains 4 different retIDs). Example: '02-936' |
| description | Product description | object | No | Descriptions (labels) of sold products (data set contains 152 different descriptions in Polish). Example: 'ryż biały 4x100g pp' |
| retailer_code | Product code | object | No | Retailer codes for product definition (134 retailer codes). Example: '246359' |
| EAN_code | EAN code | object | No | EAN codes (bar codes) for product definition (138 EAN codes). Example: '5904215121565' |
| category | Product category | object | No | Product categories at the 6-digit COICOP level (4 categories in English). Example: 'RICE' |
| subcategory | Product subcategory | object | No | Product subcategories from 7-digit COICOP level (11 subcategories in English). Example: 'white rice' |
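
The schema descriptions state how many distinct values several columns contain (4 retIDs, 152 descriptions, 134 retailer codes, 138 EAN codes, 4 categories, 11 subcategories). A minimal sketch of a sanity check against those counts is shown below. It uses only the standard library and accepts any iterable of row dictionaries. The names EXPECTED_DISTINCT and distinct_count_mismatches are illustrative assumptions and not part of the contract or of any tool.

```python
# Distinct-value counts as stated in the schema descriptions.
EXPECTED_DISTINCT = {
    "retID": 4,
    "description": 152,
    "retailer_code": 134,
    "EAN_code": 138,
    "category": 4,
    "subcategory": 11,
}


def distinct_count_mismatches(rows, expected=EXPECTED_DISTINCT):
    """Return {column: (expected, observed)} for columns whose distinct count differs."""
    seen = {column: set() for column in expected}
    for row in rows:
        for column in expected:
            seen[column].add(row[column])
    return {
        column: (count, len(seen[column]))
        for column, count in expected.items()
        if len(seen[column]) != count
    }


# Tiny illustration with a reduced expectation: one retID observed, two expected.
rows = [{"retID": "02-936"}, {"retID": "02-936"}]
print(distinct_count_mismatches(rows, expected={"retID": 2}))
```

For the real file, the rows could come from csv.DictReader(f, delimiter=";") opened on a local copy of dataRSM.csv. An empty result would mean the observed counts agree with the documented ones.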
Created at 24 Feb 2026 05:02:50 UTC with Data Contract CLI v0.11.5
version: '1.0'
kind: DataContract
apiVersion: v3.1.0
id: polish-rice-sugar-milk
name: Polish scanner rice, sugar and milk products
tenant: Białek, Jacek (Data manager)
tags:
- scanner
- groceries-and-food
status: |-
  Draft
  Target documentation level: 3
description:
  usage: |
    Open dataset, shared under "Creative Commons Attribution 4.0 International" licence on Zenodo.

    Links:
  purpose: |
    This is a collection of scanner data on the sale of rice, sugar and milk products in one of Polish supermarkets in the period from December 2024 to January 2026. Data was collected by Statistics Poland.

    Note: When opening the dataset, skip the index column and use ";" as a separator. For example:
    Using Python: pd.read_csv("https://zenodo.org/records/18342253/files/dataRSM.csv?download=1", sep=";", index_col=False)
  limitations: |
    APA style:
    Białek, J. (2026). A real scanner data set on sold rice, sugar and milk products (Version 1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.18342253

    Chicago style
    Białek, Jacek. "A Real Scanner Data Set on Sold Rice, Sugar and Milk Products". Zenodo, January 21, 2026. https://doi.org/10.5281/zenodo.18342253.

    Bibtex citation
    Click here for raw form
domain: Consumer Price Statistics
schema:
- name: Dataset
  description: ''
  businessName: ''
  properties:
  - name: time
    description: |
      Dates of transactions (Year-Month-Day).
      Example: '2024-12-21'
    businessName: Transaction date
    logicalType: object
    examples:
    - '2024-12-21'
  - name: prices
    description: |
      Prices (unit values) of sold products [PLN].
      Example: '3,05'
    businessName: Unit price
    logicalType: object
    examples:
    - 3,05
  - name: quantities
    description: |
      Quantities of sold products [items].
      Example: '243'
    businessName: Quantity sold
    logicalType: object
    examples:
    - '243'
  - name: retID
    description: |
      Unique codes identifying outlets/retailer sale points (data set contains 4 different retIDs).
      Example: '02-936'
    businessName: Retailer ID
    logicalType: object
    examples:
    - 02-936
  - name: description
    description: |
      Descriptions (labels) of sold products (data set contains 152 different descriptions in Polish).
      Example: 'ryż biały 4x100g pp'
    businessName: Product description
    logicalType: object
    examples:
    - ryż biały 4x100g pp
  - name: retailer_code
    description: |
      Retailer codes for product definition (134 retailer codes).
      Example: '246359'
    businessName: Product code
    logicalType: object
    examples:
    - '246359'
  - name: EAN_code
    description: |
      EAN codes (bar codes) for product definition (138 EAN codes).
      Example: '5904215121565'
    businessName: EAN code
    logicalType: object
    examples:
    - '5904215121565'
  - name: category
    description: |
      Product categories at the 6-digit COICOP level (4 categories in English).
      Example: 'RICE'
    businessName: Product category
    logicalType: object
    examples:
    - RICE
  - name: subcategory
    description: |
      Product subcategories from 7-digit COICOP level (11 subcategories in English).
      Example: 'white rice'
    businessName: Product subcategory
    logicalType: object
    examples:
    - white rice