Dataset

italian-food-supermarkets
Open Data Contract Standard v3.1.0
web-scraped groceries-and-food

Fundamentals

Basic information about the data contract

Name
Variations of Food Prices in Italian Supermarkets
Version
1.0
Tenant
Sasso et al (2025) https://doi.org/10.5281/zenodo.14927602
Purpose
This web-scraped dataset includes retail prices for meat, fruit, and vegetable products collected from December 2020 to April 2023 (2 years and 4.5 months, or 862 days). The dataset supports analyses of regional price variations, comparisons across product categories, and time-series investigations of price dynamics in the Italian retail food market.
Usage
Open dataset, shared under the "Creative Commons Attribution 4.0 International" licence on Zenodo.

Links:
How to cite
APA style:
Sasso, D., Bacco, L., Palumbo, L., Marcucci, J., Salvini, N., Laureti, T., & Vollero, L. (2025). Variations of Food Prices in Italian Supermarkets (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14927602

Chicago style
Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero. "Variations of Food Prices in Italian Supermarkets". Zenodo, February 25, 2025. https://doi.org/10.5281/zenodo.14927602.


Entity Relationship Diagram

Visual representation of data model relationships

erDiagram
    "**Dataset**" {
        date object
        price number
        product_id integer
        store_id integer
        region object
        product object
        COICOP5 object
        COICOP4 object
    }

Schema

The data schema and structure

Dataset
File Information:
- Format: CSV (.csv), UTF-8 encoded, comma-separated.
- Each row corresponds to one product observation at a specific store on a specific date.
- No missing values are present in the cleaned version.
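As a minimal sketch of how a file with this layout could be read, the Python snippet below parses comma-separated rows with the standard library and casts each field to the type listed in the schema. The header row, the ISO date format, and the inline sample row (assembled from the per-field examples in this contract) are assumptions made for illustration, not facts about the published file.

```python
import csv
import io
from datetime import date
from decimal import Decimal


def read_observations(handle):
    """Yield one typed dict per CSV row (one product, one store, one date)."""
    reader = csv.DictReader(handle)  # comma-separated by default
    for row in reader:
        yield {
            "date": date.fromisoformat(row["date"]),  # assumes YYYY-MM-DD
            "price": Decimal(row["price"]),           # euros, decimal point
            "product_id": int(row["product_id"]),
            "store_id": int(row["store_id"]),
            "region": row["region"],
            "product": row["product"],
            "COICOP5": row["COICOP5"],
            "COICOP4": row["COICOP4"],
        }


# Illustrative row built from the per-field examples in the contract;
# it is not claimed to be an actual record of the dataset.
SAMPLE_CSV = (
    "date,price,product_id,store_id,region,product,COICOP5,COICOP4\n"
    "2020-12-03,1.99,2,2,Calabria,"
    "arance navelina italia calibro 1.5 kg,Oranges,Fruit\n"
)

rows = list(read_observations(io.StringIO(SAMPLE_CSV)))
print(rows[0]["date"], rows[0]["price"], rows[0]["region"])
```

For the real file, the same function could be given `open(path, newline="", encoding="utf-8")`, where `path` is a hypothetical location of the downloaded CSV.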
Property: date
Business Name: Scrape date
Type: object
Required: No
Description: Date of price collection. The dataset covers 2 years and 4.5 months in total (2020-12-03 to 2023-04-14). It is based on 841 distinct web scrapes, with an average of 236 unique products captured each time per scraped store. The number of web offers started to decline by mid-2022.
Example: '2020-12-03'

Property: price
Business Name: Product price
Type: number
Required: No
Description: Retail price in euros (EUR), using a decimal point (.).
Example: '1.99'

Property: product_id
Business Name: Product ID
Type: integer
Required: No
Description: A unique identifier assigned to each product. There are 2,361 unique products in the sample, spread relatively evenly across the COICOP categories.
Example: '2'

Property: store_id
Business Name: Store ID
Type: integer
Required: No
Description: Anonymized unique identifier of the store where the price was recorded. The dataset covers 20 distinct stores.
Example: '2'

Property: region
Business Name: Region
Type: object
Required: No
Description: Italian region where the store is located. The dataset covers 7 distinct regions of Italy.
Example: 'Calabria'

Property: product
Business Name: Product name
Type: object
Required: No
Description: Full commercial name of the product, including quantity or weight.
Example: 'arance navelina italia calibro 1.5 kg'

Property: COICOP5
Business Name: COICOP5 category
Type: object
Required: No
Description: Product classification at the 5-digit level based on the COICOP nomenclature. The dataset covers 24 COICOP5 categories. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords.
Example: 'Oranges'

Property: COICOP4
Business Name: COICOP4 category
Type: object
Required: No
Description: Higher-level COICOP category. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords (tied to COICOP5).
Example: 'Fruit'
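The properties listed above can be turned into simple row-level checks. The sketch below is one possible way to do that in plain Python: it verifies that all eight fields are present and non-empty, that the date falls inside the documented coverage window, and that the numeric fields parse. The function name `check_row` and the problem messages are hypothetical, and the ISO date format is assumed from the example value.

```python
from datetime import date

EXPECTED_COLUMNS = ("date", "price", "product_id", "store_id",
                    "region", "product", "COICOP5", "COICOP4")
COVERAGE_START = date(2020, 12, 3)  # first scrape date stated in the contract
COVERAGE_END = date(2023, 4, 14)    # last scrape date stated in the contract


def check_row(row):
    """Return a list of problems found in one raw row (a dict of strings)."""
    problems = []
    for name in EXPECTED_COLUMNS:
        value = row.get(name)
        if value is None or value.strip() == "":
            problems.append(f"{name}: missing value")
    if problems:
        return problems  # skip type checks when fields are absent

    try:
        day = date.fromisoformat(row["date"])
        if not COVERAGE_START <= day <= COVERAGE_END:
            problems.append("date: outside documented coverage")
    except ValueError:
        problems.append("date: not an ISO date")

    try:
        float(row["price"])
    except ValueError:
        problems.append("price: not a number")

    for name in ("product_id", "store_id"):
        try:
            int(row[name])
        except ValueError:
            problems.append(f"{name}: not an integer")
    return problems


# Length of the documented coverage window, in days.
coverage_days = (COVERAGE_END - COVERAGE_START).days

# Illustrative row built from the per-field examples; not an actual record.
example = {
    "date": "2020-12-03", "price": "1.99", "product_id": "2", "store_id": "2",
    "region": "Calabria", "product": "arance navelina italia calibro 1.5 kg",
    "COICOP5": "Oranges", "COICOP4": "Fruit",
}
print(check_row(example))  # an empty list means no problems were found
print(coverage_days)
```

The computed window length agrees with the 862 days quoted in the Purpose field, which is a quick consistency check on the two coverage dates.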

Team

Team members and their roles

Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero
https://doi.org/10.5281/zenodo.14927602
Username Role Date In Date Out Comment
Created at 24 Feb 2026 05:02:51 UTC with Data Contract CLI v0.11.5
version: '1.0'
kind: DataContract
apiVersion: v3.1.0
id: italian-food-supermarkets
name: Variations of Food Prices in Italian Supermarkets
tenant: Sasso et al (2025) https://doi.org/10.5281/zenodo.14927602
tags:
- web-scraped
- groceries-and-food
status: 'Documentation level: 3'
description:
  usage: |
    Open dataset, shared under the "Creative Commons Attribution 4.0 International" licence on Zenodo.

    Links:
  purpose: 'This web-scraped dataset includes retail prices for meat, fruit, and vegetable products collected from December 2020 to April 2023 (2 years and 4.5 months, or 862 days). The dataset supports analyses of regional price variations, comparisons across product categories, and time-series investigations of price dynamics in the Italian retail food market.'
  limitations: |
    APA style:
    Sasso, D., Bacco, L., Palumbo, L., Marcucci, J., Salvini, N., Laureti, T., & Vollero, L. (2025). Variations of Food Prices in Italian Supermarkets (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14927602

    Chicago style
    Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero. "Variations of Food Prices in Italian Supermarkets". Zenodo, February 25, 2025. https://doi.org/10.5281/zenodo.14927602.

    Bibtex citation
    Click here for raw form
domain: Consumer Price Statistics
schema:
- name: Dataset
  description: |
    File Information:
    - Format: CSV (.csv), UTF-8 encoded, comma-separated.
    - Each row corresponds to one product observation at a specific store on a specific date.
    - No missing values are present in the cleaned version.
  businessName: ''
  properties:
  - name: date
    description: |
      Date of price collection. The dataset covers 2 years and 4.5 months in total (2020-12-03 to 2023-04-14). It is based on 841 distinct web scrapes, with an average of 236 unique products captured each time per scraped store. The number of web offers started to decline by mid-2022.
      Example: '2020-12-03'
    businessName: Scrape date
    logicalType: object
    examples:
    - '2020-12-03'
  - name: price
    description: |
      Retail price in euros (EUR), using a decimal point (.).
      Example: '1.99'
    businessName: Product price
    logicalType: number
    examples:
    - '1.99'
  - name: product_id
    description: |
      A unique identifier assigned to each product. There are 2,361 unique products in the sample, spread relatively evenly across the COICOP categories.
      Example: '2'
    businessName: Product ID
    logicalType: integer
    examples:
    - '2'
  - name: store_id
    description: |
      Anonymized unique identifier of the store where the price was recorded. The dataset covers 20 distinct stores.
      Example: '2'
    businessName: Store ID
    logicalType: integer
    examples:
    - '2'
  - name: region
    description: |
      Italian region where the store is located. The dataset covers 7 distinct regions of Italy.
      Example: 'Calabria'
    businessName: Region
    logicalType: object
    examples:
    - Calabria
  - name: product
    description: |
      Full commercial name of the product, including quantity or weight.
      Example: 'arance navelina italia calibro 1.5 kg'
    businessName: Product name
    logicalType: object
    examples:
    - arance navelina italia calibro 1.5 kg
  - name: COICOP5
    description: |
      Product classification at the 5-digit level based on the COICOP nomenclature. The dataset covers 24 COICOP5 categories. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords.
      Example: 'Oranges'
    businessName: COICOP5 category
    logicalType: object
    examples:
    - Oranges
  - name: COICOP4
    description: |
      Higher-level COICOP category. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords (tied to COICOP5).
      Example: 'Fruit'
    businessName: COICOP4 category
    logicalType: object
    examples:
    - Fruit
team:
  name: Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero
  description: https://doi.org/10.5281/zenodo.14927602