Dataset

italian-food-supermarkets Data Contract Specification v1.1.0
web-scraped-dataset

Info

Information about the dataset

Title
Variations of Food Prices in Italian Supermarkets
Version
1.0
Description
This web-scraped dataset includes retail prices for meat, fruit, and vegetable products collected from December 2020 to April 2023 (about 2 years and 4.5 months, or 862 days).

The dataset supports analyses of regional price variations, comparisons across product categories, and time-series investigations of price dynamics in the Italian retail food market.
Owner
Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero

Terms

Terms and conditions of the dataset

Usage
Open dataset registered to Zenodo. Cite as:
APA:
Sasso, D., Bacco, L., Palumbo, L., Marcucci, J., Salvini, N., Laureti, T., & Vollero, L. (2025). Variations of Food Prices in Italian Supermarkets (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14927602

Chicago:
Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero. "Variations of Food Prices in Italian Supermarkets". Zenodo, February 25, 2025. https://doi.org/10.5281/zenodo.14927602.

BibTeX:
@dataset{sasso_2025_14927602,
  author    = {Sasso, Daniele and
               Bacco, Luca and
               Palumbo, Luigi and
               Marcucci, Juri and
               Salvini, Niccolo and
               Laureti, Tiziana and
               Vollero, Luca},
  title     = {Variations of Food Prices in Italian Supermarkets},
  month     = feb,
  year      = 2025,
  publisher = {Zenodo},
  version   = {1.0},
  doi       = {10.5281/zenodo.14927602},
  url       = {https://doi.org/10.5281/zenodo.14927602},
}
Limitations
None

Entity Relationship Diagram

Visual representation of data model relationships

erDiagram
    "**dataset**" {
        date object
        price float
        product_id integer
        store_id integer
        region object
        product object
        COICOP5 object
        COICOP4 object
    }

Data Model

The logical data model

dataset table
File Information:
- Format: CSV (.csv), UTF-8 encoded, comma-separated.
- Each row corresponds to one product observation at a specific store on a specific date.
- No missing values are present in the cleaned version.
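The file properties above can be turned into a small loading routine. This is a minimal sketch, assuming pandas is installed and that the CSV header matches the column names in this contract; the helper name `load_prices` and the two-row inline sample are illustrative and not part of the dataset's tooling.

```python
import io

import pandas as pd


def load_prices(source):
    """Read the price CSV (hypothetical helper, not shipped with the dataset)."""
    df = pd.read_csv(
        source,
        parse_dates=["date"],  # the contract lists `date` as object; parse it here
        dtype={"price": "float64", "product_id": "int64", "store_id": "int64"},
    )
    # The contract states that the cleaned file contains no missing values.
    if df.isna().any().any():
        raise ValueError("unexpected missing values in the price table")
    return df


# Two rows copied from the contract's example, used in place of the real file.
sample = io.StringIO(
    "date,price,product_id,store_id,region,product,COICOP5,COICOP4\n"
    "2020-12-03,1.99,2,2,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit\n"
    "2020-12-03,2.48,2,3,lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit\n"
)
df = load_prices(sample)
print(df.dtypes)
```

Text columns are left at pandas' default `object` dtype, which matches the types listed in this contract. For the real file, pass its path instead of the in-memory sample.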
date
object
Date of price collection

Total coverage is about 2 years and 4.5 months (2020-12-03 to 2023-04-14).

The dataset is based on 841 distinct web scrapes, with an average of 236 unique products captured per scraped store in each scrape. The number of web offers began to decline by mid-2022.
Example: 2020-12-03
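The stated coverage can be checked with plain date arithmetic. A short sketch using only the Python standard library; it assumes the two endpoints quoted above and counts the difference between them, so the start day itself is not included.

```python
from datetime import date

first_scrape = date(2020, 12, 3)
last_scrape = date(2023, 4, 14)

# Difference between the endpoints, excluding the start day.
span_days = (last_scrape - first_scrape).days
print(span_days)  # 862
```

The result agrees with the 862 days given in the dataset description.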
price
float
Retail price in euros (EUR), using a decimal point (.)
Example: 1.99
product_id
integer
A unique identifier assigned to each product.

There are 2,361 unique products in the sample spread relatively equally across the COICOP categories.
Example: 2
store_id
integer
Anonymized unique identifier of the store where the price was recorded

There are 20 distinct stores covered by the dataset.
Example: 2
region
object
Italian region where the store is located

There are 7 distinct regions of Italy covered by the dataset.
Example: calabria
product
object
Full commercial name of the product, including quantity or weight
Example: arance navelina italia calibro 1.5 kg
COICOP5
object
Product classification at the 5-digit level based on the COICOP nomenclature.

There are 24 COICOP5 categories covered by the dataset. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords.
Example: Oranges
COICOP4
object
Higher-level COICOP category. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords (tied to COICOP5).
Example: Fruit
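Because COICOP4 is described as tied to COICOP5, each COICOP5 label should map to exactly one COICOP4 label. The following is a sketch of how that could be checked, assuming pandas; the function name is hypothetical, and the conflicting row (including the label `Vegetables`) is invented purely to show what a violation looks like.

```python
import pandas as pd


def coicop4_per_coicop5(df: pd.DataFrame) -> pd.Series:
    """Count the distinct COICOP4 labels attached to each COICOP5 label."""
    return df.groupby("COICOP5")["COICOP4"].nunique()


# Two rows modeled on the contract's own example values.
consistent = pd.DataFrame(
    {"COICOP5": ["Oranges", "Oranges"], "COICOP4": ["Fruit", "Fruit"]}
)

# Hypothetical mislabeled row, invented only to demonstrate a conflict.
conflicting = pd.concat(
    [consistent, pd.DataFrame({"COICOP5": ["Oranges"], "COICOP4": ["Vegetables"]})],
    ignore_index=True,
)

counts = coicop4_per_coicop5(conflicting)
ambiguous = counts[counts > 1].index.tolist()
print(ambiguous)
```

On the real table, an empty `ambiguous` list would indicate that the two classification levels are consistent.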

Examples

Examples for models in the dataset

csv
The first 5 rows of the dataset.
date,price,product_id,store_id,region,product,COICOP5,COICOP4
2020-12-03,1.99,2,2,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit
2020-12-03,2.48,2,3,lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit
2020-12-03,2.49,2,4,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit
2020-12-03,1.99,2,5,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit
2020-12-03,2.49,2,8,lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit 
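As an illustration of the regional comparisons this dataset is meant to support, the five example rows can be aggregated by region. A minimal sketch, assuming pandas; the variable names are illustrative, and five rows are far too few to draw any conclusion about actual regional price levels.

```python
import io

import pandas as pd

# The five example rows from this contract, reproduced verbatim.
SAMPLE = """date,price,product_id,store_id,region,product,COICOP5,COICOP4
2020-12-03,1.99,2,2,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit
2020-12-03,2.48,2,3,lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit
2020-12-03,2.49,2,4,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit
2020-12-03,1.99,2,5,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit
2020-12-03,2.49,2,8,lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit
"""

rows = pd.read_csv(io.StringIO(SAMPLE), parse_dates=["date"])

# Mean price of the same product on the same date, grouped by region.
mean_by_region = rows.groupby("region")["price"].mean()
# Number of distinct stores contributing to each regional mean.
stores_by_region = rows.groupby("region")["store_id"].nunique()

print(mean_by_region)
print(stores_by_region)
```

The same grouping extends to time-series work by adding `date` (or a resampled period) to the `groupby` keys.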
Created at 01 Oct 2025 02:51:49 UTC with Data Contract CLI v0.10.35
dataContractSpecification: 1.1.0
id: italian-food-supermarkets
info:
  title: Variations of Food Prices in Italian Supermarkets
  version: '1.0'
  description: |
    This web-scraped dataset includes retail prices for meat, fruit, and vegetable products collected from December 2020 to April 2023 (about 2 years and 4.5 months, or 862 days).

    The dataset supports analyses of regional price variations, comparisons across product categories, and time-series investigations of price dynamics in the Italian retail food market.
  owner: Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero
  contact:
    name: Sasso et al
    url: https://doi.org/10.5281/zenodo.14927602
terms:
  usage: |
    Open dataset registered to Zenodo. Cite as:
    APA:
    Sasso, D., Bacco, L., Palumbo, L., Marcucci, J., Salvini, N., Laureti, T., & Vollero, L. (2025). Variations of Food Prices in Italian Supermarkets (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14927602

    Chicago:
    Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero. "Variations of Food Prices in Italian Supermarkets". Zenodo, February 25, 2025. https://doi.org/10.5281/zenodo.14927602.

    BibTeX:
    @dataset{sasso_2025_14927602,
      author    = {Sasso, Daniele and
                   Bacco, Luca and
                   Palumbo, Luigi and
                   Marcucci, Juri and
                   Salvini, Niccolo and
                   Laureti, Tiziana and
                   Vollero, Luca},
      title     = {Variations of Food Prices in Italian Supermarkets},
      month     = feb,
      year      = 2025,
      publisher = {Zenodo},
      version   = {1.0},
      doi       = {10.5281/zenodo.14927602},
      url       = {https://doi.org/10.5281/zenodo.14927602},
    }
models:
  dataset:
    description: |
      File Information:
      - Format: CSV (.csv), UTF-8 encoded, comma-separated.
      - Each row corresponds to one product observation at a specific store on a specific date.
      - No missing values are present in the cleaned version.
    type: table
    fields:
      date:
        type: object
        primary: false
        description: |
          Date of price collection

          Total coverage is about 2 years and 4.5 months (2020-12-03 to 2023-04-14).

          The dataset is based on 841 distinct web scrapes, with an average of 236 unique products captured per scraped store in each scrape. The number of web offers began to decline by mid-2022.
        example: '2020-12-03'
      price:
        type: float
        primary: false
        description: Retail price in euros (EUR), using a decimal point (.)
        example: '1.99'
      product_id:
        type: integer
        primary: false
        description: |
          A unique identifier assigned to each product.

          There are 2,361 unique products in the sample spread relatively equally across the COICOP categories.
        example: '2'
      store_id:
        type: integer
        description: |
          Anonymized unique identifier of the store where the price was recorded

          There are 20 distinct stores covered by the dataset.
        example: '2'
      region:
        type: object
        primary: false
        description: |
          Italian region where the store is located

          There are 7 distinct regions of Italy covered by the dataset.
        example: calabria
      product:
        type: object
        primary: false
        description: Full commercial name of the product, including quantity or weight
        example: arance navelina italia calibro 1.5 kg
      COICOP5:
        type: object
        primary: false
        description: |
          Product classification at the 5-digit level based on the COICOP nomenclature.

          There are 24 COICOP5 categories covered by the dataset. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords.
        example: Oranges
      COICOP4:
        type: object
        primary: false
        description: Higher-level COICOP category. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords (tied to COICOP5).
        example: Fruit
examples:
- type: csv
  description: The first 5 rows of the dataset.
  data: |
    date,price,product_id,store_id,region,product,COICOP5,COICOP4
    2020-12-03,1.99,2,2,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit
    2020-12-03,2.48,2,3,lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit
    2020-12-03,2.49,2,4,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit
    2020-12-03,1.99,2,5,calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit
    2020-12-03,2.49,2,8,lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit
links:
  Dataset: https://doi.org/10.5281/zenodo.14927602
tags:
- web-scraped-dataset