Dataset
italian-food-supermarkets
Open Data Contract Standard v3.1.0
web-scraped
groceries-and-food
Fundamentals
Basic information about the data contract
- Name
- Variations of Food Prices in Italian Supermarkets
- Version
- 1.0
- Status
- Documentation level: 3
- Tenant
- Sasso et al (2025) https://doi.org/10.5281/zenodo.14927602
- Purpose
- This web-scraped dataset includes retail prices for meat, fruit, and vegetable products collected from December 2020 to April 2023 (2 years and 4.5 months, or 862 days). The dataset facilitates comprehensive analyses, enabling exploration of regional price variations, comparisons across product categories, and time-series investigations into price dynamics within the Italian retail food market.
- Usage
- Open dataset, shared under "Creative Commons Attribution 4.0 International" licence on Zenodo.
Links:
- Dataset: https://doi.org/10.5281/zenodo.14927602
- Dataset paper: https://doi.org/10.1016/j.dib.2025.112089
- How to cite
- APA style:
Sasso, D., Bacco, L., Palumbo, L., Marcucci, J., Salvini, N., Laureti, T., & Vollero, L. (2025). Variations of Food Prices in Italian Supermarkets (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14927602
- Chicago style:
Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero. "Variations of Food Prices in Italian Supermarkets". Zenodo, February 25, 2025. https://doi.org/10.5281/zenodo.14927602.
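The analyses named under Purpose (regional comparisons, category comparisons, and time series) can be sketched with grouped averages. The following is a minimal illustration using only the Python standard library; the helper `mean_price_by`, the placeholder region `Region-X`, and all row values are made up for the example and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only; they are NOT real observations.
# "Region-X" is a placeholder, not a region confirmed to be in the dataset.
rows = [
    {"date": "2021-01-10", "price": 2.00, "region": "Calabria", "COICOP4": "Fruit"},
    {"date": "2021-01-20", "price": 3.00, "region": "Calabria", "COICOP4": "Fruit"},
    {"date": "2021-01-15", "price": 4.00, "region": "Region-X", "COICOP4": "Fruit"},
    {"date": "2021-02-05", "price": 5.00, "region": "Region-X", "COICOP4": "Fruit"},
]


def mean_price_by(rows, key_fn):
    """Group rows by key_fn(row) and return the mean price of each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key_fn(row)].append(row["price"])
    return {key: mean(prices) for key, prices in groups.items()}


# Regional comparison within each COICOP4 category.
by_region = mean_price_by(rows, lambda r: (r["region"], r["COICOP4"]))

# Monthly time series: the first 7 characters of an ISO date are YYYY-MM.
by_month = mean_price_by(rows, lambda r: r["date"][:7])

print(by_region)
print(by_month)
```

Passing the grouping key as a function keeps one helper usable for all three kinds of analysis; a real study would likely also control for product mix, since the set of scraped products changes over time.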
Entity Relationship Diagram
Visual representation of data model relationships
erDiagram
"**Dataset**" {
date object
price number
product_id integer
store_id integer
region object
product object
COICOP5 object
COICOP4 object
}
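As a rough illustration of the entity above, one row can be modelled as a typed record. This is a sketch under assumptions: the class name `PriceObservation` and the function `parse_row` are hypothetical, the mapping of the contract types (`object` to `str`, `number` to `float`, `integer` to `int`) is an interpretation, and the sample row simply combines the example values listed in the schema rather than quoting an actual record.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PriceObservation:
    """One row: the price of a product at a store on a given date."""
    date: date
    price: float
    product_id: int
    store_id: int
    region: str
    product: str
    coicop5: str
    coicop4: str


def parse_row(row: dict[str, str]) -> PriceObservation:
    """Convert a csv.DictReader-style row (all strings) into a typed record."""
    return PriceObservation(
        date=date.fromisoformat(row["date"]),  # assumes ISO dates, e.g. 2020-12-03
        price=float(row["price"]),
        product_id=int(row["product_id"]),
        store_id=int(row["store_id"]),
        region=row["region"],
        product=row["product"],
        coicop5=row["COICOP5"],
        coicop4=row["COICOP4"],
    )


# Illustrative row assembled from the schema's example values.
example = parse_row({
    "date": "2020-12-03", "price": "1.99", "product_id": "2", "store_id": "2",
    "region": "Calabria", "product": "arance navelina italia calibro 1.5 kg",
    "COICOP5": "Oranges", "COICOP4": "Fruit",
})
print(example)
```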
Schema
The data schema and structure
Dataset
File Information:
- Format: CSV (.csv), UTF-8 encoded, comma-separated.
- Each row corresponds to one product observation at a specific store on a specific date.
- No missing values are present in the cleaned version.

| Property | Business Name | Type | Required | Description |
|---|---|---|---|---|
| date | Scrape date | object | No | Date of price collection. Total coverage is 2 years and 4.5 months (2020-12-03 to 2023-04-14). The dataset is based on 841 distinct web scrapes, with an average of 236 unique products captured per scraped store each time. The number of web offers started to decline by mid-2022. Example: '2020-12-03' |
| price | Product price | number | No | Retail price in euros (EUR), using a decimal point (.). Example: '1.99' |
| product_id | Product ID | integer | No | A unique identifier assigned to each product. There are 2,361 unique products in the sample, spread relatively equally across the COICOP categories. Example: '2' |
| store_id | Store ID | integer | No | Anonymized unique identifier of the store where the price was recorded. There are 20 distinct stores covered by the dataset. Example: '2' |
| region | Region | object | No | Italian region where the store is located. There are 7 distinct regions of Italy covered by the dataset. Example: 'Calabria' |
| product | Product name | object | No | Full commercial name of the product, including quantity or weight. Example: 'arance navelina italia calibro 1.5 kg' |
| COICOP5 | COICOP5 category | object | No | Product classification at the 5-digit level based on the COICOP nomenclature. There are 24 COICOP5 categories covered by the dataset. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords. Example: 'Oranges' |
| COICOP4 | COICOP4 category | object | No | Higher-level COICOP category. Classification was assigned via manual annotation and rule-based categorization using domain-specific keywords (tied to COICOP5). Example: 'Fruit' |
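The schema and file information above translate into a few mechanical checks. The sketch below uses only the Python standard library and rests on assumptions: that the CSV header names equal the property names, that dates are ISO formatted as in the example value, and that the function name `validate_csv` and the sample text are hypothetical, written for illustration rather than taken from the dataset.

```python
import csv
import io
from datetime import date

EXPECTED_COLUMNS = {"date", "price", "product_id", "store_id",
                    "region", "product", "COICOP5", "COICOP4"}
COVERAGE_START = date(2020, 12, 3)   # from the 'date' property description
COVERAGE_END = date(2023, 4, 14)


def validate_csv(text: str) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    reader = csv.DictReader(io.StringIO(text))
    if set(reader.fieldnames or []) != EXPECTED_COLUMNS:
        return [f"unexpected columns: {reader.fieldnames}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # 'None in row' catches rows with too many fields; None values are short rows.
        if None in row or any(v is None or not v.strip() for v in row.values()):
            problems.append(f"line {line_no}: missing value")
            continue
        try:
            day = date.fromisoformat(row["date"])
            float(row["price"])
            int(row["product_id"])
            int(row["store_id"])
        except ValueError as exc:
            problems.append(f"line {line_no}: {exc}")
            continue
        if not COVERAGE_START <= day <= COVERAGE_END:
            problems.append(f"line {line_no}: date {day} outside coverage")
    return problems


# Hypothetical sample: one well-formed row and one row with an empty product name.
sample = (
    "date,price,product_id,store_id,region,product,COICOP5,COICOP4\n"
    "2020-12-03,1.99,2,2,Calabria,arance navelina italia calibro 1.5 kg,Oranges,Fruit\n"
    "2021-06-01,2.49,3,2,Calabria,,Oranges,Fruit\n"
)
print(validate_csv(sample))
print((COVERAGE_END - COVERAGE_START).days)  # length of the coverage window in days
```

Returning a list of messages instead of raising on the first failure makes it easier to review every problem in a file in one pass.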
Team
Team members and their roles
Sasso, Daniele, Luca Bacco, Luigi Palumbo, Juri Marcucci, Niccolo Salvini, Tiziana Laureti, and Luca Vollero
https://doi.org/10.5281/zenodo.14927602
| Username | Role | Date In | Date Out | Comment |
|---|---|---|---|---|
Created at 24 Feb 2026 05:02:51 UTC with Data Contract CLI v0.11.5