GreenDB is a publicly available database of sustainable products, scraped weekly from European online shops. As a proxy for the products’ sustainability, it relies on sustainability labels that have been evaluated by experts. The GreenDB schema extends the well-known Schema.org Product definition and is compatible with standardized fine-grained product taxonomies such as GS1.

| value | statistic |
|---|---|
| 2023-11-24 | last updated |
| 19,992,756 | pages scraped |
| 2,711,568 | unique products |
| 342,530 | unique products with credible sustainability labels |
| 41 | product categories |

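
To put the figures above in proportion, the share of unique products that carry a credible sustainability label can be worked out directly. This is a small arithmetic sketch using only the numbers from the table; the variable names are chosen here for illustration.

```python
# Figures copied from the statistics table above (snapshot of 2023-11-24).
unique_products = 2_711_568
labeled_products = 342_530

# Fraction of unique products with at least one credible sustainability label.
share = labeled_products / unique_products
print(f"{share:.2%} of unique products carry a credible label")
# prints "12.63% of unique products carry a credible label"
```
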
| column name | timestamp | url | source | merchant | country | category | name | description | brand | sustainability_labels | price | currency | image_urls | gender | consumer_lifestage | colors | sizes | gtin | asin |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| data type | timestamp | text | text | text | text | text | text | text | text | array[text] | numeric | text | array[text] | text | text | array[text] | array[text] | int | text |
| nullable | no | no | no | no | no | no | no | no | no | no | no | no | no | yes | yes | yes | yes | yes | yes |
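
The nullability row above can be turned into a simple completeness check. The following is a minimal sketch, assuming a product row is available as a Python dict keyed by the column names in the table; `REQUIRED_COLUMNS`, `NULLABLE_COLUMNS`, `missing_required`, and the example row are hypothetical and not part of GreenDB itself.

```python
from typing import Any

# Column names copied from the table above, split by the "nullable" row.
REQUIRED_COLUMNS = [
    "timestamp", "url", "source", "merchant", "country", "category",
    "name", "description", "brand", "sustainability_labels",
    "price", "currency", "image_urls",
]
NULLABLE_COLUMNS = ["gender", "consumer_lifestage", "colors", "sizes", "gtin", "asin"]


def missing_required(row: dict[str, Any]) -> list[str]:
    """Return the non-nullable columns that are absent or None in `row`."""
    return [col for col in REQUIRED_COLUMNS if row.get(col) is None]


# Hypothetical row; every value is made up for illustration.
example_row = {
    "timestamp": "2023-11-24T00:00:00",
    "url": "https://shop.example/product/1",
    "source": "example-source",
    "merchant": "example-merchant",
    "country": "de",
    "category": "SHIRT",
    "name": "Example shirt",
    "description": "An example product description.",
    "brand": "Example Brand",
    "sustainability_labels": ["example-label"],
    "price": 19.99,
    "currency": "EUR",
    "image_urls": ["https://shop.example/img/1.jpg"],
    # Nullable columns such as "gender" or "gtin" may simply be omitted.
}

print(missing_required(example_row))  # prints []
```

Columns listed as nullable are deliberately not checked, so a row without `gtin` or `asin` still passes.
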

| column name | id | timestamp | name | description | cred_credibility | eco_chemicals | eco_lifetime | eco_water | eco_inputs | eco_quality | eco_energy | eco_waste_air | eco_environmental_management | social_labour_rights | social_business_practice | social_social_rights | social_company_responsibility | social_conflict_minerals |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| data type | text | timestamp | text | text | int4 | int4 | int4 | int4 | int4 | int4 | int4 | int4 | int4 | int4 | int4 | int4 | int4 | int4 |
| nullable | no | no | no | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
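
Because every score column in the table above is nullable, a label record may cover only some dimensions. The sketch below groups the columns that are actually set by their prefix (`cred`, `eco`, `social`). It assumes a record is a Python dict keyed by the column names above; `SCORE_COLUMNS`, `scored_dimensions`, and the example record with its values are hypothetical, and no particular score range is implied.

```python
from typing import Any

# The 14 nullable int4 columns from the table above.
SCORE_COLUMNS = [
    "cred_credibility",
    "eco_chemicals", "eco_lifetime", "eco_water", "eco_inputs",
    "eco_quality", "eco_energy", "eco_waste_air",
    "eco_environmental_management",
    "social_labour_rights", "social_business_practice",
    "social_social_rights", "social_company_responsibility",
    "social_conflict_minerals",
]


def scored_dimensions(label: dict[str, Any]) -> dict[str, list[str]]:
    """Group the score columns that are set (not None) by their prefix."""
    groups: dict[str, list[str]] = {"cred": [], "eco": [], "social": []}
    for col in SCORE_COLUMNS:
        if label.get(col) is not None:
            prefix = col.split("_", 1)[0]
            groups[prefix].append(col)
    return groups


# Hypothetical label record; the values are made up for illustration.
example_label = {
    "id": "example-label",
    "timestamp": "2023-11-24T00:00:00",
    "name": "Example Label",
    "description": "An example sustainability label.",
    "cred_credibility": 80,
    "eco_water": 60,
    "social_labour_rights": None,  # explicitly not evaluated
}

print(scored_dimensions(example_label))
# prints {'cred': ['cred_credibility'], 'eco': ['eco_water'], 'social': []}
```
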