D4.4 - Datasets Used to Train AI/ML Models

About

BeGREEN D4.4 provides the main details of the datasets that have been used to train and test the project's AI/ML methodologies.
First, this document describes how the data is managed in the context of the BeGREEN project, covering aspects of data creation, processing, storage, sharing and security. In addition, a description of the Findability, Accessibility, Interoperability and Reusability (FAIR) principles is provided. These principles aim to provide general guidelines for scientific data management and stewardship.
This document also describes a variety of datasets generated or used by the project. The characteristics of each dataset are presented in tables covering a brief description of the dataset, the main metrics collected and stored in it, how the FAIR principles are addressed, data security, archiving and preservation, and ethical and legal aspects.

Editor

Juan Sánchez-González (UPC)