Trustworthy, Energy-Aware federated DAta Lakes along the computing continuum.
The TEADAL Toolbox comprises two main platforms: the TEADAL Data Lake Platform and the TEADAL Trustworthy Data Lake Federation.
TEADAL Data Lake Platform
The Data Lake Platform is the materialisation of the main concepts of the TEADAL project. It is a single node managed by a single organisation. Fig. 1 shows the modules composing the data lake; boxes marked with a star indicate the innovations that the TEADAL project brings to data lakes. The data lake administrator can use a privacy/confidentiality governance policy definition tool to define data storage, movement, and processing policies for the owned or managed resources along the continuum: resources at the edge (where data are usually produced), compute and storage systems on premises, and cloud resources.
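To make the idea of storage, movement, and processing policies more concrete, the following is a minimal sketch in Python. It is not the TEADAL policy definition tool or its format. The names `Location`, `Action`, `PolicyRule`, and `is_allowed`, the dataset name, and the deny-by-default behaviour are all assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    """Hypothetical resource classes along the continuum."""
    EDGE = "edge"
    ON_PREMISES = "on_premises"
    CLOUD = "cloud"


class Action(Enum):
    """Hypothetical operations a policy can govern."""
    STORE = "store"
    MOVE = "move"
    PROCESS = "process"


@dataclass(frozen=True)
class PolicyRule:
    """One rule: a dataset may undergo an action at the listed locations."""
    dataset: str
    action: Action
    allowed_locations: frozenset


def is_allowed(rules, dataset, action, location):
    """Deny by default: allow only if some rule explicitly permits it."""
    return any(
        rule.dataset == dataset
        and rule.action == action
        and location in rule.allowed_locations
        for rule in rules
    )


# Illustrative policy: the dataset may be stored at the edge or on premises,
# but processed only on premises. No rule covers moving it anywhere.
rules = [
    PolicyRule(
        "patient_records",
        Action.STORE,
        frozenset({Location.EDGE, Location.ON_PREMISES}),
    ),
    PolicyRule(
        "patient_records",
        Action.PROCESS,
        frozenset({Location.ON_PREMISES}),
    ),
]
```

With these assumed rules, `is_allowed(rules, "patient_records", Action.STORE, Location.CLOUD)` evaluates to `False`, because no rule permits cloud storage for that dataset.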
TEADAL Trustworthy Data Lake Federation
The proposed data platform will enable the creation of a federation of data lakes (see Fig. 2), which fosters data sharing among organisations. Trust among the parties is not provided, as usually happens, by a centralised third party that all members must accept, but by a new kind of trust that relies on blockchain/DLT techniques (e.g., smart contracts, secure oracles).
In the proposed approach, trust among the members of the federation is built on top of an appropriate blockchain/DLT, still to be selected (e.g., permissioned or permissionless, proof-of-work or proof-of-stake). This ledger is easily accessible and is used to specify, via smart contracts, the agreements among members about data sharing: e.g., which data, in which format, and for which purpose.
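As an illustration of what such an agreement might record, the sketch below models the terms (which data, which format, for which purpose) as a plain Python record. It is not a smart contract and does not reflect any TEADAL interface. `SharingAgreement`, `request_permitted`, the field names, and the sample values are hypothetical. The SHA-256 digest only shows one way in which a fingerprint of the agreed terms could be produced for anchoring on a ledger.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SharingAgreement:
    """Hypothetical terms agreed between two federation members."""
    provider: str
    consumer: str
    dataset: str
    data_format: str
    purpose: str

    def digest(self) -> str:
        """Return a deterministic SHA-256 fingerprint of the terms."""
        # Sorting the keys makes the serialisation, and so the hash, stable.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def request_permitted(agreement, consumer, dataset, purpose):
    """Check a data request against the agreed consumer, data, and purpose."""
    return (
        agreement.consumer == consumer
        and agreement.dataset == dataset
        and agreement.purpose == purpose
    )


# Illustrative agreement between two made-up organisations.
agreement = SharingAgreement(
    provider="org-a",
    consumer="org-b",
    dataset="traffic_counts",
    data_format="parquet",
    purpose="mobility_research",
)
```

Under these assumptions, a request from `org-b` for `traffic_counts` with purpose `mobility_research` passes `request_permitted`, while the same request with a different purpose does not. In a real federation, this check would be enforced by the smart contract rather than by local code.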