Published April 30, 2026 | Version 36f93b53 (v1.0.0)
Dataset

EXA4MIND Example Workflows - Airflow: Healthcare Analytics Pipeline

Description

The EXA4MIND (Extreme Analytics for MINing Data Spaces, https://exa4mind.eu) project brings together Data Analytics, Supercomputing and European Data Ecosystems to enable Extreme Data applications. It provides the EXA4MIND/Extreme Data Database (EDD) platform, a set of flexibly deployable tools and data backends with which researchers and developers in science, SMEs and industry can implement their use cases on the most powerful and appropriate computing and data backends available.

The platform is available as open source and comes with example data-analytics workflows, including sample input/output data.

The workflow published here is a Dockerized data analytics pipeline built with Apache Airflow, Kafka, PostgreSQL, and Matplotlib. The workflow ingests a CSV dataset of synthetic public-health records, streams the data through Kafka, stores it in PostgreSQL, derives age-based features, and generates demographic visualizations for downstream analysis.
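The feature-derivation and aggregation step described above can be illustrated with a minimal, standard-library-only sketch. It is not the workflow's actual code: the column names (patient_id, age, sex), the age bins, and the function names are all assumptions made for illustration, and the Kafka and PostgreSQL stages are left out so that the example runs on its own.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real dataset's columns are not listed in this record.
SAMPLE_CSV = """patient_id,age,sex
1,4,F
2,17,M
3,35,F
4,52,M
5,71,F
6,88,M
"""

# Hypothetical age bins: (inclusive lower bound, exclusive upper bound, label).
AGE_BINS = [
    (0, 18, "0-17"),
    (18, 40, "18-39"),
    (40, 65, "40-64"),
    (65, 200, "65+"),
]


def age_group(age: int) -> str:
    """Map an age in years to the label of the bin that contains it."""
    for low, high, label in AGE_BINS:
        if low <= age < high:
            return label
    raise ValueError(f"age out of supported range: {age}")


def derive_age_features(csv_text: str) -> list[dict]:
    """Parse CSV text and add an age_group and an is_senior field to each row."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        age = int(row["age"])
        rows.append(
            {**row, "age": age, "age_group": age_group(age), "is_senior": age >= 65}
        )
    return rows


def count_by_age_group(rows: list[dict]) -> Counter:
    """Count records per age group, e.g. as input for a bar chart."""
    return Counter(row["age_group"] for row in rows)


records = derive_age_features(SAMPLE_CSV)
counts = count_by_age_group(records)
```

Per-group counts of this kind are the sort of aggregate that could be passed to a Matplotlib bar chart to produce a demographic visualization. In the published workflow, the equivalent steps presumably run as Airflow tasks against the records stored in PostgreSQL; the repository is the authoritative reference for the actual schema and binning.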

Availability: The workflow is available from the GitLab repository linked under the related identifiers of this record (relation: has part).

This work received the support of the EXA4MIND project, funded by the European Union's Horizon Europe Research and Innovation Programme, under Grant Agreement N° 101092944. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the granting authority can be held responsible for them.
We thank the authors of all open-source work reused or built upon here.

Additional details

Funding

European Commission
EXA4MIND - EXtreme Analytics for MINing Data spaces 101092944