21-22/06/2018 Paolo Di Tommaso

Portable containers orchestration at scale with Nextflow

Reproducibility has become one of the most pressing issues in biology and many other computation-based research fields. The problem has been fuelled by the combined reliance on increasingly complex data analysis methods and the exponential growth of big data. The installation, deployment, and maintenance of computational data-analysis pipelines present an even more challenging picture because community standards are lacking. Moreover, the effect of limited standards on reproducibility is amplified by the very diverse range of computational platforms and configurations on which these applications are expected to run (workstations, clusters, HPC systems, clouds, etc.).

Software containers are gaining consensus as a solution to the problem of reproducibility of computational workflows. However, orchestrating large containerised workloads at scale, and in a portable manner across different platforms and runtimes, poses new challenges.

This presentation will give an introduction to Nextflow, a pipeline orchestration tool designed to address exactly these issues. Nextflow is a computational environment that provides a domain-specific language (DSL) meant to simplify the implementation and deployment of complex, large-scale containerised workloads in a portable and replicable manner. It allows the seamless parallelization and deployment of any existing application with minimal development and maintenance overhead, irrespective of the original programming language.

