Talks > 08/05/2024 Jordi Blasco

Leveraging HPC Infrastructure for AI Workloads – The Role of Kubernetes in AI and HPC

As demand for AI workloads surges, there is a pressing need to adapt existing High-Performance Computing (HPC) infrastructures to the requirements of this emerging user community. Many HPC users already run AI workloads, but the influx of new AI users with little exposure to HPC demands a more accessible and interactive platform. Kubernetes is emerging as a solution for AI workloads, yet significant challenges arise as soon as those workloads need to scale.

Addressing the needs of this growing user base goes beyond standard Kubernetes solutions. This talk delves into the challenges of integrating AI workloads with HPC infrastructures and the solutions for doing so. It discusses leveraging existing HPC tools, such as Warewulf, to provision Kubernetes alongside Slurm clusters, so that Slurm can balance resources between Kubernetes and traditional HPC workloads according to load. The talk also introduces the work done by HPCNow! to enhance the performance and efficiency of AI workloads on Kubernetes.

Looking to the future, the talk also examines the development of more mature solutions for running AI and HPC workloads on Kubernetes.