As cloud-native technologies continue to advance, researchers and engineers in high-performance computing (HPC) are beginning to explore how these technologies can be used to build more scalable, reliable, and efficient supercomputers. In this talk, we will explore some of the latest technologies being used in cloud-native supercomputing, and discuss how they are being integrated into Kubernetes, the de facto standard for container orchestration.
One of the most exciting new technologies for cloud-native supercomputing is dynamic resource allocation (DRA), which has recently been merged into upstream Kubernetes as an alpha feature. DRA lets workloads request resources through claims that vendor drivers allocate and configure on demand, rather than through static device counts, making it possible to match allocations to an application’s actual requirements, improve utilization, and reduce costs.
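To make the idea concrete, here is a minimal sketch of how a pod might consume a DRA-managed device. Note that DRA is an alpha API whose group, version, and field names have evolved across Kubernetes releases, and the resource class, claim, and image names below are hypothetical placeholders, not a definitive manifest:

```yaml
# Sketch only: DRA was alpha in early releases (resource.k8s.io/v1alpha1)
# and the API has since changed; names here are illustrative.
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClaimTemplate
metadata:
  name: gpu-claim-template
spec:
  spec:
    resourceClassName: gpu.example.com   # hypothetical class served by a DRA driver
---
apiVersion: v1
kind: Pod
metadata:
  name: dra-example
spec:
  containers:
  - name: app
    image: registry.example.com/hpc-app:latest   # placeholder image
    resources:
      claims:
      - name: gpu                        # reference the claim declared below
  resourceClaims:
  - name: gpu
    source:
      resourceClaimTemplateName: gpu-claim-template
```

The key design point is that the claim is a first-class API object: the scheduler and the vendor driver negotiate where and how the device is allocated, instead of the kubelet handing out opaque integer counts.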
Another important technology is the Container Device Interface (CDI), a vendor-neutral specification that gives container runtimes a standardized way to expose devices such as GPUs and FPGAs inside containers. CDI is critical for running HPC workloads on cloud-native infrastructure, because it allows applications to take advantage of specialized hardware without runtime-specific plumbing.
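As an illustration, a CDI specification is a small JSON or YAML file (conventionally placed under /etc/cdi) that a vendor writes and a container runtime reads; the vendor name, device name, and device node path in this sketch are hypothetical:

```yaml
# Sketch of a CDI spec file, e.g. /etc/cdi/vendor.yaml; names are illustrative.
cdiVersion: "0.5.0"
kind: vendor.example.com/gpu       # fully qualified device class
devices:
- name: gpu0
  containerEdits:
    deviceNodes:
    - path: /dev/vendor-gpu0       # node injected into the container
containerEdits:                    # edits applied for any device of this kind
  env:
  - VENDOR_VISIBLE_DEVICES=all
```

A workload then requests the device by its fully qualified name (here, vendor.example.com/gpu=gpu0), and the runtime applies the listed edits when creating the container.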
We will also discuss Kueue, a Kubernetes-native job queueing controller that lets users manage batch workloads with quotas and priorities. Finally, we will provide updates on the MPI Operator, which automates the deployment and scaling of MPI applications on Kubernetes clusters.
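These pieces compose naturally. The sketch below shows an MPIJob (the MPI Operator's v2beta1 API) submitted to a Kueue queue via the standard queue-name label; the queue name, image, and command are placeholders, assuming a LocalQueue named hpc-queue already exists:

```yaml
# Illustrative MPIJob queued through Kueue; names are hypothetical.
apiVersion: kubeflow.org/v2beta1
kind: MPIJob
metadata:
  name: pi-example
  labels:
    kueue.x-k8s.io/queue-name: hpc-queue   # assumed LocalQueue
spec:
  slotsPerWorker: 1
  mpiReplicaSpecs:
    Launcher:
      replicas: 1
      template:
        spec:
          containers:
          - name: launcher
            image: registry.example.com/mpi-app:latest   # placeholder image
            command: ["mpirun", "-np", "2", "/opt/app/pi"]
    Worker:
      replicas: 2
      template:
        spec:
          containers:
          - name: worker
            image: registry.example.com/mpi-app:latest
```

Kueue holds the job until quota is available in the queue, then admits it, at which point the MPI Operator creates the launcher and worker pods and wires up the mpirun host list.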
By leveraging these new technologies, cloud-native supercomputing promises to unlock new levels of scalability and efficiency for HPC workloads, making it possible to tackle even the most demanding scientific and engineering challenges.
About us
HPCKP (High-Performance Computing Knowledge Portal) is an Open Knowledge project focused on technology transfer and knowledge sharing in the HPC, AI and Quantum Science fields.