Talks > 20-21/04/2016 Federico Silla

The remote GPU virtualization from the rCUDA point of view

GPUs are widely used to accelerate scientific applications, but their adoption in HPC clusters presents several drawbacks. First, in addition to raising acquisition costs, accelerators also increase maintenance and space costs. Second, they increase energy consumption. Third, GPUs in a cluster may have a low utilization rate. Consequently, virtualizing the GPUs of the cluster is an appealing strategy for addressing all these drawbacks simultaneously. In addition, cluster throughput increases while costs and energy consumption are reduced.

This talk will present the remote GPU virtualization technique and its benefits. It will also introduce one of the frameworks that implement this virtualization mechanism: the rCUDA middleware. When rCUDA runs over a high-performance interconnect such as InfiniBand, the overhead of remote GPU virtualization becomes negligible, so local and remote GPUs deliver similar performance. The rCUDA framework will be used as a case study to show that remote GPU virtualization provides many benefits to clusters, including doubling cluster throughput (in jobs/hour), reducing overall energy consumption by more than 40%, offering a flexible way of providing GPUs to virtual machines in a cloud computing facility, and making a large number of GPUs available to a single-node application.
