The most powerful feature of cloud computing is its capacity to turn computing into a fifth utility, after water, electricity, gas, and telephony. The 2011 Gartner CIO survey predicted that 23% of computing activity would never move to the cloud, even though 43% would move by 2015 and another 31% by 2020. Some High Performance Computing (HPC) applications will be among the 23% that never move. Ian Foster et al. hit the nail on the head when they identified the core reason:
“The one exception that will likely be hard to achieve in cloud computing (but has had much success in Grids) are HPC applications that require fast and low latency network interconnects for efficient scaling to many processors.”
However, the future of high performance computing in the cloud is not that bleak. HPC applications whose parallel tasks are tightly interdependent (that is, not embarrassingly parallel) may indeed struggle to take off on a generic public cloud, or even a hybrid one. But specialized clouds and providers are likely to emerge, with new tools and technology that enable most of those applications in the cloud at an acceptable level of speed and efficiency. Science Clouds, supported by the Nimbus project, is an early indication of that trend.
Research at the Australian National University has produced an SOA middleware, ANU-SOAM, that is intended to deliver high performance for scientific applications that are not embarrassingly parallel. The execution of such an application can be viewed as a series of generations, each of which runs a set of pure computation tasks; successive sets are separated by a communication phase. All tasks within a set can therefore execute independently. ANU-SOAM supports this model by introducing a Data Service that implements the communication phase. The Common Data, a one-dimensional or two-dimensional array held in the Data Service, can be accessed, modified and synchronized (add, get, put and sync) by the compute processes (service instances, or SIs) and reused across successive generations of tasks without communicating the updates back to the host process (the client). This reduces communication and the resulting overhead. Early experiments show that this programming model is effective at harnessing cloud-computing resources over slow networks such as the Internet, compared with other existing paradigms.
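To make the generation-based model concrete, here is a minimal single-machine sketch in Python. It is not ANU-SOAM's actual API: the `DataService` class, its method signatures, the `smooth_chunk` task and the thread pool standing in for service instances are all assumptions made for illustration. Only the operation names (add, get, put and sync) come from the description above. Each generation runs independent tasks that read the last synchronized state and buffer their updates; a sync step then plays the role of the communication phase.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class DataService:
    """Hypothetical in-process stand-in for a shared Data Service."""

    def __init__(self):
        self._committed = {}  # name -> list visible to get()
        self._pending = {}    # name -> {index: value} buffered by put()
        self._lock = threading.Lock()

    def add(self, name, values):
        # Register a one-dimensional array as Common Data.
        with self._lock:
            self._committed[name] = list(values)
            self._pending[name] = {}

    def get(self, name, start, stop):
        # Read a slice of the last synchronized state.
        with self._lock:
            return self._committed[name][start:stop]

    def put(self, name, start, values):
        # Buffer updates; they stay invisible to get() until sync().
        with self._lock:
            for offset, value in enumerate(values):
                self._pending[name][start + offset] = value

    def sync(self, name):
        # Communication phase: publish all buffered updates at once.
        with self._lock:
            data = self._committed[name]
            for index, value in self._pending[name].items():
                data[index] = value
            self._pending[name].clear()


def smooth_chunk(service, name, start, stop, size):
    # One pure computation task: average each cell with its neighbours.
    lo, hi = max(start - 1, 0), min(stop + 1, size)
    window = service.get(name, lo, hi)
    result = []
    for i in range(start, stop):
        if i == 0 or i == size - 1:
            result.append(window[i - lo])  # boundary cells stay fixed
        else:
            left, mid, right = window[i - lo - 1:i - lo + 2]
            result.append((left + mid + right) / 3.0)
    service.put(name, start, result)


def run_generations(initial, generations, chunk=4, workers=4):
    # The client uploads the data once; updates never travel back to it
    # between generations, only the final result is fetched.
    service = DataService()
    service.add("grid", initial)
    size = len(initial)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            futures = [
                pool.submit(smooth_chunk, service, "grid",
                            s, min(s + chunk, size), size)
                for s in range(0, size, chunk)
            ]
            for future in futures:
                future.result()  # wait for the whole set, surface errors
            service.sync("grid")
    return service.get("grid", 0, size)
```

Calling `run_generations([0, 0, 0, 3, 0, 0, 0, 0], generations=1)` spreads the single non-zero value across its two neighbours. The point of the design is that the tasks of one generation never see each other's writes, so they can run in any order or on any worker, and the only coordination cost is one sync per generation.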