Several interesting developments in the High Performance Computing (HPC) market have come to light recently, and they will become more prominent in the coming months.
In particular, containerisation, cloud and GPU-based workloads are all set to dominate the HPC environment in 2020.
AMD’s new second generation EPYC ‘Rome’ processor (CPU) has been shown in benchmark testing to outperform competitors’ dual-socket configurations even when running as a single socket. This new AMD CPU is proving to be very powerful and well suited to GPU computing, with the ability to leverage new memory technologies and support for PCIe Gen 4.0, which significantly increases I/O bandwidth to around 64GB/s.
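As a back-of-the-envelope check on that 64GB/s figure (my own calculation, not a number from AMD's announcements), PCIe 4.0 runs at 16 GT/s per lane with 128b/130b line encoding, so a full x16 link works out to roughly 63GB/s counted across both directions:

```python
# Rough PCIe bandwidth calculation (assumptions: a full x16 link,
# 128b/130b encoding, and bandwidth counted in both directions).

def pcie_bandwidth_gbs(transfer_rate_gt: float, lanes: int,
                       bidirectional: bool = True) -> float:
    """Raw PCIe bandwidth in GB/s for a given per-lane transfer rate."""
    encoding_efficiency = 128 / 130  # PCIe 3.0+ line encoding overhead
    per_direction = transfer_rate_gt * lanes * encoding_efficiency / 8  # Gbit -> GB
    return per_direction * (2 if bidirectional else 1)

gen3 = pcie_bandwidth_gbs(8.0, 16)   # PCIe 3.0: 8 GT/s per lane
gen4 = pcie_bandwidth_gbs(16.0, 16)  # PCIe 4.0: 16 GT/s per lane
print(f"PCIe 3.0 x16: {gen3:.1f} GB/s bidirectional")  # ~31.5 GB/s
print(f"PCIe 4.0 x16: {gen4:.1f} GB/s bidirectional")  # ~63.0 GB/s
```

The doubling of the per-lane transfer rate from Gen 3 to Gen 4 is exactly where the headline bandwidth increase comes from.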
AMD has agreements with cloud providers AWS and Azure to offer its CPUs on their cloud platforms and to promote the use of AMD CPUs for HPC. This move reflects how many of our customers are now planning their next HPC cluster or supercomputer to include AMD in their infrastructure design, to better support AI workflows.
AMD had previously been out of the HPC market for a period of time, focussing primarily on consumer-based chips, but this has shifted dramatically in recent months. With Intel opting not to support PCIe Gen 4.0 at present, AMD has gained a competitive advantage in the processor market. Lenovo has come a little later to the game, but we will see new AMD-based developments from them later in 2020.
Other new developments worth noting include Mellanox’s ConnectX-6, the first adapter to deliver 200Gb/s throughput, providing high-performance InfiniBand to support larger bandwidth requirements. Also, the recently launched Samsung Non-Volatile Memory Express (NVMe) Gen4 SSD delivers roughly double the throughput of its Gen3 predecessor.
Over the last year, we’ve seen a strong shift towards the use of cloud in HPC, particularly in the case of storage. Many research institutions are working towards a ‘cloud first’ policy, looking for cost savings in using the cloud rather than expanding their data centres, with overheads such as cooling, data and cluster management, and certification requirements. There is a noticeable push towards using HPC in the cloud and reducing the amount of compute infrastructure on-premise. Following AMD’s agreement with cloud providers AWS and Azure, and the implementation of technologies such as InfiniBand into these HPC cloud scenarios, this looks increasingly like the direction universities will be heading in 2020.
I don’t foresee cloud completely replacing large local HPC clusters in the near future, but for customers with variable workloads, on-premise HPC clusters could become smaller and tie closely into the public cloud to allow for peaks in utilisation. Additionally, we’re increasingly seeing uptake of public cloud providers for ‘edge’ cases, such as testing new technologies or spinning up environments for specific projects or events. With further understanding of the technologies involved and of user requirements, most universities and research institutions are at least considering a hybrid approach.
One of the major drawbacks of HPC in the cloud is the high cost of pulling data back out of the cloud, which understandably causes resistance among some organisations considering a move to the cloud.
However, there are products coming onto the market from both NetApp and DDN that are ‘hybridised’ for the public cloud, whereby you can upload some of your storage into the public cloud, process it, and download only the changed content. This means you are charged only for retrieving the new data you actually need, rather than more than is necessary.
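As a rough illustration of the idea (a minimal sketch of hash-based change detection, assuming objects identified by name; this is not the actual NetApp or DDN mechanism), a client can record a content hash per object at upload time and, after cloud-side processing, retrieve only the objects whose content has changed:

```python
import hashlib

# Illustrative sketch only: download just what changed. We compare content
# hashes taken at upload time against hashes after cloud-side processing,
# and fetch only objects that are new or modified -- those are the only
# ones that would incur egress (retrieval) charges.

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def changed_objects(before: dict[str, bytes], after: dict[str, bytes]) -> list[str]:
    """Names of objects that are new or whose content changed."""
    before_hashes = {name: digest(data) for name, data in before.items()}
    return [name for name, data in after.items()
            if before_hashes.get(name) != digest(data)]

uploaded = {"a.dat": b"raw input", "b.dat": b"unchanged"}
processed = {"a.dat": b"processed result", "b.dat": b"unchanged", "c.dat": b"new output"}
print(changed_objects(uploaded, processed))  # only a.dat and c.dat need downloading
```

The design point is that the comparison happens on cheap hashes rather than on the data itself, so unchanged objects never leave the cloud.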
Only a year ago, every storage vendor needed a cloud connector so organisations could move their data into the cloud and back again in its entirety. The recognition by these storage vendors that organisations don’t want to store all their data in the cloud, and would rather move small amounts in and out, will avoid huge data-retrieval costs and move the adoption of HPC in the cloud forward in 2020.