It was pointed out to me recently that Intel’s Phi and Nvidia’s Tesla GPU can be programmed in the same way, via OpenACC directives. OpenACC lets developers insert pragmas around the specific parts of code that should be offloaded onto the MultiCore Processing Unit (MPU), such as loops and matrix operations. Developers familiar with OpenMP will find OpenACC very familiar [because the creators of the OpenACC API are all members of the OpenMP Working Group on Accelerators].
Whilst Tesla and Phi share this common programming environment for offloading work, where an application resides on the CPU and offloads the ‘heavy lifting’ to the MPU, there are differences. The Phi can actually run Linux on its cores, so developers can also program it directly in a number of ways. They can:
- Recompile source code to run it natively on the Phi itself
- Use each processor in the Phi as a node in an MPI cluster, or treat the Phi as a device that contains a cluster of MPI nodes
- Use the Phi with optimised libraries such as the Intel MKL (surprisingly enough!), which again is similar to the Tesla
Take an edge?
The Phi and Tesla share much functionality, but the added advantage of Intel’s Phi is that developers can program the cores directly if required. If developers already have an MPI application, they can take advantage of the Phi simply by recompiling it to use the Phi as an MPI cluster, for instance. It should also be relatively easy (I should probably say straightforward!) to move a CUDA application from a GPU to the Phi, though there are differences you should be aware of.
Going back to the headline, is this VHS vs. Betamax? Nvidia has built a huge amount of support and momentum for CUDA and its GPUs, and Intel is certainly playing catch-up, but does it have an edge?