Over the last few years, the world of HPC and “accelerating” HPC application performance has seen products from the likes of ClearSpeed, IBM’s Cell Broadband Engine, and FPGAs, but almost all of these options have fallen by the wayside. They were, and continue to be, incredibly difficult to program, expensive, and limited in scope; quite frankly, one needed a computer science degree to work with them. The dominance of the GPU, meanwhile, continues unabated.
The growth of GPUs has been driven mainly by NVIDIA, although AMD has clearly been innovating in this area with its Accelerated Processing Unit [formerly known as Fusion]. GPU programming is not easy either, but early developers were able to take advantage of matrix operations in OpenGL, an approach that has since been overtaken by NVIDIA’s CUDA, followed closely by OpenCL.
However, there is a new kid on the block: Intel’s Xeon Phi processor. It looks like it will turn everything on its head (I say new kid on the block because I’ll conveniently skip over its birthplace in the now-defunct Intel Larrabee project!).
Based on the x86 processor architecture, the Phi is ‘easy enough’ to program [not least given all of the time and effort that Intel is putting into its compiler technologies]. Working with Phi is essentially the same as programming the Core i7 (or similar) processor that most people in technical computing have on their desktops, so there is no need to learn a new API. I really think Phi is going to rock NVIDIA’s boat! Simply put, it is easier to program and the tools are already available [if a developer is coding now, it is the same code and the same expertise].
Phi is already making an impact. The Dell-built Stampede system at TACC uses Phi and currently sits at #7 in the Top500 list of supercomputers (Nov 2013) – impressive. Plus, the team at OCF has already sold its first Phi systems, and even individual Phi cards.
Is this the beginning of a sea change?
Customers are about to get more choice in Multicore Processing Units [MPUs]. If a customer has little financial resource, time, or energy for application re-coding, or if the code is very old [often the case in academia], they should be looking to buy a traditional CPU-based server cluster. If, however, a customer’s applications are suitable for parallelisation and they have the time and energy, then a Phi or a GPU accelerator is the right route.
We’re committed to giving customers choice and flexibility; different MPUs will suit different customers. But watch this space, because I think Phi is going to make an impact quickly.