My time with OCF coincides with its 10-year anniversary. By way of a retrospective, I want to discuss GPGPU and visualisation technologies; I've always been interested in that area, and a lot has certainly changed (although not as fast as I had anticipated).
To set the scene, I have long been enamoured of the raw computational power of the Geometry Engines (GE) and Raster Managers (RM) in the SGI Onyx line of supercomputers. Programming these beasts meant expressing your computation through the image-processing and matrix operations of the Open Graphics Library (OpenGL), which was the beginning of the GPGPU phenomenon. From that initial activity two real branches emerged: CUDA and Stream (the latter was arguably the better development environment, but CUDA has emerged as the more successful). Ten years later, I think it's safe to say that with CUDA (and its near relative OpenCL) the act of programming GPUs has become increasingly easy; doing it well remains the hard part!
Where are the applications?
For the past 20 years, memory on CPU systems has steadily increased. Taking a rule of thumb of 2GB per core, a modern compute node with a 6-core CPU now has 24GB of memory. By contrast, a GPU node with 2 GPUs might give you, let's say, 6GB per GPU – 12GB in total. Even if all of that memory were available as a single pool rather than split across separate banks, how do you fit 24GB of data into 12GB? I think this is one of the major factors preventing applications becoming available for GPUs: you are moving from a system with huge amounts of memory available to one with only a very limited amount, amongst other things. You can read more of my views on GPU applications in my previous post.
How has the hardware changed? I think we've had an evolution rather than a revolution, as with all things in the computing arena. With the demise of SGI we have lost the age of the monolithic graphics system, replaced in the mainstream by two graphics card manufacturers, Nvidia and AMD. Manufacturing processes have shrunk and we are fitting more and more transistors onto the GPU. Will we have a RISC vs CISC argument soon?
Back in the day, the SGI InfiniteReality graphics system peaked at around 10 million triangles a second. Today, it is fair to say your average gaming GPU would not even break a sweat pushing out 10 million triangles a second – and with all of the bells and whistles that would have a £1M Onyx falling over in complete exhaustion! Multiple displays would cost you another graphics pipe, and the Geometry Engines (GE) and Raster Managers (RM) would take up a full rack at the very least – replaced today by a decent PC that would set you back only £300!
GPU + CPU union
One of the ways forward currently making progress is the union of CPU and GPU, in the likes of AMD's Fusion line and Intel's Sandy Bridge. It won't be long before we have heterogeneous systems on a chip, with various different processor types sharing the same memory – Cell processor, anyone?
What about the next 10 years?
It's hard to predict. I think we will see the emergence of a CPU/GPU system on a chip that will go some way towards solving the memory-access problem between CPU and GPU – and otherwise more of the same: smaller, faster, better!