Once upon a time, a tech geek was a near-commodity. When we built a High Performance Computer (HPC) or ‘server cluster’, we knew that our customers would have someone in-house, possibly a user, who would manage and maintain the system. HPC integrations were most common in the academic sector, so there were any number of budding computer engineers keen to fiddle with, refine and improve the system in their spare time. It was a personal challenge for many users.
Move to 2013 and it is boom time for the design and integration of server clusters, which are no longer the preserve of a few select universities with engineering and design aspirations. The UK Government recently invested £156m in UK R&D, of which a significant proportion has ended up in new academic server clusters. These are used by a wide range of disciplines, from engineering, science and mathematics to cosmology, archaeology and economics.
Durham, Warwick, Southampton, Aberdeen, Leicester and Bristol universities have all built or upgraded systems in the last few months. Big businesses have got in on the act too and, space, cooling and energy permitting, the ever-decreasing cost of hardware means that even SMBs can now afford to purchase their own server cluster.
However, this boom presents a few challenges:
- Our new breed of customers, specifically the users, do not want to fiddle with, refine and improve the system; they want a ready-made utility compute service so they can focus on their primary business.
- Full-time HPC managers are few and far between. They are either very ‘green’ or come with 25 years’ experience and are very expensive. The University of Central Lancashire (UCLan), for example, recently spent several months on a fruitless search for a full-time HPC manager after its incumbent left at short notice.
- IT managers cannot stretch their skills to managing a server cluster. We tend to find that IT departments abound with Microsoft experts who can maintain a normal IT infrastructure but can’t stretch to a Linux server cluster environment. They lack specific skills in configuring parallel file systems (such as IBM’s GPFS); they have no compiling skills to get maximum power from CPUs or GPUs; they cannot port codes to parallel programming models such as MPI; and they do not understand the weird and wonderful scientific applications that are commonly used. Asking them to fine-tune a cluster is like asking your local mechanic to work on an F1 race car: it’s similar, but it needs quite a few extra skills.
Daresbury Laboratories is doing some great work to help develop HPC skills, and there is a GPFS user group in operation, which we coordinate, that is successfully helping to promote learning amongst file system users. However, these development efforts could take several years to supply the industry with a new brood of tech geeks.
In the short term, the lack of HPC skills is changing the face of supercomputing and leading to a boom in managed services, hosted clusters and HPC-on-demand. For example:
- After an unsuccessful attempt to recruit a full-time HPC manager, UCLan eventually opted for remote managed services to operate its cluster. Graham Lee, Head of IT Infrastructure Management, said at the time: “Our HPC system can now be remotely managed and is presently returning 98 per cent availability of service”.
- SMB engineering firm Engys recently purchased a hosted cluster, i.e. a cluster owned by the business but held in a datacentre by the supplier and accessed remotely. Although the decision was primarily driven by a lack of space and cooling in its own office, Francisco Campos, Director of Operations at Engys, pointed to a further benefit: “A hosted cluster gives us freedom, we do not have to waste our own time and effort maintaining the cluster; we don’t need cluster skills”.
- BHR Group, a fluid engineering consultancy, now uses an HPC-on-demand service, which enables it to cope with peaks in demand for compute capacity. Dr David Kelsall, senior consultant at BHR Group, says: “We use a HPC-on-demand service, it is a very easy, with an uncomplicated and simple structure that doesn’t require any previous HPC knowledge to operate”.
I’ve often suggested that the HPC industry is like a winding road, with new technologies, innovations and suppliers bending and shaping its path. Now, for the first time, and driven by a lack of skills, I think organisations have four roads to choose from: an in-house cluster if skills are available, a managed service, a hosted cluster or an HPC-on-demand service. It’s an exciting time to be in the industry as supercomputing power becomes more accessible.