OCF achieves Elite Partner status with NVIDIA

OCF has successfully achieved Elite Partner status with NVIDIA® for Accelerated Computing, becoming only the second business partner in Northern Europe to achieve this level.

The award recognises OCF’s ability and competency in integrating a wide portfolio of NVIDIA’s Accelerated Computing products, including the Tesla® P100 and DGX-1™. Elite Partner status is granted only to partners that have the knowledge and skills to support the integration of GPUs, as well as the industry reach to support and attract the right companies and customers using accelerators.

“For customers using GPUs, or potential customers, earning this specialty ‘underwrites’ our service and gives them extra confidence that we possess the skills and knowledge to deliver the processing power to support their businesses,” says Steve Reynolds, Sales Director, OCF plc. “This award complements OCF’s portfolio of partner accreditations and demonstrates our commitment to the vendor.”

OCF has been a business partner with NVIDIA for over a decade and has designed, built, installed and supported a number of systems throughout the UK that include GPUs. Most recently, OCF designed, integrated and configured ‘Blue Crystal 4’, a High Performance Computing (HPC) system at the University of Bristol, which includes 32 nodes, each with two NVIDIA Tesla P100 GPU accelerators.

In addition, as a partner of IBM and NVIDIA via the OpenPOWER Foundation, OCF has supplied two IBM® Power Systems™ S822LC for HPC systems, codenamed ‘Minsky’, to Queen Mary University of London (QMUL).

The two systems, which pair POWER8 CPUs with four NVIDIA Tesla P100 GPU accelerators, are being used to aid world-leading scientific research projects as well as teaching, making QMUL one of the first universities in Britain to use these powerful deep learning machines. The university was also the first in Europe to deploy an NVIDIA DGX-1 system, described as the world’s first AI supercomputer in a box.


OCF delivers new 600 Teraflop HPC machine for University of Bristol

For over a decade the University of Bristol has been contributing to world-leading, life-changing scientific research using High Performance Computing (HPC), having invested over £16 million in HPC and research data storage. To continue meeting the needs of its researchers, who work with large and complex datasets, the University has now deployed a new HPC machine, named BlueCrystal 4 (BC4).

Designed, integrated and configured by the HPC, storage and data analytics integrator OCF, BC4 has more than 15,000 cores, making it the largest UK university system by core count, and a theoretical peak performance of 600 Teraflops.

Over 1,000 researchers in areas such as paleobiology, earth science, biochemistry, mathematics, physics, molecular modelling, life sciences, and aerospace engineering will be taking advantage of the new system. BC4 is already aiding research into new medicines and drug absorption by the human body.

“We have researchers looking at whole-planet modelling with the aim of trying to understand the earth’s climate, climate change and how that’s going to evolve, as well as others looking at rotary blade design for helicopters, the mutation of genes, the spread of disease and where diseases come from,” said Dr Christopher Woods, EPSRC Research Software Engineer Fellow, University of Bristol. “Early benchmarking is showing that the new system is three times faster than our previous cluster – research that used to take a month now takes a week, and what took a week now only takes a few hours. That’s a massive improvement that’ll be a great benefit to research at the University.”

BC4 uses Lenovo NeXtScale compute nodes, each comprising two 14-core 2.4 GHz Intel Broadwell CPUs with 128 GiB of RAM. It also includes 32 nodes, each with two NVIDIA Pascal P100 GPUs, plus one GPU login node, designed into the rack by Lenovo’s engineering team to meet the specific requirements of the University.
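
As a rough, back-of-the-envelope check of the quoted 600 Teraflop figure, the sketch below multiplies the published core count and clock speed by an assumed 16 double-precision FLOPs per cycle per core (AVX2 fused multiply-add on Broadwell); this is an assumption rather than a published specification, and the GPU nodes would add further capacity on top of the CPU estimate.

```python
# Back-of-the-envelope check of BC4's quoted theoretical peak (assumptions noted below).
cores_per_node = 2 * 14      # two 14-core Intel Broadwell CPUs per node (from the article)
clock_hz = 2.4e9             # 2.4 GHz (from the article)
flops_per_cycle = 16         # assumption: AVX2 FMA, double precision, per core
total_cpu_cores = 15_000     # "more than 15,000 cores" (from the article)

node_peak_tflops = cores_per_node * clock_hz * flops_per_cycle / 1e12
cpu_peak_tflops = total_cpu_cores * clock_hz * flops_per_cycle / 1e12

print(f"per-node CPU peak:     {node_peak_tflops:.2f} TFLOPS")   # ~1.08 TFLOPS
print(f"whole-system CPU peak: {cpu_peak_tflops:.0f} TFLOPS")    # ~576 TFLOPS, close to the quoted 600
```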

Connecting the cluster are several high-speed networks, the fastest of which is a two-level Intel Omni-Path Architecture network running at 100Gb/s. BC4’s storage is composed of one PetaByte of disk provided by DDN’s GS7k and IME systems running the parallel file system Spectrum Scale from IBM.

Effective benchmarking and optimisation, using the benchmarking capabilities of Lenovo’s HPC research centre in Stuttgart (the first of its kind), has ensured that BC4 is highly efficient in terms of physical footprint while fully utilising the 30 kW per-rack energy limit. Lenovo’s commitment to third party integration has allowed the University to avoid vendor lock-in while permitting new hardware to be added easily between refresh cycles.

Dr Christopher Woods continues: “To help with the interactive use of the cluster, BC4 has a visualisation node equipped with NVIDIA Grid vGPUs so it helps our scientists to visualise the work they’re doing, so researchers can use the system even if they’ve not used an HPC machine before.”

Housed at VIRTUS’ LONDON4, the UK’s first shared data centre for research and education in Slough, BC4 is the first of the University’s supercomputers to be held at an independent facility. The system is directly connected to the Bristol campus via JISC’s high speed Janet network. Kelly Scott, account director, education at VIRTUS Data Centres said, “LONDON4 is specifically designed to have the capacity to host ultra high density infrastructure and high performance computing platforms, so an ideal environment for systems like BC4. The University of Bristol is the 22nd organisation to join the JISC Shared Data Centre in our facility, which enables institutions to collaborate and share infrastructure resources to drive real innovation that advances meaningful research.”

The several hundred applications running on the University’s previous cluster will be replicated onto the new system, which will allow researchers to create more applications and better-scaling software. Applications can be moved directly onto BC4 without the need for re-engineering.

“We’re now in our tenth year of using HPC in our facility. We’ve endeavoured to make each phase of BlueCrystal bigger and better than the last, embracing new technology for the benefit of our users and researchers,” commented Caroline Gardiner, Academic Research Facilitator at the University of Bristol.

Simon Burbidge, Director of Advanced Computing comments: “It is with great excitement that I take on the role of Director of Advanced Computing at this time, and I look forward to enabling the University’s ambitious research programmes through the provision of the latest computational techniques and simulations.”

Due to be launched at an event at the University of Bristol on 24th May, BC4 will support over 1,000 system users carried over from BlueCrystal Phase 3.


Supporting scientific research at the Atomic Weapons Establishment

AWE benefiting from new end-to-end IBM Spectrum Scale and POWER8 systems

We are pleased to announce that we are supporting scientific research at the UK Atomic Weapons Establishment (AWE), with the design, testing and implementation of a new HPC cluster and a separate big data storage system.

AWE has been synonymous with science, engineering and technology excellence in support of the UK’s nuclear deterrent for more than 60 years. AWE, working to the required Ministry of Defence programme, provides and maintains warheads for the Trident nuclear deterrent.

The new HPC system is built on IBM’s POWER8 architecture and is complemented by a separate parallel file system, called Cedar 3, built on IBM Spectrum Scale. In early benchmark testing, Cedar 3 is operating 10 times faster than the previous high-performance storage system at AWE. Both the server and storage systems use IBM Spectrum Protect for data backup and recovery.

“Our work to maintain and support the Trident missile system is undertaken without actual nuclear testing, which has been the case ever since the UK became a signatory to the Comprehensive Nuclear Test Ban Treaty (CTBT); this creates extraordinary scientific and technical challenges – something we’re tackling head on with OCF,” comments Paul Tomlinson, HPC Operations at AWE. “We rely on cutting-edge science and computational methodologies to verify the safety and effectiveness of the warhead stockpile without conducting live testing. The new HPC system will be vital in this ongoing research.”

AWE works across the entire life cycle of warheads, from initial concept and design, through manufacture, assembly and in-service support, to decommissioning and disposal, ensuring maximum safety and protecting national security at all times.

The central data storage system, Cedar 3, will be used by scientists across the AWE campus, with data replicated across the site.

“The work of AWE is of national importance and so its team of scientists need complete faith and trust in the HPC and big data systems in use behind the scenes, and the people deploying the technology,” says Julian Fielden, managing director, OCF. “Through our partnership with IBM, and the people, skills and expertise of our own team, we have been able to deliver a system which will enable AWE to maintain its vital research.”

The new HPC system runs on a suite of IBM POWER8 processor-based Power Systems servers running the IBM AIX V7.1 and Red Hat Enterprise Linux operating systems. The HPC platform consists of IBM Power E880, IBM Power S824L, IBM Power S812L and IBM Power S822 servers, providing ample processing capability to support all of AWE’s computational needs, and an IBM tape library device to back up computation data.

Cedar 3, AWE’s parallel file system storage, is an IBM Storwize storage system. IBM Spectrum Scale is used to enable AWE to manage data access amongst multiple servers more easily.

About the Atomic Weapons Establishment (AWE)
The Atomic Weapons Establishment has been central to the defence of the United Kingdom for more than 60 years through its provision and maintenance of the warheads for the country’s nuclear deterrent. This encompasses the initial concept, assessment and design of the nuclear warheads, through component manufacture and assembly, in-service support, decommissioning and then disposal.

Around 4,500 staff are employed at the AWE sites together with over 2,000 contractors. The workforce consists of scientists, engineers, technicians, crafts-people and safety specialists, as well as business and administrative experts – many of whom are leaders in their field. The AWE sites and facilities are government owned but the UK Ministry of Defence (MOD) has a government-owned contractor-operated contract with AWE Management Limited (AWE ML) to manage the day-to-day operations and maintenance of the UK’s nuclear stockpile. AWE ML is formed of three shareholders – Lockheed Martin, Serco and Jacobs Engineering Group. For further information, visit: http://www.awe.co.uk


eMedLab Shortlisted for UK Cloud Award


Congratulations to eMedLab on being shortlisted for the UK Cloud Awards 2017

A solution designed and integrated by OCF has been shortlisted in the 2017 UK Cloud Awards in the ‘Best Public Sector Project’ category.

The MRC eMedLab consortium consists of University College London, Queen Mary University of London, London School of Hygiene & Tropical Medicine, the Francis Crick Institute, the Wellcome Trust Sanger Institute, the EMBL European Bioinformatics Institute and King’s College London, and was funded by an £8.9M grant from the Medical Research Council.

The vision of MRC eMedLab is to maximise the gains for patients and for medical research that will come from the explosion in human health data. To realise this potential, the consortium of seven prestigious biomedical research organisations needs to accumulate medical and biological data of unprecedented scale and complexity, to coordinate it, to store it safely and securely, and to make it readily available to interested researchers.

The partnership’s aim was to build a private cloud infrastructure for the delivery of significant computing capacity and storage to support the analysis of biomedical genomics, imaging and clinical data. Initially, its main focus was on a range of diseases such as cancer, cardiovascular and rare diseases, subsequently broadening out to include neurodegenerative and infectious diseases.

The MRC eMedLab system is a private cloud with significant data storage capacity and very fast internal networking designed specifically for the types of computing jobs used in biomedical research. The new high-performance and big data environment consists of:

  • Red Hat Enterprise Linux OpenStack Platform
  • Red Hat Satellite
  • Lenovo System x Flex system with 252 hypervisor nodes and Mellanox 10Gb network with a 40Gb/56Gb core
  • Five tiers of storage, managed by IBM Spectrum Scale (formerly GPFS), for cost effective data storage – scratch, Frequently Accessed Research Data, virtual clusters image storage, medium-term storage and previous versions backup.

The project has become a key infrastructure resource for the Medical Research Council (MRC), which has funded six of these projects. The success has been attributed to MRC eMedLab’s concept of partnership working, where everybody is using one shared resource. This means not just sharing the HPC resource and sharing it efficiently, but also sharing the learning, the technology and the science at MRC eMedLab. Jacky Pallas, Director of Research Platforms, UCL, comments, “From the beginning there was an excellent partnership between the MRC eMedLab operations team and the technical specialists at OCF, working together to solve the issues which inevitably arise when building and testing a novel compute and data storage system.”

In total, there are over 20 different projects running on the MRC eMedLab infrastructure which include:

  • The London School of Hygiene & Tropical Medicine is working on a project looking at population levels and the prevalence of HIV and TB, how the pathogens evolve and the genetics of human resistance, in collaboration with researchers in Africa and Vietnam
  • Francis Crick Institute cancer science – supporting a project run by Professor Charles Swanton investigating personalised immunotherapies against tumours
  • Great Ormond Street Hospital – collaborative research on rare diseases in children
  • Linking genomics and brain imaging to better understand dementia
  • Studying rare mitochondrial diseases and understanding how stem cells function
  • Using UK Biobank data to identify and improve treatments for cardiovascular diseases
  • Deep mining of cancer genomics data to understand how tumours evolve
  • Analysing virus genome sequences to enable the modelling and monitoring of flu-type infectious epidemics

The MRC eMedLab private cloud has shown that these new computing technologies can be used effectively to support research in the life sciences sector.

Professor Taane Clark, Professor of Genomics and Global Health, London School of Hygiene and Tropical Medicine comments, “The processing power of the MRC eMedLab computing resource has improved our ability to analyse human and pathogen genomic data, and is assisting us with providing insights into infectious disease genomics, especially in malaria host susceptibility, tuberculosis drug resistance and determining host-pathogen interactions.”


Virtual HPC Clusters Enable Cancer, Cardio-Vascular and Rare Diseases Research

OpenStack-based Cloud enables cost-effective self-provisioned compute resources

eMedLab, a partnership of seven leading bioinformatics research and academic institutions, is using a new private cloud, HPC environment and big data system to support the efforts of hundreds of researchers studying cancers, cardio-vascular and rare diseases. Their research focuses on understanding the causes of these diseases and how a person’s genetics may influence their predisposition to the disease and potential treatment responses.

The new HPC cloud environment combines a Red Hat Enterprise Linux OpenStack Platform with Lenovo Flex System hardware to enable the creation of virtual HPC clusters bespoke to individual researchers’ requirements. The system has been designed, integrated and configured by OCF, an HPC, big data and predictive analytics provider, working closely with its partners Red Hat, Lenovo and Mellanox Technologies, and in collaboration with eMedLab’s research technologists.

The High Performance Computing environment is being hosted at a shared data centre for education and research, offered by digital technologies charity Jisc. The data centre has the capacity, technological capability and flexibility to future-proof and support all of eMedLab’s HPC needs, with its ability to accommodate multiple and varied research projects concurrently in a highly collaborative environment. The ground-breaking facility is focused on the needs of the biomedical community and will revolutionise the way data sets are shared between leading scientific institutions internationally.

The eMedLab partnership was formed in 2014 with funding from the Medical Research Council. Original members University College London, Queen Mary University of London, London School of Hygiene & Tropical Medicine, the Francis Crick Institute, the Wellcome Trust Sanger Institute and the EMBL European Bioinformatics Institute have been joined recently by King’s College London.

“Bioinformatics is a very, very data intensive discipline,” says Jacky Pallas, Director of Research Platforms, University College London. “We want to study a lot of de-identified, anonymous human data. It’s not practical – from data transfer and data storage perspectives – to have scientists replicating the same datasets across their own, separate physical HPC resources, so we’re creating a single store for up to 6 Petabytes of data and a shared HPC environment within which researchers can build their own virtual clusters to support their work.”

The Red Hat Enterprise Linux OpenStack Platform, a highly scalable Infrastructure-as-a-Service (IaaS) solution, enables scientists to create and use virtual clusters bespoke to their needs, allowing them to select compute memory, processors, networking, storage and archiving policies, all orchestrated by a simple web-based user interface. Researchers will be able to access up to 6,000 cores of processing power.
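
As an illustration of this self-service model, the hedged sketch below uses the openstacksdk Python client to boot a single virtual machine; the cloud, image, flavor and network names are hypothetical placeholders rather than eMedLab’s actual configuration.

```python
# Minimal sketch: self-provisioning one VM on an OpenStack cloud with openstacksdk.
# The cloud, image, flavor and network names below are hypothetical placeholders.
import openstack

conn = openstack.connect(cloud="emedlab")              # credentials come from clouds.yaml

image = conn.compute.find_image("centos-7-research")   # hypothetical base image
flavor = conn.compute.find_flavor("m1.xlarge")         # pick CPU/memory to suit the workload
network = conn.network.find_network("project-net")     # hypothetical project network

server = conn.compute.create_server(
    name="virtual-cluster-node-01",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)          # block until the node is ACTIVE
print(server.status)
```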

“We generate such large quantities of data that it can take weeks to transfer data from one site to another,” says Tim Cutts, Head of Scientific Computing, the Wellcome Trust Sanger Institute. “Data in eMedLab will stay in one secure place and researchers will be able to dynamically create their own virtual HPC cluster to run their software and algorithms to interrogate the data, choosing the number of cores, operating system and other attributes to create the ideal cluster for their research.”

Tim adds: “The Red Hat Enterprise Linux OpenStack Platform enables our researchers to do this rapidly and using open standards which can be shared with the community.”

Arif Ali, Technical Director of OCF says: “The private cloud HPC environment offers a flexible solution through which virtual clusters can be deployed for specific workloads. The multi-tenancy features of the Red Hat platform enable different institutions and research groups to securely co-exist on the same hardware, and share data when appropriate.”

“This is a tremendous and important win for Red Hat,” says Radhesh Balakrishnan, general manager, OpenStack, Red Hat. “eMedLab’s deployment of Red Hat Enterprise Linux OpenStack Platform into its HPC environment for this data intensive project further highlights our leadership in this space and ability to deliver a fully supported, stable, and reliable production-ready OpenStack solution.

“Red Hat technology allows consortia such as eMedLab to use cutting edge self-service compute, storage, networking, and other new services as these are adopted as core OpenStack technologies, while still offering the world class service and support that Red Hat is renowned for. The use of Red Hat Enterprise Linux OpenStack Platform provides cutting edge technologies along with enterprise-grade support and services; leaving researchers to focus on the research and other medical challenges.”

“Mellanox end-to-end Ethernet solutions enable cloud infrastructures to optimize their performance and to accelerate big data analytics,” said Kevin Deierling, vice president of marketing at Mellanox Technologies. “Intelligent interconnect with offloading technologies, such as RDMA and cloud accelerations, is key for building the most efficient private and cloud environments. The collaboration between the organisations as part of this project demonstrates the power of the eco-systems to drive research and discovery forward.”

The new high-performance environment and big data environment consists of:

  • Red Hat Enterprise Linux OpenStack Platform
  • Red Hat Satellite
  • Lenovo System x Flex system with 252 hypervisor nodes and Mellanox 10Gb network with a 40Gb/56Gb core
  • Five tiers of storage, managed by IBM Spectrum Scale (formerly GPFS), for cost effective data storage – scratch, Frequently Accessed Research Data, virtual clusters image storage, medium-term storage and previous versions backup.


BlueBEAR at the University of Birmingham

A team of researchers based at The University of Birmingham is working on ground-breaking research to create a proton Computed Tomography (CT) image that will help to facilitate treatment of cancer patients in the UK. Proton therapy targets tumours very precisely using a proton beam and can cause less damage to surrounding tissue than conventional radiotherapy – for this reason it can be a beneficial treatment for children.

Treatment planning is generally reliant on X-rays to image the body’s composition and the location of healthy tissue; this research hopes to simulate the use of actual protons – not X-rays – to image the body and, in doing so, improve the accuracy of the final treatment. It forms part of a larger research project set up to build a device capable of delivering protons in this way in a clinical setting.

Working for the PRaVDA Consortium, a three-year project funded by the Wellcome Trust and led by researchers at the University of Lincoln, the team of researchers is using The University of Birmingham’s centrally funded High Performance Computing (HPC) service, BlueBEAR, to simulate the use of protons for CT imaging. The team hopes to simulate 1,000 million protons per image over the course of the project, and to do so 97 per cent faster than on a desktop computer. A test simulation of 180 million protons, which would usually take 5,400 hours without the cluster, has already been delivered in 72 hours (three days).
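
The quoted test figures can be sanity-checked with a line or two of arithmetic, as in the short sketch below.

```python
# Quick check of the quoted BlueBEAR test run: 180 million protons,
# 5,400 hours without the cluster versus 72 hours on the cluster.
serial_hours = 5_400
cluster_hours = 72

speedup = serial_hours / cluster_hours          # 75x
time_saved = 1 - cluster_hours / serial_hours   # fraction of time saved

print(f"speedup:    {speedup:.0f}x")
print(f"time saved: {time_saved:.1%}")          # ~98.7%, consistent with the ~97% project-wide figure
```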

The research team is tasked with proving the principle that a 10cm proton CT image, similar in size to a child’s head, can be created. If successful, this will be the largest proton CT image ever created.

Dr Tony Price, PRaVDA research fellow, says, “The research will give us a better understanding of how a proton beam interacts with the human body, ultimately improving the accuracy of proton therapy. The HPC service at The University of Birmingham is essential for us to complete our research, as it gives us the necessary capacity to simulate and record the necessary number of histories to create an image. It took us only three days to run a simulation of 180 million protons which would usually take 5400 hours without the cluster.”

The BlueBEAR HPC service in use by the PRaVDA Consortium was designed, built and integrated in 2013 by HPC, data management, storage and analytics provider OCF. Due to the stability and reliability of the core service, researchers have invested in expanding this service with nodes purchased from their own grants and integrated into the core service on the understanding that these nodes will be available for general use when not required by the research group.

This has expanded the capacity of the centrally-funded service by 33 per cent, showing the confidence that the researchers have in the core service. The service is used by researchers from the whole range of research areas at the University, from the traditional HPC users in the STEM (Science, Technology, Engineering and Mathematics) disciplines to non-traditional HPC users such as Psychology and Theology.

Paul Hatton, HPC & Visualisation Specialist, IT Services, The University of Birmingham says, “The HPC service built by OCF has proven over the past two years to be of immense value to a multitude of researchers at the University. Instead of buying small workstations, researchers are using our central HPC service because it is easy for them to buy and add their own cores when required.

“We work closely with OCF to encourage new users onto the service and provide a framework for users requesting capacity. The flexible, scalable and unobtrusive design of the high performance clusters has made it easy for us to scale up our HPC service according to the increase in demand.”

Technology

  • The server cluster uses Lenovo System x iDataPlex® with Intel Sandy Bridge processors. OCF has installed more high-performance server clusters using the industry-leading Lenovo iDataPlex server than any other UK integrator.
  • The cluster also uses IBM Tivoli Storage Manager for data backup and IBM GPFS software, which enables more effective storage capacity expansion, enterprise-wide and interdepartmental file sharing, commercial-grade reliability, cost-effective disaster recovery and business continuity.
  • The scheduling system on BlueBEAR is Adaptive Computing’s MOAB software, which enables the scheduling, managing, monitoring and reporting of HPC workloads (a minimal submission sketch follows this list).
  • Use of Mellanox’s Virtual Protocol Interconnect (VPI) cards within the cluster design makes it easier for IT Services to redeploy nodes between the various components of the BEAR services as workloads change.
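
The sketch below is a minimal, hypothetical example of driving MOAB from Python; msub and checkjob are standard Moab commands, but the job script name, resource request and any site defaults are assumptions rather than BlueBEAR’s actual configuration.

```python
# Hypothetical sketch: submitting and checking a job on a MOAB-scheduled cluster.
import subprocess

# Submit a (hypothetical) job script, asking for 4 cores on one node for 2 hours.
result = subprocess.run(
    ["msub", "-l", "nodes=1:ppn=4,walltime=02:00:00", "simulate_protons.sh"],
    capture_output=True, text=True, check=True,
)
job_id = result.stdout.strip()   # msub prints the new job's ID on stdout
print(f"submitted job {job_id}")

# Ask the scheduler for the job's current state.
status = subprocess.run(["checkjob", job_id], capture_output=True, text=True)
print(status.stdout)
```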



CLIMB: World’s largest single system dedicated to Microbial Bioinformatics research

Bioinformatics researchers from across the UK and international collaborators will soon be able to benefit from a new private cloud HPC system to support their work on bacterial pathogens. The Cloud Infrastructure for Microbial Bioinformatics (CLIMB) project, a collaboration between the University of Birmingham, the University of Warwick, Cardiff University, and Swansea University, will create a free-to-use, world leading cyber infrastructure specifically designed for microbial bioinformatics research.

With over 7,500 vCPU cores of processing power, the CLIMB system represents the largest single system designed specifically for microbiologists in the world.

Dr. Thomas Connor, Senior Lecturer at the Cardiff School of Biosciences, Cardiff University, who designed the system with integration partner OCF says: “Bioinformatics research using the system is already helping to track viral and bacterial pathogens, develop new diagnostics, increase the understanding of bacterial resistance to antibiotics, and support many other related research projects. Using CLIMB we are able to overcome the difficulties we face when trying to perform our analyses on HPC infrastructure that is generally not suitable for microbial bioinformatics workloads.”

Powered by OpenStack cloud computing software, and provided by HPC, big data and predictive analytics provider OCF, one site’s system, at the University of Birmingham, is already in production running OpenStack Juno, and will soon be linked directly to the system at Cardiff. The final two sites are currently undergoing final testing by OCF before entering production this Autumn. CLIMB is already contributing to bioinformatics research both nationally and internationally, and research using the system has already been published in international scientific journals.

Simon Thompson, research computing specialist at the University of Birmingham, who built the initial CLIMB pilot system, says: “This is one of the most complex environments we’ve had to build, learning a lot of new technologies. Luckily we’ve been able to work with our contacts at IBM, Lenovo and OCF to help us build what is now quite a stable platform in Birmingham.”

To provide users with local high-performance compute and storage, the fully open source OpenStack system is built using Lenovo System x servers connected to IBM Spectrum Scale storage over 56Gb InfiniBand, providing 500TB of local storage at each of the four sites.

Using a set of standardised Virtual Machine images, the CLIMB system can spin up over 1,000 VMs at any one time, enabling individuals and groups of researchers to take advantage of the system’s 78TB of RAM. The system has been designed to provide large amounts of RAM in order to meet the challenge of processing large, rich biological datasets, and it will be able to support the vast majority of the UK microbiology community.
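
To illustrate how standardised images support this kind of bulk provisioning, the hedged sketch below boots a small batch of VMs with the openstacksdk Python client; the cloud, image, flavor and network names are hypothetical, not CLIMB’s actual configuration.

```python
# Illustrative sketch: launching a batch of VMs from one standardised image.
# Cloud, image, flavor and network names are hypothetical placeholders.
import openstack

conn = openstack.connect(cloud="climb-site")               # hypothetical clouds.yaml entry

image = conn.compute.find_image("climb-standard-vm")       # hypothetical standardised image
flavor = conn.compute.find_flavor("highmem.large")         # hypothetical large-memory flavour
network = conn.network.find_network("tenant-net")          # hypothetical tenant network

servers = []
for i in range(4):                                         # scale this up for larger groups
    server = conn.compute.create_server(
        name=f"microbial-analysis-{i:02d}",
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )
    servers.append(conn.compute.wait_for_server(server))

print([s.status for s in servers])                         # all ACTIVE once booted
```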

“We are now able to rapidly take multiple bacterial samples and generate DNA sequence data for each sample in a computationally readable format. However, handling hundreds, or thousands of these samples simultaneously poses serious challenges for most HPC systems,” says Dr. Connor. “Datasets within microbiology are very different to traditional HPC research. Workloads are often either embarrassingly parallel or very high memory, and all require large amounts of high performance storage. CLIMB has been specifically designed to take this into account, to enable the microbial bioinformatics community within the UK to access facilities they are unlikely to have available locally.”

Dr. Connor continues: “One of the issues that is becoming increasingly faced by bioinformaticians is the ability to share bespoke software applications. What may work on one HPC system, won’t necessarily work on another – it’s often quicker to write a new application from scratch. We believe that by using containers/Virtual Machines this problem can be overcome, by creating a mechanism to share software and data within a single cloud environment. By enabling researchers to share their software and data in this way, we free up bioinformaticians to spend more time doing research, and less on installing software and downloading data from multiple, disparate data repositories.”

“The fully open source OpenStack system enables us to work with the team at CLIMB to help modify and improve the system,” comments Arif Ali, Technical Director at OCF. “CLIMB really is a leading edge, private cloud solution and using OpenStack means we can stay on top of the latest developments in the system as well as modify it to fit the needs of researchers. On top of this, the research community can also contribute to the system meaning that the solution can fit all of their research needs.”

OCF will continue to work with the institutes in CLIMB to manage the open source OpenStack system and is currently upgrading the systems installed at Warwick and Swansea to OpenStack Kilo.


New HPC cluster benefitting University of Oxford

Researchers from across the University of Oxford will benefit from a new High Performance Computing system designed and integrated by OCF. The new Advanced Research Computing (ARC) central HPC resource is supporting research across all four Divisions at the University: Mathematical, Physical and Life Sciences; Medical Sciences; Social Sciences; and Humanities.

With around 120 active users per month, the new HPC resource will support a broad range of research projects across the University. As well as computational chemistry, engineering, financial modelling and data mining of ancient documents, the new cluster will be used in collaborative projects like the T2K experiment using the J-PARC accelerator in Tokai, Japan. Other research will include the Square Kilometre Array (SKA) project and anthropologists using agent-based modelling to study religious groups. The new service will also be supporting the Networked Quantum Information Technologies Hub (NQIT), led by Oxford, which is envisaged to design new forms of computer that will accelerate discoveries in science, engineering and medicine.

The new HPC cluster comprises Lenovo NeXtScale servers with Intel Haswell CPUs connected by 40Gb InfiniBand to an existing Panasas storage system. The storage system was also upgraded by OCF to add 166TB, giving a total of 400TB of capacity. Existing Intel Ivy Bridge and Sandy Bridge CPUs from the University of Oxford’s older machine are still running and will be merged with the new cluster.

Twenty NVIDIA Tesla K40 GPUs were also added at the request of NQIT, which co-invested in the new machine. This will also benefit NVIDIA’s CUDA Centre of Excellence, which is also based at the University.

“After seven years of use, our old SGI-based cluster really had come to the end of its life; it was very power hungry, so we were able to put together a good business case to invest in a new HPC cluster,” said Dr Andrew Richards, Head of Advanced Research Computing at the University of Oxford. “We can operate the new 5,000 core machine for almost exactly the same power requirements as our old 1,200 core machine.

“The new cluster will not only support our researchers but will also be used in collaborative projects; we’re part of Science Engineering South, a consortium of five universities working on e-infrastructure, particularly around HPC.

“We also work with commercial companies who can buy time on the machine, so the new cluster is supporting a whole host of different research across the region.”

The new HPC resource is managed by the Simple Linux Utility for Resource Management (SLURM) job scheduler, which is able to support both the GPUs and the three generations of Intel CPUs within the cluster.
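
As a hedged illustration of how jobs might target such a mixed estate, the sketch below wraps sbatch from Python; --gres and --constraint are standard SLURM options, but the script names and the ‘haswell’ node feature are hypothetical, site-specific assumptions.

```python
# Illustrative sketch: targeting GPUs or a specific CPU generation via sbatch.
# Script names and the 'haswell' node feature are hypothetical assumptions.
import subprocess

def submit(script, *options):
    """Submit a batch script with sbatch and return its stdout (the job ID line)."""
    result = subprocess.run(["sbatch", *options, script],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Request one GPU on a GPU-equipped node.
print(submit("cuda_job.sh", "--gres=gpu:1"))

# Pin a CPU job to Haswell nodes, assuming the site tags nodes with CPU-generation features.
print(submit("cpu_job.sh", "--constraint=haswell"))
```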

Julian Fielden, Managing Director at OCF, comments: “With Oxford providing HPC not just to researchers within the University, but to local businesses and in collaborative projects, such as the T2K and NQIT projects, the SLURM scheduler really was the best option to ensure different service level agreements can be supported. If you look at the Top500 list of the world’s fastest supercomputers, they’re now starting to move to SLURM. The scheduler was specifically requested by the University to support GPUs and the heterogeneous estate of different CPUs, which the previous TORQUE scheduler couldn’t, so this forms quite an important part of the overall HPC facility.”

The University of Oxford will be officially unveiling the new cluster, named Arcus Phase B, on 14th April. Dr Richards continues: “As a central resource for the entire University, we really see ourselves as the first stepping stone into HPC. From PhD students upwards – i.e. people that haven’t used HPC before – they are who we really want to engage with. I don’t see our facility as just running a big machine; we’re here to help people do their research. That’s our value proposition and one that OCF has really helped us to achieve.”


Imperial College London uses new Half PB storage to support research and e-learning

A new highly-scalable storage system is supporting staff, students and researchers across multiple departments at Imperial College London. Designed and implemented by OCF, the centrally-funded half PetaByte system enables secure, cost-effective storage of vast amounts of research data produced at the College, whilst underpinning a Panopto video capture system that records lectures from up to 100 separate theatres on a daily basis.

The new Linux-based, IBM GPFS storage system sits across two separate data centres. It replaces a legacy Microsoft solution that had grown on an ad-hoc basis and become increasingly difficult to expand and administer.

As well as meeting the long-term, cost-effective and secure storage requirements of research data, e-learning is a major area of focus, so the College needed a solution capable of storing captured and encoded video recordings of lectures and making them available for editing and viewing by potentially 20,000 students. Multiple lectures can be recorded in parallel on the system, then saved and encoded with a turnaround time of less than one day from raw video capture to encoded version. Uptime was also a key factor – the College wanted a solution that would remain online 24/7 to support students.

The new system, which comprises eight IBM x3650 M4 GPFS clustered storage and NAS servers along with four IBM Storwize V3700 storage arrays, provides increased uptime and greater resilience, is more responsive, and incorporates a number of disaster recovery features to protect against data loss. In the event of data loss or corruption, users can also recover previous copies of their data from up to 14 days earlier themselves, saving time and significantly reducing the burden on IT administrators.

“We wanted a system that could grow and scale more easily and protect large data sets more effectively and efficiently,” said Steven Lawlor, ICT Datacentre Services Manager at Imperial College London. “Our old system no longer scaled and was becoming a bit of a beast. The new system will be easier to expand and more cost effective, in part because we can now move data to tape for long-term storage, a key requirement for certain types of research data. Uptime is one of the most important factors, we wanted a system that could also be maintained and expanded whilst still running.”

Julian Fielden, Managing Director of OCF comments: “We’re seeing a massive explosion of data growth. Academia is feeling the pressure of keeping up with storage demands, especially around research data. Imperial College London’s real challenge was around storing the lecture capture side. Research data can be broken up and separated, whereas its video capture system was one large system across the College. By developing a scalable storage solution based on GPFS they are now able to grow and expand storage in a simple and cost-effective manner.”

Imperial College London has already migrated all of its video content to GPFS. It is currently migrating research datasets from the Microsoft environment to the new Linux-based storage system.


New Cluster for Statistical Genetics Research at the Wellcome Trust Centre

New cluster propels Research Centre’s electron microscopy research into World league

The Wellcome Trust Centre for Human Genetics (WTCHG) at the University of Oxford is using a new Fujitsu high-performance BX900 blade-based cluster with Mellanox InfiniBand and DataDirect Networks storage systems integrated by OCF to support the genetics research of 25 groups and more than 100 researchers.

The Centre houses the second largest next-generation sequencing facility in England, currently producing more than 500 genomes per year. Each processed and compressed genome is about 30GB on disk and across the Centre roughly 15,000-20,000 human genomes occupy about 0.5PB. Numerous and wide-ranging research projects use this data to study the genetic basis of human diseases based on sophisticated statistical genetics analyses. Projects include national and international studies on various cancers, type-2 diabetes, obesity, malaria and analyses of bacterial genomes to trace the spread of infection. The Centre is one of the most highly ranked research institutes in the world and funds for the cluster were provided by a grant from the Wellcome Trust.

By understanding the characteristics of key genetics software applications and optimising how they map onto the new cluster’s architecture, the Centre has been able to improve dramatically the efficiency of these analyses. For example, analyses of data sets that took months using the Broad Institute’s Genome Analysis Tool Kit (GATK) can now be completed in weeks while using fewer cores.

The new cluster has also proved itself to be perfectly suited to supporting research by the Centre’s Division of Structural Biology (STRUBI) and it has already produced some of the world’s highest-resolution electron microscopy reconstructions – revealing structural details vital to understanding processes such as infection and immunity. The improvement in the performance of electron microscopy codes, particularly Relion, is also very impressive: movie-mode processing requiring more than 2 weeks on eight 16-core nodes of a typical cluster is now completed in 24 hours on just six of the new FDR-enabled, high-memory nodes.

“Advances in detector design and processing algorithms over the past two years have revolutionised electron microscopy, making it the method of choice for studying the structure of complex biological processes such as infection. However, we thought we could not get sufficient access to the necessary compute to exploit these advances fully. The new genetics cluster provided such a fast and cost-effective solution to our problems that we invested in expanding it immediately,” says Professor David Stuart, Oxford University.

The new cluster’s use of Intel Ivy Bridge CPUs provides a 2.6x performance increase over its predecessor built in 2011. It boasts 1,728 cores of processing power, up from 912, with 16GB 1866MHz memory per core compared to a maximum of 8GB per core on the older cluster.

The new cluster is working alongside a second production cluster; both clusters share a common Mellanox FDR InfiniBand network that links the compute nodes to a DDN GRIDScaler SFA12K storage system whose controllers can read block data at 20GB/s. This speed is essential for keeping the cluster at maximum utilisation and consistently fed with genomic data.

The high-performance cluster and big data storage systems were designed by the WTCHG in partnership with OCF, a leading HPC, data management, big data storage and analytics provider. As the integrator, OCF also provided the WTCHG team with training on the new system.

Dr Robert Esnouf, Head of the Research Computing Core at the WTCHG, says:

“Processing data from sequencing machines isn’t that demanding in terms of processing power any more. What really stresses systems are ‘all-against-all’ analyses of hundreds of genomes, that is lining up multiple genomes against each other and using sophisticated statistics to compare them and spot differences which might explain the genetic origin of diseases or susceptibility to diseases. That is a large compute and data I/O problem and most of our users want to complete this type of research.

“Each research group can use their own server to submit jobs to, and receive results from, the cluster. If it runs on the server it can easily be redirected to the cluster. Users don’t need to log on directly to the cluster or be aware of other research groups using it. We try to isolate groups so they don’t slow each other down and have as simple an experience as possible. Users have Linux skills, but they do not need to be HPC experts to use the system safely and effectively. It is a deliberate design goal.

“We use DDN GRIDScaler SFA12K-20 for our main storage – that has 1.8PB raw storage in one IBM General Parallel File System [GPFS] file system. We have learnt a lot about GPFS and how to get codes running efficiently on it. The support team and technical documentation showed how we could exploit the local page pool (cache) to run our codes much more quickly. Our system serves files over the InfiniBand fabric into the cluster at up to 10GB/s (~800TB/day). The SFA12K is already more than 80% full, so we’re now offloading older data to a slower, less expensive disk tier to get maximum return on the SFA12K investment.

“Along with the 0.5PB storage pool on the sequencing cluster and other storage, we now squeeze ~5PB of storage and 4,000 compute cores into 10 equipment racks in a small converted freezer room. Despite its small footprint, it is one of the most powerful departmental compute facilities in a UK university.

“OCF worked with us to design a high-specification cluster and storage system that met our needs; they then delivered it and integrated it on time and to budget. In fact, the OCF-Fujitsu-Mellanox-DDN package was the clear winner from almost all perspectives – being based on GPFS, to which we were already committed; a winning price/performance combination; low power consumption; fast I/O; and simplicity of installation. We even managed to afford a pair of additional cluster nodes with 2TB of real memory each for really complex jobs – also through OCF-Fujitsu!”

    Contact Us

    HEAD OFFICE:
    OCF plc
    Unit 5 Rotunda, Business Centre,
    Thorncliffe Park, Chapeltown,
    Sheffield, S35 2PG

    Tel: +44 (0)114 257 2200
    Fax: +44 (0)114 257 0022
    E-Mail: info@ocf.co.uk

    SUPPORT DETAILS:
    OCF Hotline: 0845 702 3829
    E-Mail: support@ocf.co.uk
    Helpdesk: support.ocf.co.uk

    DARESBURY OFFICE:
    The Innovation Centre, Sci-Tech Daresbury,
    Keckwick Lane, Daresbury,
    Cheshire, WA4 4FS

    Tel: +44 (0)1925 607 360
    Fax: +44 (0)114 257 0022
    E-Mail: info@ocf.co.uk

    OCF plc is a company registered in England and Wales. Registered number 4132533. Registered office address: OCF plc, 5 Rotunda Business Centre, Thorncliffe Park, Chapeltown, Sheffield, S35 2PG
