RES Uses Supercomputer to Harness Wind

Renewable energy developer Renewable Energy Systems Group (RES) is using a new high performance cluster and big data storage system to support the design, build and operation of wind farms globally. The system went live in March 2014.

The cluster, which is seven times more powerful than its predecessor, is enabling analysts to work more efficiently on tasks such as wind resource mapping (for turbine placement), historical time series generation, project design refinement and wind turbulence modelling using computational fluid dynamics (CFD). The cluster uses Cray Intel servers designed, installed and maintained by high performance data processing, management, storage and analytics provider OCF.

The technical team at RES can also now create longer mesoscale simulations, for example reviewing 10 years’ worth of wind data for a specific local area, as opposed to just one year’s worth of data using the old cluster. This improves the accuracy of wind turbine placement and enables more detailed energy yield analysis. The team can also experiment with more complex CFD models. Previously, analysts had to pick the days that would act as a long-term representation of conditions; they are now able to model all environmental elements, making fewer assumptions and more accurate decisions.

Due to the increase in the cluster's performance, the number of analysts able to use it has risen from three or four to around 50. The analysts are based in the UK and internationally.

“We had a cluster before which was very helpful and very successful in supporting our work,” says Clément Bouscasse, Forecasting and Flow Modelling Manager, RES Group. “As our business grew, we found ourselves using the cluster more and more. After 4 years, we outgrew it. We chose Cray’s hardware and, in turn, Cray directed us to OCF for the design and build. Due to power and cooling constraints in our main site, we worked with OCF to build the cluster in an off-site data centre.”

The cluster is supported by 128TB of high capacity, big data storage, built using Boston SATA storage connected via 10GbE to the main data network.

Clément adds: “In our new high performance storage system we can store the files analysts need access to on a daily basis. We generate a lot of data – but fortunately after processing, file sizes reduce from potentially Terabytes [TBs] to maybe only a few hundred Megabytes. We’re making every effort to reduce the size of files we produce and trying to be extra efficient in the post process. Despite this, we’ve already generated a few TBs in just a couple of months.”

RES also has access to two IBM Ultrium 6-drive Autoloader tape libraries, which it is using for data backup purposes. Clément continues: “In the first instance, we’re using the tape library as a back-up system. In the future, we hope to also use it as an efficient archiving solution, which is key for us since the cluster is hosted off site. This should limit the number of required visits to the data centre.”


High Performance Computing and Big Data Help Sequence Infectious Diseases

Integration by OCF helps to analyse sequences of pathogen DNA samples to provide public service

Public Health England (PHE), an executive agency of the Department of Health, is using a new big data storage system based on DataDirect Networks (DDN) storage alongside an existing high-performance server cluster to enable faster and more effective analysis of genome sequences. These sequences are then used in PHE activities for diagnostics and surveillance of infectious diseases.

The implementation, configuration and integration of the components of the big data system and cluster have been supported by big data processing, management, storage and analytics provider OCF.

In December 2012, Prime Minister David Cameron announced the ‘100,000 Genomes Project’, in which the personal DNA code of up to 100,000 patients, or infections in patients, will be decoded. The Department of Health prioritised a number of areas, with infectious disease sequencing undertaken by PHE. PHE has laboratories across England receiving thousands of biological samples per week from patients with unidentified and potentially aggressive pathogens that need urgent identification. This project supports PHE’s goal of being a leader in the adoption of genomics in clinical microbiology to support public health interventions in a quicker and more cost-effective manner.

PHE uses Illumina sequencing machines (HiSeq, MiSeq) to generate DNA sequence data from diverse pathogen samples (bacteria and viruses). A high performance computing cluster, integrated in July 2013 by OCF, is used to assemble and analyse genetic information to provide accurate diagnostics and rapid identification of outbreaks, thereby helping patients and delivering public health interventions more effectively.

Specifically, the cluster parallelises the analysis of generated sequences, significantly reducing the time taken to analyse hundreds of genomes to as little as a couple of hours (or less), compared to many hours on a normal workstation where the analysis is done sequentially.

PHE is also now using 300TB of high performance DDN™ SFA® storage integrated by OCF. PHE keeps data for around 3-4 months, enabling numerous researchers to analyse data sets simultaneously. The data is then tiered off to a DDN storage archive and also made available for sharing with clinical partners and other research organisations.
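
As a rough illustration of the per-sample parallelisation described above, the sketch below runs the same analysis function over several samples, first one after another and then through a pool of workers. It is a minimal sketch only, not PHE's pipeline: the function names, the sample names and the toy GC-content calculation are all assumptions standing in for a real genome analysis step.

```python
from concurrent.futures import Executor, ProcessPoolExecutor


def gc_content(sequence: str) -> float:
    """Toy stand-in for a per-sample analysis: fraction of G and C bases."""
    if not sequence:
        return 0.0
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def analyse_sequential(samples: dict[str, str]) -> dict[str, float]:
    """Workstation-style run: each sample is analysed one after another."""
    return {name: gc_content(seq) for name, seq in samples.items()}


def analyse_parallel(samples: dict[str, str], executor: Executor) -> dict[str, float]:
    """Cluster-style run: samples are independent, so they can be
    handed to a pool of workers and analysed at the same time."""
    names = list(samples)
    # Executor.map returns results in the same order as its inputs.
    results = executor.map(gc_content, (samples[name] for name in names))
    return dict(zip(names, results))


if __name__ == "__main__":
    # Hypothetical sample names and sequences, for illustration only.
    samples = {
        "sample_a": "ATGCGCGT",
        "sample_b": "ATATATAT",
        "sample_c": "GGGGCCCC",
    }
    with ProcessPoolExecutor(max_workers=3) as pool:
        parallel = analyse_parallel(samples, pool)
    # Both approaches give the same answers; only the elapsed time differs.
    assert parallel == analyse_sequential(samples)
    print(parallel)
```

Because each sample is independent of the others, the elapsed time falls roughly in proportion to the number of workers available, which is why adding compute nodes increases the achievable parallelism.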

The system uses DDN, HP and IBM hardware; open source software including Linux and xCAT; and commercial software and technical support from Univa and Red Hat. The system includes:
• HP BladeSystem c7000 with 16 HP BL460c Gen8 server blades
• IBM x3650 nodes for data management services
• To support massive performance and data growth requirements, OCF installed DDN SFA storage and an EXAScaler™ appliance with Lustre® File System
    o Configured usable capacity: 300TB (7x 8+P+P)
    o Configured performance: 2.5GB/s (OSS-Storage Capability)
    o Maximum performance: 6GB/s

The future

In the coming weeks PHE will expand the high-performance IBM cluster again with another 16 compute nodes, further increasing the possible parallelisation in the analysis of generated sequences. PHE is also expanding its archiving storage capacity with DDN WOS® cloud storage (with 250TB of capacity) and implementing the open-source data grid software iRODS to help organise, share, protect, and preserve scientific data. This additional system will also enable:
• Creation of a private cloud environment where researchers can access geographically dispersed and replicated file data using the fastest nodes (normally the data source), not necessarily the nodes closest to their location, improving researchers’ productivity
• Simultaneous access to data, increasing collaboration amongst researchers
• Addition of metadata to standard file system data, enabling researchers to search, browse and retrieve data more quickly


    Contact Us

    OCF plc
    Unit 5, Rotunda Business Centre,
    Thorncliffe Park, Chapeltown,
    Sheffield, S35 2PG

    Tel: +44 (0)114 257 2200
    Fax: +44 (0)114 257 0022

    OCF Hotline: 0845 702 3829

    The Innovation Centre, Sci-Tech Daresbury,
    Keckwick Lane, Daresbury,
    Cheshire, WA4 4FS

    Tel: +44 (0)1925 607 360
    Fax: +44 (0)114 257 0022

    OCF plc is a company registered in England and Wales. Registered number 4132533. Registered office address: OCF plc, 5 Rotunda Business Centre, Thorncliffe Park, Chapeltown, Sheffield, S35 2PG
