Telephone: 0114 2572 200

Imperial College London uses new Half PB storage to support research and e-learning

A new highly scalable storage system is supporting staff, students and researchers across multiple departments at Imperial College London. Designed and implemented by OCF, the centrally funded half-petabyte system enables secure, cost-effective storage of the vast amounts of research data produced at the College, whilst underpinning a Panopto video capture system that records lectures from up to 100 separate theatres on a daily basis.

The new Linux-based IBM GPFS storage system sits across two separate data centres. It replaces a legacy Microsoft solution that had grown on an ad-hoc basis and become increasingly difficult to expand and administer.

Alongside the requirement for long-term, cost-effective, secure storage of research data, e-learning is a major area of focus. The College therefore needed a solution capable of storing captured and encoded video recordings of lectures and making them available for editing and viewing by potentially 20,000 students. Multiple lectures can be recorded in parallel on the system, then saved and encoded with a turnaround time of less than one day from raw video capture to encoded version. Uptime was also a key factor – the College wanted a solution that would remain online 24/7 to support students.

The new system comprises 8x IBM x3650 M4 GPFS clustered storage and NAS servers, along with 4x IBM Storwize V3700 storage arrays. It provides increased uptime, greater resilience and better responsiveness, and incorporates a number of disaster recovery features to protect against data loss. In the event of data loss or corruption, users can also recover previous copies of their data from up to 14 days earlier themselves, saving the user time and significantly reducing the burden on IT administrators.

“We wanted a system that could grow and scale more easily and protect large data sets more effectively and efficiently,” said Steven Lawlor, ICT Datacentre Services Manager at Imperial College London. “Our old system no longer scaled and was becoming a bit of a beast. The new system will be easier to expand and more cost effective, in part because we can now move data to tape for long-term storage, a key requirement for certain types of research data. Uptime is one of the most important factors; we wanted a system that could also be maintained and expanded whilst still running.”

Julian Fielden, Managing Director of OCF, comments: “We’re seeing a massive explosion of data growth. Academia is feeling the pressure of keeping up with storage demands, especially around research data. Imperial College London’s real challenge was around storing the lecture capture side. Research data can be broken up and separated, whereas its video capture system was one large system across the College. By developing a scalable storage solution based on GPFS they are now able to grow and expand storage in a simple and cost-effective manner.”

Imperial College London has already migrated all of its video content to GPFS. It is currently migrating research datasets from the Microsoft environment to the new Linux-based storage system.

OCF joins the OpenPOWER Foundation

OCF will contribute to innovations across the full software and hardware POWER stack

OCF has joined the OpenPOWER Foundation, an open development community based on the POWER microprocessor architecture, as a silver member.

The OpenPOWER Foundation is an open, not-for-profit technical community created to enable today’s data centres to rethink their approach to technology. The aim of the Foundation is to create a collaborative ecosystem for members to share expertise, investment, and server-class intellectual property to serve the evolving needs of customers and industry.

OCF joins a growing list of technology organisations working to build and develop advanced server, networking, storage and acceleration technology, as well as industry-leading open source software. Aimed at delivering more choice, control and flexibility to developers of next-generation, hyperscale and cloud data centres, the group makes POWER hardware and software available for open development for the first time. The Foundation also makes POWER intellectual property licensable to others, greatly expanding the ecosystem of innovators on the platform.

“We are not only excited about the technology but also the momentum the community is gathering from its members including Google, IBM, Mellanox, the Science & Technology Facilities Council (STFC) and Universities from across the world, who are working collaboratively to drive innovation in this technology,” says Andrew Dean, OCF business development manager.

“OCF are proud to be associated with the OpenPOWER Foundation,” adds OCF managing director Julian Fielden. “As members of the foundation it will improve OCF’s understanding of what is being achieved within the OpenPOWER community and will enable both OCF and its customers to take advantage of the technology. Using our extensive experience of the UK Research and High Performance sector, it will also give us the opportunity to feed our knowledge and skills back into the community.”

“The development model of the OpenPOWER Foundation is one that elicits collaboration and represents a new way in exploiting and innovating around processor technology,” says Brad McCredie, OpenPOWER President and IBM Fellow. “With the POWER architecture designed for Big Data and Cloud, new OpenPOWER Foundation members like OCF will be able to add their own innovations on top of the technology to create new applications that capitalise on emerging workloads.”
To learn more about OpenPOWER and to view the complete list of current members, visit

New Cluster for Statistical Genetics Research at the Wellcome Trust Centre

New cluster propels Research Centre’s electron microscopy research into world league

The Wellcome Trust Centre for Human Genetics (WTCHG) at the University of Oxford is using a new Fujitsu BX900 blade-based high-performance cluster, with Mellanox InfiniBand and DataDirect Networks storage systems, integrated by OCF, to support the genetics research of 25 groups and more than 100 researchers.

The Centre houses the second largest next-generation sequencing facility in England, currently producing more than 500 genomes per year. Each processed and compressed genome is about 30GB on disk, and across the Centre roughly 15,000-20,000 human genomes occupy about 0.5PB. Numerous and wide-ranging research projects use this data to study the genetic basis of human diseases through sophisticated statistical genetics analyses. Projects include national and international studies on various cancers, type-2 diabetes, obesity and malaria, as well as analyses of bacterial genomes to trace the spread of infection. The Centre is one of the most highly ranked research institutes in the world, and funds for the cluster were provided by a grant from the Wellcome Trust.
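As a rough consistency check on the storage figures quoted above, the short sketch below multiplies the per-genome size by the stated genome counts. It assumes decimal units (1PB = 1,000,000GB) and a flat 30GB per genome; the function name is purely illustrative and does not come from the Centre.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units, 1PB = 1,000,000GB


def genome_storage_pb(genome_count, gb_per_genome=30):
    """Estimate disk usage in PB for a number of compressed genomes."""
    return genome_count * gb_per_genome / GB_PER_PB


low = genome_storage_pb(15_000)   # lower end of the quoted range
high = genome_storage_pb(20_000)  # upper end of the quoted range
print(f"{low:.2f}PB to {high:.2f}PB")  # prints "0.45PB to 0.60PB"
```

The result brackets the "about 0.5PB" quoted for the Centre's genome collection.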

By understanding the characteristics of key genetics software applications and optimising how they map onto the new cluster’s architecture, the Centre has been able to dramatically improve the efficiency of these analyses. For example, analyses of data sets that took months using the Broad Institute’s Genome Analysis Toolkit (GATK) can now be completed in weeks while using fewer cores.

The new cluster has also proved itself to be perfectly suited to supporting research by the Centre’s Division of Structural Biology (STRUBI) and it has already produced some of the world’s highest-resolution electron microscopy reconstructions – revealing structural details vital to understanding processes such as infection and immunity. The improvement in the performance of electron microscopy codes, particularly Relion, is also very impressive: movie-mode processing requiring more than 2 weeks on eight 16-core nodes of a typical cluster is now completed in 24 hours on just six of the new FDR-enabled, high-memory nodes.
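To put the Relion figures in perspective, the sketch below compares the two runs by wall-clock time and by node-hours. It treats "more than 2 weeks" as exactly 14 days, so both ratios are lower bounds, and it ignores differences between node types; the helper name is illustrative only.

```python
def node_hours(nodes, hours):
    """Total node-hours consumed by a job."""
    return nodes * hours


old_hours = 14 * 24  # assumption: "more than 2 weeks" taken as 14 days
new_hours = 24

old_cost = node_hours(nodes=8, hours=old_hours)  # typical cluster
new_cost = node_hours(nodes=6, hours=new_hours)  # new high-memory nodes

wall_clock_speedup = old_hours / new_hours
node_hour_ratio = old_cost / new_cost

print(f"wall-clock speedup: at least {wall_clock_speedup:.1f}x")
print(f"node-hour reduction: at least {node_hour_ratio:.1f}x")
```

On these assumptions the job finishes at least 14 times sooner while consuming under a tenth of the node-hours.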

“Advances in detector design and processing algorithms over the past two years have revolutionised electron microscopy, making it the method of choice for studying the structure of complex biological processes such as infection. However, we thought we could not get sufficient access to the necessary compute to exploit these advances fully. The new genetics cluster provided such a fast and cost-effective solution to our problems that we invested in expanding it immediately,” said Professor David Stuart of Oxford University.

The new cluster’s use of Intel Ivy Bridge CPUs provides a 2.6x performance increase over its predecessor built in 2011. It boasts 1,728 cores of processing power, up from 912, with 16GB 1866MHz memory per core compared to a maximum of 8GB per core on the older cluster.

The new cluster is working alongside a second production cluster; both clusters share a common Mellanox FDR InfiniBand network that links the compute nodes to a DDN GRIDScaler SFA12K storage system whose controllers can read block data at 20GB/s. This speed is essential for keeping the cluster at maximum utilisation and consistently fed with genomic data.

The high-performance cluster and big data storage systems were designed by the WTCHG in partnership with OCF, a leading HPC, data management, big data storage and analytics provider. As the integrator, OCF also provided the WTCHG team with training on the new system.

Dr Robert Esnouf, Head of the Research Computing Core at the WTCHG says:

  • “Processing data from sequencing machines isn’t that demanding in terms of processing power any more. What really stresses systems are ‘all-against-all’ analyses of hundreds of genomes, that is lining up multiple genomes against each other and using sophisticated statistics to compare them and spot differences which might explain the genetic origin of diseases or susceptibility to diseases. That is a large compute and data I/O problem and most of our users want to complete this type of research.
  • Each research group can use their own server to submit jobs to, and receive results from, the cluster. If it runs on the server it can easily be redirected to the cluster. Users don’t need to log on directly to the cluster or be aware of other research groups using it. We try to isolate groups so they don’t slow each other down and have as simple an experience as possible. Users have Linux skills, but they do not need to be HPC experts to use the system safely and effectively. It is a deliberate design goal.
  • We use DDN GRIDScaler SFA12K-20 for our main storage – that has 1.8PB raw storage in one IBM General Parallel File System [GPFS] file system. We have learnt a lot about GPFS and how to get codes running efficiently on it. The support team and technical documentation showed how we could exploit the local page pool (cache) to run our codes much more quickly. Our system serves files over the InfiniBand fabric into the cluster at up to 10GB/s (~800TB/day). The SFA12K is already >80% full, so we’re now offloading older data to a slower, less expensive disk tier to get maximum return on the SFA12K investment.
  • Along with the 0.5PB storage pool on the sequencing cluster and other storage we now squeeze ~5PB storage and 4000 compute cores into 10 equipment racks in a small converted freezer room. Despite its small footprint, it is one of the most powerful departmental compute facilities in a UK university.
  • OCF worked with us to design a high-specification cluster and storage system that met our needs; they then delivered it and integrated it on time and to budget. In fact, the OCF-Fujitsu-Mellanox-DDN package was the clear winner from almost all perspectives – being based on GPFS, to which we were already committed; winning price / performance combination, low power consumption, fast I/O and simplicity of installation. We even managed to afford a pair of additional cluster nodes with 2TB real memory each for really complex jobs – also through OCF-Fujitsu!”
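The daily throughput quoted by Dr Esnouf follows directly from the sustained rate. The sketch below converts a rate in GB/s into TB per day, assuming decimal units (1TB = 1,000GB) and a rate held constant for 24 hours; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day(gb_per_second):
    """Convert a sustained rate in GB/s into TB per day (decimal units)."""
    return gb_per_second * SECONDS_PER_DAY / 1000


daily_tb = tb_per_day(10)
print(f"{daily_tb:.0f}TB/day")  # prints "864TB/day"
```

A sustained 10GB/s therefore gives 864TB/day, in line with the rounded "~800TB/day" figure in the quote.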
