CERN, the leading European scientific center, has expanded its data storage system to more than 1 million terabytes (TB) ahead of a new round of heavy-ion collision experiments. Total capacity now exceeds the exabyte (EB) mark, with most of the data stored on hard drives, although the use of flash drives is also increasing.
CERN explained that while increasing capacity is important, timely access to data is just as crucial. With the recent update, data reading speed has reached 1TB/s.
The update added 289 petabytes (PB) of storage capacity compared to last year, and it was implemented to support the latest round of heavy-ion experiments at CERN's Large Hadron Collider. The experiments collide heavy ions at close to the speed of light to study the fundamental building blocks of the universe.
These tests, taking place over several years at the particle ring accelerator near Geneva, Switzerland, are expected to generate a massive amount of data – over 600PB – which must be processed before being sent for long-term storage on magnetic tapes.
Thanks to high-capacity drives, even petabytes of data no longer take up much physical space: a petabyte of storage can now fit in a single rack. Storing an exabyte is a more complex task, however, requiring rows of racks filled with disk shelves.
The CERN storage system consists of approximately 111,000 devices, most of which are hard disks, with a growing number of flash drives. The system runs on EOS, an open-source platform developed by CERN specifically for use with the Large Hadron Collider and other scientific tasks.
Reaching 1 exabyte takes 100,000 disks with a capacity of 10TB each, and a storage system of this size did not come together overnight. Capacity has grown 56-fold from the initial 18PB storage system in 2010 and has more than doubled since 2020.
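The figures above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, assuming decimal (SI) units, where 1PB is 1,000TB and 1EB is 1,000,000TB, and counting raw capacity only, with no allowance for replication or redundancy. The function and variable names are illustrative and do not come from CERN.

```python
# Decimal (SI) storage units, expressed in terabytes (assumption: SI, not binary).
TB_PER_PB = 1_000
TB_PER_EB = 1_000_000


def disks_needed(target_tb: int, disk_tb: int) -> int:
    """Number of disks of size disk_tb needed to hold target_tb, rounded up."""
    return -(-target_tb // disk_tb)  # ceiling division on integers


def growth_factor(current_tb: float, initial_tb: float) -> float:
    """How many times larger current_tb is than initial_tb."""
    return current_tb / initial_tb


# 1 EB built from 10TB disks (raw capacity, no replicas).
disks_for_exabyte = disks_needed(TB_PER_EB, 10)

# Growth from the 18PB system of 2010 to roughly 1 EB today.
growth_since_2010 = growth_factor(TB_PER_EB, 18 * TB_PER_PB)

print(disks_for_exabyte)            # 100000
print(round(growth_since_2010, 1))  # 55.6
```

The growth factor comes out at about 55.6, which matches the reported 56-fold increase once rounded. Because the data is replicated for safety, the usable capacity is lower than the raw total this calculation assumes.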
With over 100,000 disks, drive failures are a common occurrence. Previously, CERN had to replace 30 disks per week, which requires careful planning and data replication.