Automated, Reliable, and Efficient Continental-Scale Replication of 7.3 Petabytes of Climate Simulation Data: A Case Study
arXiv (2024)
Abstract
We report on our experiences replicating 7.3 petabytes (PB) of Earth System
Grid Federation (ESGF) climate simulation data from Lawrence Livermore National
Laboratory (LLNL) in California to Argonne National Laboratory (ANL) in
Illinois and Oak Ridge National Laboratory (ORNL) in Tennessee. This movement
of some 29 million files, carried out twice to establish new ESGF nodes at ANL
and ORNL, was performed largely automatically by a simple replication tool: a
script that invoked Globus to transfer large bundles of files while tracking
progress in a database. Under the covers, Globus organized
transfers to make efficient use of the high-speed Energy Sciences Network
(ESnet) and the data transfer nodes deployed at participating sites, and also
addressed security, integrity checking, and recovery from a variety of
transient failures. This success demonstrates the considerable benefits that
can accrue from the adoption of performant data replication infrastructure.
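
The replication pattern described above (submit bundles of files to a transfer service, record progress in a database, and resume from that record) can be illustrated with a minimal sketch. This is not the authors' tool. The table schema, the status strings, the function names, and the `submit` and `check` callables are all hypothetical stand-ins, and the sketch deliberately avoids calling any real Globus API. It also assumes that a bundle whose transfer ultimately fails is simply queued again.

```python
import sqlite3
from typing import Callable, Dict, Iterable, List


def init_db(conn: sqlite3.Connection) -> None:
    # One row per bundle (e.g. a directory of files); schema is hypothetical.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bundles ("
        " path TEXT PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'PENDING',"
        " task_id TEXT)"
    )
    conn.commit()


def register_bundles(conn: sqlite3.Connection, paths: Iterable[str]) -> None:
    # INSERT OR IGNORE keeps registration idempotent across restarts.
    conn.executemany(
        "INSERT OR IGNORE INTO bundles (path) VALUES (?)",
        [(p,) for p in paths],
    )
    conn.commit()


def submit_pending(
    conn: sqlite3.Connection,
    submit: Callable[[str], str],
    max_active: int = 2,
) -> List[str]:
    # Keep at most max_active transfers in flight at once.
    active = conn.execute(
        "SELECT COUNT(*) FROM bundles WHERE status = 'ACTIVE'"
    ).fetchone()[0]
    slots = max(0, max_active - active)
    rows = conn.execute(
        "SELECT path FROM bundles WHERE status = 'PENDING' "
        "ORDER BY path LIMIT ?",
        (slots,),
    ).fetchall()
    submitted = []
    for (path,) in rows:
        task_id = submit(path)  # hypothetical hook into the transfer service
        conn.execute(
            "UPDATE bundles SET status = 'ACTIVE', task_id = ? WHERE path = ?",
            (task_id, path),
        )
        submitted.append(path)
    conn.commit()
    return submitted


def poll_active(conn: sqlite3.Connection, check: Callable[[str], str]) -> None:
    # check() is assumed to return "SUCCEEDED", "FAILED", or "ACTIVE".
    rows = conn.execute(
        "SELECT path, task_id FROM bundles WHERE status = 'ACTIVE'"
    ).fetchall()
    for path, task_id in rows:
        state = check(task_id)
        if state == "SUCCEEDED":
            conn.execute(
                "UPDATE bundles SET status = 'DONE' WHERE path = ?", (path,)
            )
        elif state == "FAILED":
            # Assumption: failed bundles go back in the queue for a retry.
            conn.execute(
                "UPDATE bundles SET status = 'PENDING', task_id = NULL "
                "WHERE path = ?",
                (path,),
            )
    conn.commit()


def summary(conn: sqlite3.Connection) -> Dict[str, int]:
    # Count bundles per status, for progress reporting.
    return dict(
        conn.execute("SELECT status, COUNT(*) FROM bundles GROUP BY status")
    )
```

In use, a driver loop would alternate `submit_pending` and `poll_active` until `summary` reports every bundle as `DONE`. Because all state lives in the database, the script can be stopped and restarted without losing track of which bundles have already been replicated.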