Towards reproducible software environments in HPC
Date:
Changed on 06/05/2020
Reproducibility is an important topic of scientific discussion. In computer science, more and more experiments rely on complex sets of software, so the ability to reproduce an experiment depends on the ability to reproduce its software environment. In this context, the National Science Foundation (NSF) in the United States now encourages the reproduction of experiments carried out by the HPC community, and journals such as Nature insist on the importance of sharing source code and supporting reproducibility. Some HPC conferences, such as SuperComputing, have directives on the subject. Finally, Inria's latest strategic plan devotes an entire chapter to the topic.
Alongside Guix, a package manager from the GNU project, there are two other types of tools: “traditional” package managers and “containers”.
Guix offers an alternative to both traditional package managers and containers. It can be used to reproduce a software environment without relying on the system's package manager. The Inria Bordeaux – Sud-Ouest research centre, the Max Delbrück Institute in Berlin and the University Medical Center in Utrecht are seeking to optimise this software for HPC. They are doing so by adding packages for the HPC software developed and used at each institution but also, and above all, by adding functionality that makes Guix easier to use on a computing cluster and that supports reproducible workflows.
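As a minimal sketch of what reproducing a software environment can look like in practice, the manifest below declares a set of packages in Guile Scheme, the configuration language used by Guix. The file name and the choice of packages are illustrative assumptions, not the software actually packaged by the three institutions.

```scheme
;; manifest.scm: a hypothetical environment declaration.
;; The package names are illustrative assumptions; replace them
;; with the software an experiment actually depends on.
(specifications->manifest
 '("gcc-toolchain"   ; compiler toolchain
   "openmpi"         ; MPI implementation
   "openblas"))      ; linear algebra library
```

Assuming a recent Guix release, running `guix describe -f channels > channels.scm` records the exact Guix revision in use, and `guix time-machine -C channels.scm -- shell -m manifest.scm` can later recreate the environment from those two files, whether on a workstation or on a cluster.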
Ricardo Wurmus, system administrator of the Scientific Bioinformatics Platform at the Max Delbrück Institute, uses Guix.
At the Inria Bordeaux – Sud-Ouest research centre, Ludovic Courtès, an engineer in the Experimentation and Development Section, is responsible for optimising the software for HPC. The work is supported by a technology development initiative from Inria. Its long-term objectives are for the software to meet the HPC requirements of the centre's research teams and to be compatible with computing clusters such as PlaFRIM, the one hosted and used at the Bordeaux centre.
The project is scheduled to last two years. By the end of that period, its initiators hope to have met the software reproducibility needs of their institutions. The wider objective is to convince other HPC decision makers of the advance that this approach represents.