Sustainable Packaging of Quantum Chemistry Software with the Nix Package Manager

The installation of quantum chemistry software packages is commonly done manually and can be a time-consuming and complicated process. An update of the underlying Linux system requires a reinstallation in many cases and can quietly break software installed on the system. In this paper, we present an approach that allows for an easy installation of quantum chemistry software packages, which is also independent of operating system updates. The use of the Nix package manager allows building software in a reproducible manner, which allows for a reconstruction of the software for later reproduction of scientific results. The build recipes that are provided can be readily used by anyone to avoid complex installation procedures.


I. INTRODUCTION
Open-source quantum chemistry program packages are usually compiled manually on a workstation, single compute node, or a high-performance computing system. This process can be time-consuming and complex, especially when it has to be carried out for multiple programs.
Such a manual installation is in general hard to replicate, unless its preparation and the use of configuration flags have been meticulously documented. A major problem with such an approach is that the program package will depend on operating system libraries, unless it has been linked completely statically. As a consequence, an update of the operating system or any other dependency may quietly break the software package and eventually make a rebuild necessary.
Another issue that arises from such a manual build strategy is the fact that scientific results cannot be exactly reproduced. To rebuild an old version in exactly the same way, one also needs the exact states of all dependencies. This problem can in principle be solved by container solutions [47], such as Docker [46] or Singularity [42], but this only works as long as the container image is preserved. Another downside of container solutions is that they usually ship with numerous system libraries which need to be updated or reproduced in case of an image rebuild.
* markus.kowalewski@fysik.su.se † phillip.seeber@uni-jena.de
The later reproduction of scientific results that were obtained via a computer program requires not only following the computational procedure, but also the exact same version of the program. This can only be guaranteed if one is able to reproduce the executable of the program. Package managers that aim at creating reproducible build environments are Nix [29] and Guix [7,27]. These package managers are built around functional languages, which are used to create build recipes.
Such a build recipe is represented by a functional expression with inputs (e.g., other packages) and outputs (a path with the build product), which are tracked with cryptographic hashes. This approach to package management makes it possible to uniquely identify a particular build of a program and provides all prerequisites to accurately reproduce the build at a later point in time. Exact binary reproducibility is difficult to achieve with traditional package managers, but can be achieved with the mechanisms provided by Nix.
In this paper, we show how the Nix package manager can be used to manage software in a sustainable and reproducible way. Our approach focuses mainly on quantum chemistry software packages, but can be applied to any software. We will introduce the NixOS-QChem overlay, which is an add-on to the nixpkgs collection for integrating quantum chemistry software into the nixpkgs environment and providing optimized versions of the packages. The nixpkgs collection is a good starting point since it already provides ≈60,000 packages. The add-on repository provides more build recipes for open source and proprietary software packages.
The paper is organized as follows. In sec. II we give a general overview of the Nix package manager and its features. In sec. III we describe the approach to integrate quantum chemistry software packages along with a list of packaged software (sec. III A), followed by a set of examples in sec. III B. In sec. IV we discuss how our approach compares to other packaging methods.

II. THE NIX PACKAGE MANAGER
The Nix package manager [29] is built around the Nix functional language [10], and a set of packages can be represented by a set of functions, which eventually evaluate to file system paths. A build recipe is called a Nix expression in the Nix terminology. Such Nix expressions are functions that evaluate to derivations, which uniquely describe the build process. Nix expressions can have one or more inputs and one or more outputs. The inputs are commonly other packages that provide dependencies, such as libraries. The output is a path to a final build product, such as the binary files. The name of the output path is derived from a hash function over the derivation itself and all its inputs, which creates a unique path name.
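To make the terminology above concrete, the following is a minimal sketch of a Nix expression. The package name (mytool), the URL, and the choice of dependencies are hypothetical and serve only as an illustration; the hash is a placeholder that has to be replaced by the real hash of the source archive before the build can succeed.

```nix
# default.nix: minimal sketch of a Nix expression for a hypothetical package.
# The package name, URL, and dependency list are illustrative assumptions.
{ pkgs ? import <nixpkgs> { } }:

pkgs.stdenv.mkDerivation {
  pname = "mytool";
  version = "1.0";

  # Input: the source archive, identified by a cryptographic hash.
  src = pkgs.fetchurl {
    url = "https://example.org/mytool-1.0.tar.gz"; # hypothetical URL
    hash = pkgs.lib.fakeHash; # placeholder; replace with the real hash
  };

  # Inputs: other packages from nixpkgs that provide libraries.
  buildInputs = [ pkgs.fftw pkgs.blas pkgs.lapack ];
}
```

The function takes the package set as its input and evaluates to a derivation. Once a real source and hash are supplied, building it yields an output path of the form /nix/store/<hash>-mytool-1.0, and changing any input, including one of the libraries, leads to a different hash and therefore a different path.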
The Nix package manager stores its packages (i.e., its output paths) under a fixed path in the file system, /nix/store, which is simply called the Nix store. Packages are not allowed to refer to dependencies outside the Nix store, thus avoiding dependencies on the system software. All builds that are stored in the Nix store are immutable and cannot be changed after a build is completed, thus guaranteeing full stability of a given package. Nix is, by design, a package manager which builds packages from source. If the output of a build is not available in the local Nix store or in a remote binary cache, it will be built from source. This means that if a dependency of a package changes, the package will be rebuilt and potential problems with the update are either avoided or uncovered.
Note that the Nix approach differs substantially from the use of containers, which only statically bundle dependencies but provide no further mechanism to update, rebuild and maintain the contents of a container.
The second important component, besides Nix, is the nixpkgs package collection [11], which provides over 64,000 packages in the form of Nix expressions [16] (including several quantum chemistry programs) and provides the basis for our work. The packages provided by the nixpkgs package collection are also available in the form of binaries through a binary cache and thus do not need to be built from source by the end-user. This package set can be extended by the user with the help of overlays, which allow adding, modifying, or replacing packages. We will introduce the NixOS-QChem overlay in sec. III, and show how it is used to add and optimize packages.
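As an illustration of the overlay concept mentioned above, the following sketch applies a small overlay while importing nixpkgs. The overlay is not part of NixOS-QChem; it is an assumed example that modifies the GNU hello package from nixpkgs by disabling its test phase.

```nix
# Sketch: applying an overlay when importing nixpkgs.
# The overlay below is an illustrative assumption, not taken from NixOS-QChem.
let
  myOverlay = self: super: {
    # Modify an existing package: build GNU hello without its test phase.
    hello = super.hello.overrideAttrs (old: { doCheck = false; });
  };

  # The overlay is passed to nixpkgs at import time.
  pkgs = import <nixpkgs> { overlays = [ myOverlay ]; };
in
  pkgs.hello
```

Here super refers to the package set before the overlay is applied and self to the final package set. Every package that depends on hello would pick up the modified version automatically.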

A. Nix in an HPC environment
In this section, we address the challenges that arise in a high performance computer cluster environment and how they can be addressed with Nix.
Environment modules [6] are an approach commonly used by supercomputing centers and on scientific computer clusters to create on-demand environments for users. A module sets environment variables pointing to the requested package paths upon a module load <package> call. However, a significant shortcoming of this approach is that it does not track dependencies between modules or any dependencies on system libraries. As a consequence, even a minor system update, addressing only security updates, may silently break installed packages or software compiled by a user.
The Nix package manager explicitly addresses these shortcomings. Users can choose to build their own packages with Nix or use a centrally provided package set.
These user builds are independent of the operating system's software or centrally installed packages. The immutability of the Nix store guarantees that a dependency on an existing package (Nix store path) can never be quietly replaced. This allows users to pin a package and all its dependencies to a fixed version, providing stability and reproducibility of the binaries. When a dependent derivation is replaced or upgraded, it forces the rebuild of all derivations which depend on it, ensuring a valid build. The Nix package manager also provides an easy path for portability: packages can be transferred between different compute cluster systems (assuming that the Nix package manager is installed on both systems) either as binary or by automatically rebuilding the required packages from their source. Nix store paths and their dependencies can be transferred between machines by means of a custom copy command (nix copy --[to|from] <machine> <store path>).
Note that a proper installation of the Nix package manager requires administrator rights and thus has to be carried out by a system administrator. The Gricad facility in Grenoble has demonstrated [26] how the Nix package manager can be used on a computer cluster with a shared Nix store. Nix has also been used, for example, at CERN to distribute software for LHCb [25].

NIXPKGS
To customize nixpkgs for use with quantum chemistry software packages, we make use of the overlay mecha-

The nixpkgs repository provides a simple mechanism to switch between libraries, either on a per-package basis or globally, for the whole package set. One example is the message passing interface system (MPI) [32], which is provided by different implementations [12]. The default implementation is OpenMPI [34] but it can be readily replaced by the overlay mechanism. The following example shows the Nix code for an overlay that replaces OpenMPI globally with MVAPICH [50] and builds the CP2K [43] package explicitly with MPICH [37]:

    self: super: {
      mpi = super.mvapich;
      cp2k = super.cp2k.override { mpi = self.mpich; };
    }

Linear algebra libraries, such as BLAS and LAPACK, can be replaced in a similar way. Nixpkgs has a wrapper for BLAS and LAPACK [21,24], which provides custom libraries through the standard interface. The default implementation is OpenBLAS [14], but Intel's MKL [8] or AMD's blis/libFlame [3] are also available. The following example demonstrates how an overlay can be used to replace BLAS and LAPACK with MKL:

• default.nix: the base of the overlay
• cfg.nix: defines all configuration options for the overlay
• nixpkgs-opt.nix: defines all packages from the nixpkgs collection that are projected into the qchem subset and are subject to processor-dependent optimisations.
• tests/: folder with tests for various packages.
• examples/: folder with examples showing different configuration scenarios.
• pkgs/: contains sub folders with Nix expressions for additional packages.
• install.sh: installs Nix, nixpkgs and the NixOS-QChem overlay.

the DebiChem team often provides valuable knowledge and patches, the architecture and philosophy of Debian packaging prevent clean isolation and tight integration between packages. Note that traditional package managers such as the Debian package manager are meant to be operated by system administrators and thus provide no straightforward way for end-user installations on a shared computer cluster.

B. Usage examples
We will outline the basic installation procedure of the Nix package manager and the overlay for a simple setup on a single machine. For the setup on a compute cluster, we refer to the setup of the Gricad team [26] for a shared Nix store. A multiuser installation of Nix can be obtained with the following commands:

    # To be executed by an admin
    # Multi-user installation of Nix.
    # Will request root privileges for the initial setup.
    sh$> curl -L https://nixos.org/nix/install | sh -s -- --daemon

These commands will install Nix in multiuser mode; the Nix daemon will listen for evaluation requests from the Nix commands and execute builds or download the store paths from a binary cache.
The packages in NixOS-QChem can be accessed with different methods. We will discuss two main methods here: as a direct system-wide or user-installed overlay to the nixpkgs channel, and explicit use as a project-based package source. The first method allows for a direct use of the latest package versions, while the second method allows fixing the version on a per-project basis. Other options to access NixOS-QChem overlay packages, which we will not discuss here further in detail, are the Nix user repositories (NUR) [13]

d. Project-Based Calculation Environment with Fixed Versions
Computational environments that are associated with a specific project can strongly benefit from fixing all package versions in a custom environment. Projects can use different versions or variations of programs without interfering with a system-level package set. Such a computational environment can in principle be defined in a single Nix file and thus be easily shared between coworkers. Fixing all program versions in such an environment also allows reproducing its results at a later point in time. Such an environment can be described by a shell.nix file, which defines an environment for a nix-shell. To achieve reproducibility, the versions of nixpkgs and NixOS-QChem must be fixed:

This mechanism can also be used to write SLURM (or other resource management system) scripts for working in a computer cluster environment. Such batch scripts can either pull in packages via nix-shell's -p option or can be combined with a project-associated shell.nix file, reusing the environment from the project's shell.nix:

    #!/usr/bin/env nix-shell
    #!nix-shell /path/to/project/shell.nix -i bash

These scripts can also be easily transferred between Nix-enabled computing centers.
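A project environment with fixed versions, as described above, might look like the following sketch of a shell.nix file. It rests on several assumptions: nixpkgs is pinned to the release tag 21.05 rather than to a specific commit, the NixOS-QChem overlay is left out to keep the example short, and the selected Python packages are arbitrary.

```nix
# shell.nix: sketch of a project environment with a fixed nixpkgs version.
# Assumption: the nixpkgs release tag 21.05 serves as the pinned version;
# the NixOS-QChem overlay is omitted here to keep the example short.
let
  nixpkgsSrc = builtins.fetchTarball
    "https://github.com/NixOS/nixpkgs/archive/21.05.tar.gz";
  pkgs = import nixpkgsSrc { };
in
pkgs.mkShell {
  # Python interpreter bundled with the packages needed by the project.
  buildInputs = [
    (pkgs.python3.withPackages (p: [ p.numpy p.matplotlib ]))
  ];
}
```

Running nix-shell in the directory that contains this file opens a shell in which python3 provides numpy and matplotlib in the versions shipped with that nixpkgs release. For stricter pinning, a commit hash can be used in the URL instead of a tag, and builtins.fetchTarball also accepts an attribute set with url and sha256 so that the downloaded archive is verified.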

IV. COMPARISON WITH OTHER SOLUTIONS
Several approaches for managing software environments in high-performance computing have emerged, each offering specific advantages or disadvantages that are inherited from its fundamental design. In the following, we will compare the Nix packaging approach to other common approaches for managing software environments in high-performance computing. The first category comprises distribution-based package managers, such as the Debian package manager, Pacman for Arch Linux, and the Red Hat package manager. Their main purpose is to provide software packages for a system-wide installation.
A second approach is Spack, a solution explicitly de-

This can lead to increased storage requirements. Nix packages are handled by a central daemon that controls access to the system-wide Nix store and executes builds on demand. This makes Nix an inherently multiuser solution and allows every user to build software, while Nix is intrinsically aware of shared dependencies. Customized Nix packages can be made available by system administrators or can be created by the user.

The presented approach is focused on applications for the theoretical chemistry community, but the general principle is of broad applicability. We think that many scientific applications would benefit from the Nix ap-

We encourage users and developers of scientific software to contribute to NixOS-QChem and nixpkgs as well as to report bugs. It would be a great advantage if more computing facilities adopted the approach and provided a Nix installation to allow for more reproducible compute environments.

VI. ACKNOWLEDGMENTS
The authors would like to thank the nixpkgs community for providing the software infrastructure that made this work possible. Phillip Seeber gratefully acknowledges the financial support provided by the German Research Foundation within the TRR CATALIGHT - Projektnummer 364549901 - TRR234 (project C5).
nism, which allows us to extend and modify the package set provided by nixpkgs. Note that many scientific libraries and some quantum chemistry packages are already packaged in nixpkgs. These packages can be used directly after the installation of the Nix package manager. Our overlay, NixOS-QChem [1], is thus tightly coupled to nixpkgs. The overlay serves multiple purposes: it selects quantum chemistry related software packages for optimization and adds additional quantum chemistry software packages that are not available in nixpkgs. The overlay also serves as an incubator for new packages that first need to mature with respect to their integration into the nixpkgs environment. This includes packages that have non-standard build systems and are thus more difficult to integrate. The aim is to integrate a useful variety of quantum chemistry packages into the nixpkgs collection and to maintain a high code quality in accordance with the corresponding nixpkgs guidelines. NixOS-QChem focuses on providing packages for the x86-64 CPU architecture on the Linux platform, as this is currently the most common architecture for scientific high-performance computing [17]. The overlay also provides optional performance optimizations, which make use of modern x86-64 processors and are not provided by nixpkgs itself for compatibility reasons. The optimizations allow for setting custom compiler flags and for automatically selecting optimization flags provided by individual packages. All packages provided through the overlay are projected into a package subset (name prefix: qchem), which also allows optimizing basic libraries, such as the fftw library [33], without causing the rebuild of non-scientific software packages. Open source packages can be downloaded automatically from the internet, but proprietary packages which require a license need to be provided by the user. For these cases, the overlay also provides a mechanism which allows for downloading from a custom, internal location. As a result, NixOS-QChem can provide Nix expressions (and builds) for commercial packages, such as Turbomole, Molpro and others, as well as packages that require user registration, such as CFour, MRCC, and ORCA [45,48,61]. While such packages often avoid the hurdles of compilation, packaging them enhances their composability. Composing different major software packages in a single, coherent environment often proves difficult: the problems range from different providers of MPI and BLAS/LAPACK implementations for different codes and name conflicts in a global $PATH (e.g. libblas.so, mpiexec, ...) to different version constraints on the same dependencies (e.g. different version constraints on numpy [38] in different Python packages that cannot be fulfilled simultaneously). In those cases correct behaviour can become dependent on detailed choices, such as the order in which different environment modules are loaded. For a selected set of packages, we have implemented automated tests in the overlay, which ensure that the basic functionality of a package is still given after an update or a rebuild. These tests are less comprehensive than the test suites provided by individual quantum chemistry packages, but aim at uncovering potential problems in connection with dependencies that have been observed during the integration.
    self: super: {
      blas = super.blas.override { blasProvider = self.mkl; };
      lapack = super.lapack.override { lapackProvider = self.mkl; };
    }

The Nix code in the NixOS-QChem overlay [1] is structured as follows:

launching an isolated shell, interactive use of a program, or the noninteractive execution of programs in a resource manager like SLURM. To exemplify some common use cases, we will refer to illustrative examples in the following.

a. Interactive Program Usage (Turbomole)
Turbomole uses a set of interactive programs, such as define and eiger, to create input files and analyse output files. Furthermore, Turbomole requires environment variables such as $TURBODIR and $PARA_ARCH to be set. An interactive nix-shell makes the Turbomole package available and reduces the required user input by wrapping Turbomole with appropriate environment variables and settings:

    # starts an interactive nix-shell with Turbomole
    sh$> nix-shell -p qchem.turbomole
    # Turbomole commands can directly be used
    # normal interaction with define
    # e.g. set up a RI-ADC(2) calculation
    nix-shell$> define
    # ground state calculation
    nix-shell$> ridft -smpcpus 4
    # excited state calculation
    nix-shell$> ricc2 -smpcpus 4
    # interactive overview of results
    nix-shell$> eiger
    # will drop back to normal bash
    nix-shell$> exit

b. Non-Interactive Calculation (Molcas)
A noninteractive Molcas calculation with OMP parallelism can directly be executed from a nix-shell. The PyMolcas driver requires specific Python packages, such as six, to be installed. Instead of globally installing Python dependencies, the Nix expression wraps the Python scripts in an isolated Python runtime environment and can be used directly:

    sh$> nix-shell \
           -p qchem.molcas \
           --run "OMP_NUM_THREADS=4 pymolcas molcas.inp"

c. Interactive Python Session (MEEP)
Some scientific Python packages may be used interactively within an interpreter, e.g. to experiment with different settings. Packages such as MEEP [49], which provide a Python API around a C/C++ code, are often difficult to install; they are not available from PyPI and require both Python and C/C++ tooling. MEEP can be used interactively from Python within a nix-shell:

    sh$> nix-shell \
           -p python3 python3.pkgs.numpy \
              qchem.python3.pkgs.meep \
           --run "python3"
    python3$> import numpy as np
    python3$> import meep as mp
    python3$> # ...
    let
      # Reproducible, pinned import of the
      # NixOS-QChem overlay function
      gh = "https://github.com";
      qchemOvl = import (builtins.fetchGit {
        url = "${gh}/markuskowa/NixOS-QChem.git";
        name = "NixOS-QChem_2021-09-25";
        rev = "9604e9b7f8d6ea68f07d621e1f70a9ebf857efa0";
        ref = "master";
      });
      nixpkgs = import (builtins.fetchGit {
        url = "${gh}/NixOS/nixpkgs.git";
        name = "nixpkgs_2021-09
      [...]
      pkgs; mkShell { ... }

Here, the fetchGit function is used to access a specific version of the overlay and of the nixpkgs package set (fixed by the respective rev statements). Alternatively, the Niv tool [9] provides a convenient command line interface to automate the version fixing and update processes. The overlay and configuration settings are applied explicitly in the shell.nix file. For the rather verbose, full example of the shell.nix file and the usage of Niv, we refer to [15]. The shell.nix file can either be referenced implicitly by executing nix-shell in the same directory, or explicitly by nix-shell /path/to/shell.nix.

e. Reproducible Jupyter Notebooks
Jupyter notebooks [41] are commonly used tools for experimentation with codes and methods, the development of scientific ideas, as well as for visualization of data. However, distributing Jupyter notebooks can be difficult, since the environment and all dependencies, such as Python packages, also need to be reproduced. As in the previous example, nix-shell can be used to make Jupyter notebooks reproducible. Using version fixing, as in the example above, a Jupyter environment for GPAW simulations can be formulated in a shell.nix file:

    let
      qchemOvl = ...
      pkgs = ...
      pythonWithPackages = pkgs.qchem.python3.withPackages (p: with p; [
        numpy
        jupyterlab
        ipympl
        gpaw
      ]);
    in with pkgs; mkShell {
      buildInputs = [ pythonWithPackages ];
      shellHook = "jupyter-lab";
    }

Executing nix-shell will then directly open the JupyterLab interface in the browser and allow using packages such as GPAW, along with Python and all the necessary Python packages. The complete examples can be found in Ref. [15].

f. Self-Contained Programs and Shell Scripts
The nix-shell command can be used as the shebang line of scripts. This allows writing small, reproducible, self-contained scripts and programs, or writing scripts in the scope of a project-associated shell.nix file. The following example shows a self-contained Python script for data visualization:

    #!/usr/bin/env nix-shell
    #!nix-shell -i python3
    #!nix-shell -p python3Packages.numpy
    #!nix-shell -p python3Packages.matplotlib
    import numpy as np
    import matplotlib.pyplot as plt
    xs = np.linspace(-2, 2, num=100)
    plt.plot(xs, np.exp(-xs**2))
    plt.show()

Note that here we use the latest versions of numpy and matplotlib as they are provided directly by nixpkgs.
e. Multi-User Installation
Spack, Singularity, and environment modules, as dedicated solutions for cluster computers, offer the option to provide software both centrally (e.g. a minimum set of basic tools and libraries from the local computer cluster environment) and on a per-user basis (customized or domain-specific software). Neither approach takes advantage of shared common dependencies, which are identical between packages.

V. CONCLUSION AND OUTLOOK
The nixpkgs set and the NixOS-QChem overlay provide numerous scientific packages and packages relevant for quantum chemistry. The presented solution makes these programs easily available without complicated installation or manual compilation procedures. Proprietary packages can also be made available without explicit installation if the user has obtained a license and has the corresponding installation file. The NixOS-QChem overlay is configurable and allows for optimization depending on the processor architecture in use. The option to build self-contained scripts and batch jobs has proven highly useful in daily use. Its multiuser capability allows every user to prepare customized and reproducible compute environments. The presented examples demonstrate how to create reproducible environments for electronic structure calculations as well as for scripted pre- and post-processing tasks.
proach. Reproducible environments are not only useful for users of scientific software, but are also helpful during the development of software. Future developments of the NixOS-QChem overlay will aim at integrating more quantum chemistry software packages with Nix. The upcoming "Flakes" feature [2], which is still experimental and under development, will simplify the creation of reproducible environments with Nix even further.

TABLE I: List of selected quantum chemistry packages and utilities provided by the overlay.
software from source, and the builds are executed on demand. The package set is not restricted to free software and also includes Nix expressions for proprietary packages. Many packages profit from this integrated packaging, and composing coherent runtime environments is simplified. Noteworthy examples for improved composability are the Pysisyphus optimiser [58], which wraps Turbomole, ORCA, and Psi4 among others, or the polarizable LICHEM QM/MM implementation, which relies on the Tinker MM engine and the Gaussian, NWChem, and Psi4 quantum chemistry codes. The SHARC surface hopping code, which depends on electronic structure codes, can be used conveniently from NixOS-QChem.

prietary packages makes NixOS-QChem unique among such packaging efforts. With the DebiChem project of the Debian GNU/Linux distribution [5], another major packaging effort for chemical software exists. While