Kokkos Resilience is an experimental extension to Kokkos for providing convenient resilience and checkpointing to scientific applications.
Kokkos Resilience is built using CMake version 3.17 or later. It has been tested with compilers such as GCC 11.2.0 and LLVM/Clang 11.0.0. It should work with any compiler that supports C++14, but your mileage may vary.
First and foremost, Kokkos Resilience requires an install of Kokkos. This can be compiled from source, bundled with other software (such as Trilinos), or provided as a package on a machine.
Note: Kokkos Resilience currently requires the develop branch of Kokkos for compile-time view hooking capabilities.
Kokkos Resilience uses Boost as a replacement for some C++17 features such as the filesystem library, std::optional, and std::variant. This dependency will likely be removed in the future when Kokkos requires C++17.
Additionally, Kokkos Resilience uses the VeloC library for efficient asynchronous checkpointing. If you want automatic checkpointing to be available, this library (and MPI) must be installed.
We maintain a special Spack package for VeloC since the main one is not up to date. It can be found here and can be installed via:
git clone git@gitlab-ex.sandia.gov:kokkos-resilience/kr-spack.git
spack repo add kr-spack
spack install veloc@barebone
It is recommended to install the "barebone" variant/branch of VeloC since it has reduced dependencies.
It is recommended to use the CMake presets to configure the project. More information on presets can be found here. Note that CMake 3.19 or higher is required to use presets, and to inherit from presets bundled with Kokkos Resilience, you need at least CMake 3.21.
Kokkos Resilience includes a set of presets in CMakePresets.json. These can be inherited from and represent common application configurations.
Path | Description |
---|---|
Kokkos_ROOT | Path to the root of the Kokkos install |
VeloC_ROOT | Path to the root of VeloC if it is enabled (see below) |
HDF5_ROOT | Path to the root of HDF5 if HDF5 is enabled (see below) |
Variable | Default | Description |
---|---|---|
KR_ENABLE_VELOC | ON | Enables the VeloC backend |
KR_VELOC_BAREBONE | OFF | Enable VeloC barebone mode |
KR_ENABLE_TRACING | OFF | Enable performance tracing of resilience functions |
KR_ENABLE_STDIO | OFF | Use stdio for manual checkpoint |
KR_ENABLE_HDF5 | OFF | Add HDF5 support for manual checkpoint |
KR_ENABLE_HDF5_PARALLEL | OFF | Use parallel version of HDF5 for manual checkpoint |
KR_ENABLE_TESTS | ON | Enable tests in the build |
KR_ENABLE_EXAMPLES | ON | Enable examples in the build |
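
If you prefer not to use presets, the paths and options above can also be collected in a CMake initial-cache script and passed to CMake with its `-C` option. The following is only a sketch: the file name `kr-cache.cmake` and the install paths are placeholders, and the assumption that `KR_VELOC_BAREBONE` should be `ON` when linking against the "barebone" VeloC install is ours, not something stated above.

```cmake
# kr-cache.cmake (hypothetical file name)
# Use as: cmake -C kr-cache.cmake -S <source directory> -B <build directory>

# Placeholder paths; replace with the actual install locations on your machine.
set(Kokkos_ROOT "/path/to/kokkos/install" CACHE PATH "Root of the Kokkos install")
set(VeloC_ROOT  "/path/to/veloc/install"  CACHE PATH "Root of the VeloC install")

# VeloC backend is ON by default; stated here only to make the choice explicit.
set(KR_ENABLE_VELOC   ON CACHE BOOL "Enables the VeloC backend")
# Assumption: matches a VeloC install built from the "barebone" variant.
set(KR_VELOC_BAREBONE ON CACHE BOOL "Enable VeloC barebone mode")

# Skip tests and examples for a faster library-only build.
set(KR_ENABLE_TESTS    OFF CACHE BOOL "Enable tests in the build")
set(KR_ENABLE_EXAMPLES OFF CACHE BOOL "Enable examples in the build")
```

Values set this way only seed the cache, so any of them can still be overridden with `-D<variable>=<value>` on the command line.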
Kokkos Resilience is designed to work with CMake projects, so using CMake is typically much easier. In your own project, call:
find_package(resilience)
target_link_libraries(target PRIVATE Kokkos::resilience)
Ensure that the build or install directory of Kokkos Resilience is in CMAKE_PREFIX_PATH, or the variable resilience_ROOT points to the build/install directory, or the variable resilience_DIR points to the location of the Kokkos Resilience resilienceConfig.cmake file. This file is located in the root build directory of Kokkos Resilience or the path <install directory>/share/resilience/cmake. See the CMake documentation for more details on how packages are found.
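
For context, a complete consumer `CMakeLists.txt` built around those two calls might look roughly like the sketch below. The project name `my_app` and source file `main.cpp` are hypothetical, the minimum CMake version simply mirrors the one required to build Kokkos Resilience itself, and `REQUIRED` is added so that configuration stops early if the package cannot be located.

```cmake
# Minimal consumer project (names are placeholders, not part of Kokkos Resilience).
cmake_minimum_required(VERSION 3.17)
project(my_app LANGUAGES CXX)

# Locates resilienceConfig.cmake via CMAKE_PREFIX_PATH, resilience_ROOT, or resilience_DIR.
# REQUIRED makes configuration fail immediately if the package is not found.
find_package(resilience REQUIRED)

# main.cpp is a hypothetical source file that uses Kokkos Resilience.
add_executable(my_app main.cpp)

# Link against the imported target provided by the Kokkos Resilience package.
target_link_libraries(my_app PRIVATE Kokkos::resilience)
```

One way to point the configure step at the library is `cmake -S . -B build -Dresilience_ROOT=<install directory>`, where `<install directory>` is wherever Kokkos Resilience was built or installed.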