Snakemake offers automated caching capabilities when a user restarts the same workflow on the same workspace. Snakemake automatically reuses outputs of past rules if their inputs did not change, and re-executes only those rules that really need it. This works well and is already fully supported in REANA.
Snakemake also offers an experimental between-workflows caching feature. Here the cache lives outside the workspaces, so it can hold, for example, input files or the results of expensive computations that are reused by several independent workflows. The user governs Snakemake's behaviour with a `cache: True` clause in individual rules, which tells the workflow engine whether to use the cache for that rule. This feature is not currently supported by REANA.
The goals of this issue are:
First, experiment with Snakemake's between-workflows caching feature outside of REANA to see how well it works across a range of situations, from simple ones (only the code or the data changes) to complex ones (the container image changes whilst "hiding" behind the same fully-qualified image name, such as when the user rebuilds the image and repushes it under the same "latest" tag).
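To make the difficult case concrete, here is a minimal sketch of a content-addressed cache key. It is an illustration only and does not reproduce Snakemake's actual hashing scheme; the function `cache_key`, the rule commands, and the image name `docker.io/johndoe/tools` are all hypothetical. Whether Snakemake behaves like this is exactly what the experiment should establish.

```python
import hashlib


def cache_key(rule_code: str, input_data: bytes, image_ref: str) -> str:
    """Hypothetical cache key derived from rule code, input content and image reference."""
    h = hashlib.sha256()
    for part in (rule_code.encode(), input_data, image_ref.encode()):
        # Length-prefix each part so that adjacent fields cannot blur together.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


IMAGE = "docker.io/johndoe/tools:latest"  # hypothetical image name

base = cache_key("sort {input} > {output}", b"b\na\n", IMAGE)

# Simple cases: a change in code or data changes the key, so the cache is missed.
code_changed = cache_key("sort -r {input} > {output}", b"b\na\n", IMAGE)
data_changed = cache_key("sort {input} > {output}", b"c\na\n", IMAGE)

# Complex case: the image was rebuilt and repushed under the same tag.
# The reference string is identical, so a key built from the name alone
# cannot notice the change and a stale result would be reused.
image_repushed = cache_key("sort {input} > {output}", b"b\na\n", IMAGE)

# Referring to the image by digest instead of by tag would expose the change.
pinned_old = cache_key("sort {input} > {output}", b"b\na\n",
                       "docker.io/johndoe/tools@sha256:" + "0" * 64)
pinned_new = cache_key("sort {input} > {output}", b"b\na\n",
                       "docker.io/johndoe/tools@sha256:" + "1" * 64)

print("code change detected: ", base != code_changed)
print("data change detected: ", base != data_changed)
print("repush under same tag detected:", base != image_repushed)
print("digest change detected:", pinned_old != pinned_new)
```

If the experiment shows that only the image name enters the key, pinning images by digest would be one possible mitigation to evaluate.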
Second, investigate whether we can support this feature in REANA easily. For example, the user John Doe could define the environment variable `SNAKEMAKE_OUTPUT_CACHE` as a secret pointing to his EOS directory (`/eos/home-j/johndoe/mysnakemakecache`), which would then be used for between-workflows cache storage. He would then add `cache: True` to those rules in the `Snakefile` for which the cache can be activated.
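A rough sketch of what the user-side setup might look like when experimenting outside of REANA. The rule `download_reference`, its URL, and the helper `prepare_run` are hypothetical. The sketch assumes Snakemake is invoked with its `--cache` option so that the `cache: True` directives take effect. How REANA would inject the secret into the job environment is not covered here.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical rule: `cache: True` marks its output as eligible for the
# between-workflows cache.
SNAKEFILE = '''\
rule download_reference:
    output:
        "data/reference.dat"
    cache: True
    shell:
        "curl -L -o {output} https://example.org/reference.dat"
'''


def prepare_run(workdir, cache_dir):
    """Write the Snakefile and return the command and environment for running it."""
    workdir.mkdir(parents=True, exist_ok=True)
    (workdir / "Snakefile").write_text(SNAKEFILE)
    # Snakemake reads the cache location from this environment variable.
    env = dict(os.environ, SNAKEMAKE_OUTPUT_CACHE=cache_dir)
    # No rule names after --cache: the `cache: True` directives select the rules.
    cmd = ["snakemake", "--cores", "1", "--cache"]
    return cmd, env


workdir = Path(tempfile.mkdtemp())
cmd, env = prepare_run(workdir, "/eos/home-j/johndoe/mysnakemakecache")
print("cd", workdir, "&&", " ".join(cmd))
```

The returned `cmd` and `env` could be passed to `subprocess.run(cmd, cwd=workdir, env=env)` on a machine where Snakemake is installed and the EOS path is mounted.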
We don't really need any commands to inspect the cache or otherwise manipulate its files, since the cache will be stored on a storage solution external to REANA such as EOS. Hence the users could use regular tools to access, inspect, or otherwise manage the cached content.