RSECon22 Walkthrough: A FAIR Data Pipeline: provenance-driven data management for traceable scientific workflows
This repo contains the material for the RSECon22 walkthrough titled: "A FAIR Data Pipeline: provenance-driven data management for traceable scientific workflows". The walkthrough uses a Docker container and Jupyter Lab (formerly Jupyter Notebook) to run through an example usage of the FAIR Data Pipeline. You can view the walkthrough presentation in the Society of Research Software Engineering channel.
The only prerequisite is an installation of Docker, which is available free from docker.com.
The Docker container is available on the GitHub Package Registry and can be pulled using one of the following commands:
docker pull ghcr.io/fairdatapipeline/rsecon:latest
docker pull ghcr.io/fairdatapipeline/rsecon:aarm64
The container can then be run using the following command:
docker run -p 8000:8000 -p 8888:8888 ghcr.io/fairdatapipeline/rsecon:latest
OR
docker run -p 8000:8000 -p 8888:8888 ghcr.io/fairdatapipeline/rsecon:aarm64
Once the container has started, the console will show an address for accessing Jupyter Lab. This address includes a token for authenticating to the Jupyter Lab page. The link will take the form: http://127.0.0.1:8888/lab?token=<token>.
This address can then be accessed through your web browser to give you access to the Jupyter Lab installation.
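If the console output is long, the link can also be picked out programmatically. The following is a minimal Python sketch that assumes the link matches the form shown above and that the token consists only of letters and digits; the function name and the sample log line are hypothetical and not part of the walkthrough material.

```python
import re
from typing import Optional

# Assumption: the link has the form http://127.0.0.1:8888/lab?token=<token>
# and the token contains only letters and digits.
JUPYTER_URL = re.compile(r"http://127\.0\.0\.1:8888/lab\?token=[0-9A-Za-z]+")


def find_jupyter_url(console_output: str) -> Optional[str]:
    """Return the first Jupyter Lab link found in the console output, or None."""
    match = JUPYTER_URL.search(console_output)
    return match.group(0) if match else None


if __name__ == "__main__":
    # Hypothetical log line; the real console output will look different.
    sample = "    or http://127.0.0.1:8888/lab?token=abc123def456"
    print(find_jupyter_url(sample))
```

The console text can be pasted into a string and passed to find_jupyter_url; a result of None means no link of the assumed form was found.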
N.B. The container will bind the ports 8000 and 8888, so please make sure these ports are available.
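One way to check that ports 8000 and 8888 are available before starting the container is a short Python script. This is a sketch only: the helper name port_is_free is hypothetical, and the check only detects a service that is already accepting connections on the loopback address 127.0.0.1.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is in use.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    # The two ports the container binds, as stated above.
    for port in (8000, 8888):
        state = "free" if port_is_free(port) else "in use"
        print(f"port {port}: {state}")
```

If either port is reported as in use, stop the service occupying it before running the container.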
Some package requirements and packages have been pre-installed in the interest of saving time.
The Docker container contains eight Jupyter notebooks, detailed below.
This notebook contains code blocks to install the FAIR Command Line Interface (CLI) and the FAIR Local Registry.
The notebooks contain code blocks to run the SEIRS model example in different languages: they contain code to register inputs and run the models.
All the models use the same input, and therefore the pull code block only needs to be run from one of the files.
Code blocks to clone the simple model repo, install the simple model package, initialise a FAIR repository, register ('pull') the inputs for the model and then 'run' the model in Python.
Code blocks to initialise a FAIR repository, register ('pull') the inputs for the model and then 'run' the model in C++. The C++ repo has already been cloned and the executable has been compiled.
Code blocks to initialise a FAIR repository, register ('pull') the inputs for the model and then 'run' the model in Java. The Java repo has already been cloned and the project pre-built.
Code blocks to initialise a FAIR repository, register ('pull') the inputs for the model and then 'run' the model in Julia. The Julia repository has been cloned into the Docker container and the Julia package has already been initialised.
Code blocks to initialise a FAIR repository, register ('pull') the inputs for the model and then 'run' the model in R. The R repo has already been cloned and the R package installed.
The SEIRS models can be compared and graphed using the following notebook.
Code block to run a comparison of the simple models, producing a graph.
The local registry can be explored by running it and navigating to the web interface at: 127.0.0.1:8000
Notebook to start and stop the registry.