The gen_hosts ansible role

The gen_hosts ansible role generates the top level hosts file used as input for ansible for our inventory. The ansible inventory is the set of nodes available to us for command and control.

The gen_hosts ansible role uses the jinja2 template engine, supported by ansible, to generate output files. Rendering a file through jinja2 takes just one ansible task: you provide the source template and the output file. That's it. You then codify what you need in the template file using jinja2, and so can leverage variables in your extra_vars.yaml based on your configuration.
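As a hedged illustration, such a task can look like the following sketch; the paths and variable name here are assumptions for illustration, not verbatim from the gen_hosts role:

```yaml
# Hypothetical sketch: render a jinja2 template into the ansible
# hosts file. src/dest values are illustrative assumptions.
- name: Generate the top level ansible hosts file
  template:
    src: hosts.j2
    dest: "{{ topdir_path }}/hosts"
```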

The gen_hosts ansible role is only in charge of generating the top level ansible hosts file.

Dynamic ansible hosts files

Since the number of hosts we support is dynamic, we must also generate a hosts file which is dynamic. The ansible hosts file can also change per target workflow: we want to support a different set of ansible hosts for different workflows. This lets us meet highly dynamic requirements across workflows, and lets us leverage the high degree of parallelization possible with ansible, turning tasks which typically run serially into tasks which run in parallel. How you end up splitting ansible hosts is up to you; you have to think hard about how to devise the most parallel work for a serial workflow.

Non dedicated workflows

A super boring default ansible hosts file is generated for you by default. It creates hosts based on your CONFIG_KDEVOPS_HOSTS_PREFIX, which gets promoted to an ansible variable in extra_vars.yaml as kdevops_hosts_prefix. This is done if your workflow is not dedicated, or if you know what you are doing and just want to enable the make target options of different workflows and build just one host. You usually do not want to use non-dedicated workflows; they are only useful for folks who know what they are doing.
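The promotion described above might look like this in extra_vars.yaml; this is a hypothetical fragment, and the prefix value is made up:

```yaml
# Hypothetical extra_vars.yaml fragment: CONFIG_KDEVOPS_HOSTS_PREFIX
# promoted to the ansible variable kdevops_hosts_prefix.
kdevops_hosts_prefix: "kdevops"
```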

If you are adding initial support for a new workflow you can skip the initial parallelization goals and use the default hosts file. You'd copy the default hosts template file, playbooks/roles/gen_hosts/templates/hosts.j2, for your workflow. That is, for example, what we have today for CXL: playbooks/roles/gen_hosts/templates/cxl.j2 is pretty boring today. The only difference right now is that the generic hosts template file gained NFS support.

The default boring non-dedicated workflow hosts file is generated from that generic hosts.j2 template.
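The template itself is not reproduced here; a minimal hypothetical sketch of what a non-dedicated hosts template can look like follows, where the group names and the -dev suffix are illustrative assumptions:

```jinja
# Hypothetical sketch of a non-dedicated hosts.j2: one host named
# after kdevops_hosts_prefix, plus a dev counterpart.
[all]
{{ kdevops_hosts_prefix }}
{{ kdevops_hosts_prefix }}-dev

[baseline]
{{ kdevops_hosts_prefix }}

[dev]
{{ kdevops_hosts_prefix }}-dev
```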

Picking which template to use

The gen_hosts main tasks file picks which template to use for the hosts file depending on a series of booleans set in the extra_vars.yaml file: mostly whether it is a dedicated workflow, and whether the booleans for your workflow are True.
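A hedged sketch of how such selection can be expressed in a tasks file; the boolean and path names are assumptions, not the role's actual variables:

```yaml
# Hypothetical sketch: pick a template based on booleans promoted
# into extra_vars.yaml. Variable names are illustrative.
- name: Generate hosts file for a dedicated fstests workflow
  template:
    src: fstests.j2
    dest: "{{ topdir_path }}/hosts"
  when:
    - kdevops_workflows_dedicated_workflow|bool
    - kdevops_workflow_enable_fstests|bool

- name: Generate the default hosts file
  template:
    src: hosts.j2
    dest: "{{ topdir_path }}/hosts"
  when:
    - not kdevops_workflows_dedicated_workflow|bool
```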

Dedicated workflows

A goal behind kdevops is to promote high parallelization for workflows.

Workflows which have embraced these goals have dedicated workflow support, and they reflect this by extending the number of supported dedicated workflows below the bool Kconfig option WORKFLOWS_DEDICATED_WORKFLOW. If you don't yet have a highly parallel workflow defined, you get the boring default of just one node created for you. You can also copy and paste the same default hosts template for your workflow if you are still figuring out how to parallelize your work.

Parallelizing selftests

A simple example of how to parallelize workflows is how kdevops supports the Linux kernel selftests. Although we started out with only a few kernel selftests, and although a rule behind support for kernel selftests was to not have rules, a basic rule of thumb was that they should not take too long. The default kernel selftest timeout is 45 seconds, and we already have 96 selftests which override this timeout to be greater than 45 seconds. So we now have many kernel selftests, and running all of them can take time. Likewise, kernel selftests do not treat a timeout as fatal. We now have upstream kernel selftests support to override the default selftest timeout on test runners; this enables test runners like kdevops to let dedicated test workflows override the default timeout of 45 seconds and, if we want, treat timeouts as fatal.

What kdevops does then is let us parallelize kernel selftests by having a guest per selftest.
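A hypothetical jinja2 sketch of generating one host per selftest; the selftests_targets list variable is an assumption for illustration:

```jinja
# Hypothetical sketch: emit one guest per enabled selftest.
# The selftests_targets list variable is illustrative.
[all]
{% for target in selftests_targets %}
{{ kdevops_hosts_prefix }}-selftests-{{ target }}
{% endfor %}
```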

Parallelizing fstests

Developers typically run tests for filesystems against the target profile they are working towards. But filesystems have many options to disable or enable new features. As new features are developed it is also important to test with features disabled, since newer kernels enable new features by default. To help with all this, kdevops supports splitting up tests based on the different target filesystem configurations supported. Linux distributions will want to select which filesystem configuration options they support. If a distribution does not support a feature, then a user should have to override the default to test it on that distribution.

For Linux kernel upstream testing we support all features, and we enable sensible defaults for what we care to cover for bugs upstream. So each filesystem will support a different set of filesystem configurations.
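A generated hosts file for fstests might then end up with one guest per target filesystem configuration; the following XFS-flavored fragment is a hypothetical sketch, with section and host names made up for illustration:

```ini
# Hypothetical generated hosts fragment: one guest per target
# filesystem configuration profile.
[baseline]
kdevops-xfs-crc
kdevops-xfs-reflink
kdevops-xfs-reflink-4k
kdevops-xfs-bigblock
```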

Parallelizing blktests

For blktests we aim to support a guest per target block driver.

Future enhancements to parallelization with k8s

It should be possible to evaluate using kubernetes (k8s) as an alternative to simply splitting up tests per target feature one wants to test. Instead, one should be able to leverage as many resources as one is able to provide, and then have different guests address the target test goal. This should be possible with k8s.

Some possibly useful links for folks looking into this: