# The gen_hosts ansible role

The `gen_hosts` ansible role generates the top level `hosts` file which
ansible uses as input for our inventory. The ansible inventory is the set
of nodes available to us for command and control.

The `gen_hosts` ansible role uses the jinja2 template engine, which ansible
supports through its `template` module, to generate output files. Processing
a file this way takes just *one* ansible task. You provide the source file and
the output file. That's it. You then codify what you need in the template file
using `jinja2`, and so can leverage variables in your `extra_vars.yaml` based
on your configuration.
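
As a rough illustration of the single task described above, a minimal sketch
could look like the following. The variable `hosts_output_path` is a
hypothetical name used only for illustration; the role's real task and
variable names are in its tasks file and may differ.

```yaml
# Illustrative sketch only: hosts_output_path is a hypothetical variable,
# not one taken from the gen_hosts role.
- name: Generate the top level ansible hosts file
  ansible.builtin.template:
    src: hosts.j2                    # looked up in the role's templates/ directory
    dest: "{{ hosts_output_path }}"  # hypothetical variable with the output path
    mode: "0644"
```

Any variable defined in `extra_vars.yaml` is available inside the template
when the task renders it.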

The `gen_hosts` ansible role is *only* in charge of generating the top level
ansible `hosts` file.

## Dynamic ansible hosts files

Since the number of hosts we support is dynamic, we must also generate
a hosts file which is dynamic. The ansible `hosts` file can also change per
target workflow, since we want to support a different set of ansible hosts for
different workflows. This lets us support the highly dynamic requirements of
different workflows and leverage the high level of parallelization possible
with ansible, so that tasks which typically run serially can run in parallel.
How you end up splitting ansible hosts is up to you; you have to think hard
about how you can devise the most parallel work for a serial workflow.

## Non dedicated workflows

There is a *super boring* ansible hosts file generated by default for you,
which creates hosts based on your `CONFIG_KDEVOPS_HOSTS_PREFIX`, which
gets promoted to an ansible variable in `extra_vars.yaml` as
`kdevops_hosts_prefix`. This is done if your workflow is not dedicated,
or if you know what you are doing and just want to enable the make targets
of different workflows while building just one host. You usually do not
want to use non-dedicated workflows. These are only useful for folks
who know what they are doing.

If you are adding initial support for a new workflow you can skip
the initial parallelization goals and use the default hosts file. You'd copy
the default hosts template file,
`playbooks/roles/gen_hosts/templates/hosts.j2`, for your workflow. For example,
that is what we have today for CXL:
`playbooks/roles/gen_hosts/templates/cxl.j2` is pretty boring today. The only
difference right now is that the generic hosts template file got `NFS` support.

The default boring non-dedicated workflow hosts file:

* [kdevops dynamic ansible hosts file](playbooks/roles/gen_hosts/templates/hosts.j2)

## Picking which template to use

The [ansible gen_hosts main tasks file](playbooks/roles/gen_hosts/tasks/main.yml)
picks which template to use for the hosts file depending on a series of bools
set in the `extra_vars.yaml` file, mainly whether it's a dedicated workflow and
whether the bools for your workflow are True.
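
To give a feel for this kind of selection, here is a minimal sketch using
ansible `when` conditions. The booleans `dedicated_workflow` and
`workflow_enable_cxl`, and the variable `hosts_output_path`, are hypothetical
names picked for illustration. Whether the CXL template is tied to the
dedicated flag in this way is also an assumption; the actual conditions are in
the main tasks file.

```yaml
# Illustrative sketch only: all variable names below are hypothetical
# and are not the role's actual variables.
- name: Generate the default hosts file for non-dedicated workflows
  ansible.builtin.template:
    src: hosts.j2
    dest: "{{ hosts_output_path }}"
  when: not (dedicated_workflow | default(false) | bool)

- name: Generate the CXL hosts file when the CXL workflow is dedicated
  ansible.builtin.template:
    src: cxl.j2
    dest: "{{ hosts_output_path }}"
  when:
    - dedicated_workflow | default(false) | bool
    - workflow_enable_cxl | default(false) | bool
```

Only one of the two tasks runs for a given configuration, so exactly one
template ends up producing the hosts file.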

## Dedicated workflows

A goal behind kdevops is to promote high parallelization for workflows.

Workflows which have embraced these goals have dedicated workflow support, and
they reflect this by extending the number of supported dedicated workflows below
the bool Kconfig option `WORKFLOWS_DEDICATED_WORKFLOW`. If, however, you don't
yet have a highly parallel workflow defined, you get the boring default of just
one node created for you. You can also copy and paste the same default hosts
template for your workflow if you are still figuring out how you can
parallelize your work.

## Parallelizing selftests

A simple example of how to parallelize workflows is how kdevops supports
the Linux kernel selftests. Although we started out with only a few kernel
selftests, and although a rule behind support for kernel selftests was to
not have rules, a basic rule of thumb was for them to not take too long.
The default kernel selftests timeout is 45 seconds, and we already have
96 selftests which override this timeout to be greater than 45 seconds.
So now we have many kernel selftests, and running all of them can
take time. Likewise, kernel selftests do not treat a timeout as fatal.
We now have upstream kernel selftests support [to override the default selftests timeout on test
runners](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f6a01213e3f812b645cd1079167bf47fc45bb0c8)
and this enables test runners like kdevops to let dedicated test workflows
override the default timeout of 45 seconds and, if we want, let our workflow
treat timeouts as fatal.

What kdevops does then is let us parallelize kernel selftests by having
a guest per selftest.

## Parallelizing fstests

Developers typically run tests for filesystems against the target profile
they are working towards. But filesystems have many options to disable or
enable new features. As new features are developed it is also important to
test with features disabled, as newer kernels enable new features by
default. To help with all this, kdevops supports splitting up tests based
on the different target filesystem configurations supported. Linux
distributions will want to pick, i.e. `select`, which filesystem configuration
options they support. If they don't support a feature, then a user
should have to override the default to test it for that distribution.

For Linux kernel upstream testing we support all features and we enable
sensible defaults for what we care to cover for bugs upstream. So each
filesystem will support a different set of filesystem configurations.

## Parallelizing blktests

For blktests we aim to support a guest per target block driver.

## Future enhancements to parallelization with k8

It should be possible to evaluate using Kubernetes (k8) as an alternative
to just splitting up tests per target feature one wants to test. Instead,
one should be able to leverage as many resources as one is able to provide
and then have different guests address the target test goal. This should be
possible with k8.

Some possibly useful links for folks looking into this:

* https://github.com/pwyoung/ansible-playbook-deploy-kubespray
* https://github.com/pwyoung/nomaj