This directory contains all files pertaining to our own implementation of an E2E testing framework for AgentBaker.
E2E testing for Linux is currently implemented using a Golang framework built from the ground up. Note that we plan to move Windows over to this testing framework soon as well.
The goal of E2E testing with AgentBaker is to ensure that the node bootstrapping artifacts generated and returned by the primary AgentBaker API not only contain expected content, but also contain correct content that can be used as-is to bootstrap real Azure VMs so they can join real AKS clusters.
At a high level, each E2E scenario calls the primary node-bootstrapping API, GetLatestNodeBootstrapping, with a set of parameters (represented by a NodeBootstrappingConfiguration) that define the scenario, in order to generate CSE and custom data. A new VMSS containing a single VM is then created and associated with an AKS cluster that is already running in Azure. The CSE and custom data generated by AgentBaker are applied to the new VM so that it can be bootstrapped and register itself with the apiserver of the running cluster. Liveness and health checks are then run to make sure the new VM's kubelet is posting NodeReady to the cluster's apiserver, and that workload pods can successfully run on it. Lastly, a set of validation commands is remotely executed on the VM after it has been bootstrapped to ensure that its live state (file existence, sysctl settings, etc.) is as expected.
Note: if you have changed code or artifacts used to generate custom data or custom script extension payloads, you should first run make generate from the root of the AgentBaker repository.
To run the Go implementation of the E2E test suite locally, simply use e2e-local.sh. This script will set up the call to go test for you while also implementing default logic for a set of required environment variables used to interact with Azure. These required environment variables include:
- SUBSCRIPTION_ID - default: 8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8
- RESOURCE_GROUP_NAME - default: agentbaker-e2e-tests
- LOCATION - default: eastus
- CLUSTER_NAME - default: agentbaker-e2e-test-cluster
- AZURE_TENANT_ID - default: 72f988bf-86f1-41af-91ab-2d7cd011db47
SCENARIOS_TO_RUN may optionally be set to specify a subset of the E2E scenarios to run during the testing session as a comma-separated list, for example:
SCENARIOS_TO_RUN=base,gpu ./e2e-local.sh
Furthermore, SCENARIOS_TO_EXCLUDE may optionally be set to specify the scenarios to exclude from the testing session as a comma-separated list. If both SCENARIOS_TO_RUN and SCENARIOS_TO_EXCLUDE are specified, SCENARIOS_TO_RUN will take precedence.
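The precedence rule can be sketched as follows. This is a minimal illustration rather than the suite's actual code: filterScenarios and parseList are hypothetical names, and the scenario name arm64 is made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// parseList splits a comma-separated value into a set of trimmed,
// non-empty names.
func parseList(v string) map[string]bool {
	set := map[string]bool{}
	for _, name := range strings.Split(v, ",") {
		if name = strings.TrimSpace(name); name != "" {
			set[name] = true
		}
	}
	return set
}

// filterScenarios (hypothetical) applies the rule described above: when
// toRun is non-empty it alone decides what runs and toExclude is ignored;
// otherwise every scenario not named in toExclude runs.
func filterScenarios(all []string, toRun, toExclude string) []string {
	include := parseList(toRun)
	exclude := parseList(toExclude)
	var selected []string
	for _, name := range all {
		switch {
		case len(include) > 0:
			if include[name] {
				selected = append(selected, name)
			}
		case !exclude[name]:
			selected = append(selected, name)
		}
	}
	return selected
}

func main() {
	all := []string{"base", "gpu", "arm64"}
	fmt.Println(filterScenarios(all, "base,gpu", "gpu")) // [base gpu]
	fmt.Println(filterScenarios(all, "", "gpu"))         // [base arm64]
}
```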
KEEP_VMSS can also optionally be specified to have the test suite retain the bootstrapped VMSS VMs for further debugging. When this option is specified, the private SSH key used to bootstrap the VMs will be included within each scenario's log bundle.
NOTE: if this option is specified, please make sure to manually delete your bootstrapped VMs afterwards. That said, all bootstrapped VMs will eventually be deleted by the ACS test GC regardless.
Note that when using e2e-local.sh, a timeout value of 30 minutes is applied to the go test command.
You may also run the test command yourself, assuming you've properly set up the required environment variables, like so:
go test -timeout 30m -v -run Test_All ./
The top-level package of the Golang E2E implementation is named e2e_test and is entirely separate from all AgentBaker packages.
The e2e_test package has a dependency on a subpackage located in the scenario directory. Package scenario is where all E2E scenarios are defined, each in its own file. This package also defines common types related to scenarios and scenario configuration, as well as the hard-coded list of SIG version IDs located in images.go used for testing different OS distros. Package scenario also contains the implementation of common cluster selectors and mutators within clusterconfiguration.go, though each scenario could define its own implementations if needed.
The primary testing function is located in suite_test.go, which is run by go test ....
The images.go file contains the hard-coded references to a set of delete-locked SIG versions used by the e2e scenarios.
If you decide to update some or all of these SIG versions, you need to make sure to add delete locks to each one via the Azure Portal so they don't get automatically deleted and eventually cause failures.
Minimally, each E2E scenario is parameterized with a set of "mutators" that change/set various properties of a base NodeBootstrappingConfiguration struct. This struct is then fed into GetLatestNodeBootstrapping to generate CSE and custom data. The most commonly mutated property of this struct across all scenarios is the OS distro. This is primarily because each scenario currently uses a separate VHD corresponding to the respective distro.
E2E scenarios can also be configured with VMSS configuration mutators that change/set properties on the VMSS model used to deploy the new VM to be bootstrapped. This is primarily useful when testing out different VM SKUs, especially for GPU-enabled scenarios, where the SKU affects which code paths AgentBaker will use to generate CSE and custom data.
Further, in order to support E2E scenarios which test different underlying AKS cluster configurations, such as the cluster's network plugin, each E2E scenario has its own "cluster selector" and "cluster mutator". Cluster selectors determine whether or not the given live AKS cluster is viable for running the given scenario, while cluster mutators will mutate a base AKS cluster model such that the model represents a cluster which is viable for running the given scenario. For example, a scenario meant to run on an AKS cluster configured with the kubenet network plugin would have a cluster selector which selects on the NetworkProfile.NetworkPlugin property specifically for kubenet, while its cluster mutator would set this property to kubenet so a new cluster can be created for it to run on.
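To make the selector/mutator pairing concrete, here is a minimal sketch of the kubenet example. The Cluster and NetworkProfile types are simplified stand-ins for the real AKS cluster model, and the function names are hypothetical; the actual implementations live in clusterconfiguration.go and may look different.

```go
package main

import "fmt"

// NetworkProfile and Cluster are simplified stand-ins for the AKS cluster
// model (assumption: only the field needed for this example is shown).
type NetworkProfile struct {
	NetworkPlugin string
}

type Cluster struct {
	NetworkProfile *NetworkProfile
}

// kubenetSelector reports whether a live cluster is viable for a scenario
// that requires the kubenet network plugin.
func kubenetSelector(c *Cluster) bool {
	return c != nil && c.NetworkProfile != nil && c.NetworkProfile.NetworkPlugin == "kubenet"
}

// kubenetMutator adjusts a base cluster model so that a cluster created
// from it would satisfy kubenetSelector.
func kubenetMutator(c *Cluster) {
	if c.NetworkProfile == nil {
		c.NetworkProfile = &NetworkProfile{}
	}
	c.NetworkProfile.NetworkPlugin = "kubenet"
}

func main() {
	live := &Cluster{NetworkProfile: &NetworkProfile{NetworkPlugin: "azure"}}
	fmt.Println(kubenetSelector(live)) // false: this live cluster is not viable

	base := &Cluster{}
	kubenetMutator(base)
	fmt.Println(kubenetSelector(base)) // true: the mutated model is viable
}
```

The useful property of the pair is that a model passed through the mutator always satisfies the matching selector, so a newly created cluster can be selected on later runs.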
Lastly, E2E scenarios also consist of a list of live VM validators. Each live VM validator consists of a description, a bash command which will actually be run on the newly bootstrapped VM, and an "asserter" function that will perform assertions on the contents of both the stdout and stderr streams that result from the execution of the command. The validators can be used to assert on numerous types of properties of the live VM, such as the live file system and kernel state.
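As an illustration of the three parts of a validator, the sketch below builds one that checks a sysctl setting. The LiveVMValidator type shown here is a hypothetical shape based only on the description above (the real definition is in the scenario package and may differ), and the sysctl key and expected value are arbitrary examples.

```go
package main

import (
	"fmt"
	"strings"
)

// LiveVMValidator is a hypothetical shape for a validator: a description,
// a bash command to run on the VM, and an asserter over the command's
// stdout and stderr.
type LiveVMValidator struct {
	Description string
	Command     string
	Asserter    func(stdout, stderr string) error
}

// sysctlValidator builds a validator asserting that a sysctl key has the
// expected value on the bootstrapped VM.
func sysctlValidator(key, want string) *LiveVMValidator {
	return &LiveVMValidator{
		Description: fmt.Sprintf("assert sysctl %s=%s", key, want),
		Command:     fmt.Sprintf("sysctl -n %s", key),
		Asserter: func(stdout, stderr string) error {
			if stderr != "" {
				return fmt.Errorf("unexpected stderr: %q", stderr)
			}
			if got := strings.TrimSpace(stdout); got != want {
				return fmt.Errorf("expected %s=%s, got %q", key, want, got)
			}
			return nil
		},
	}
}

func main() {
	v := sysctlValidator("net.ipv4.ip_forward", "1")
	fmt.Println(v.Command)
	// The strings below stand in for output captured from the VM.
	fmt.Println(v.Asserter("1\n", ""))
	fmt.Println(v.Asserter("0\n", ""))
}
```

Keeping the command and the assertion together means each validator can be read and reviewed on its own, without knowing how the command is transported to the VM.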
You can find all implemented scenarios in the scenario package within files prefixed with scenario_. The Scenario struct definition can be found in scenario/types.go.
To implement a new scenario, you need to do the following:
- Create a new file in the scenario package directory named scenario_<scenario-name>.go
- Within this new file, implement a private function with a representative name which returns a *Scenario representing the scenario's configuration
- Add a call to the newly implemented function within the return value of the scenarios() function defined in scenario/init.go
- Implement any additional logic in the testing framework required by the new scenario
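The steps above can be sketched roughly as follows. Everything here is a simplified stand-in: the real Scenario struct is defined in scenario/types.go and the real NodeBootstrappingConfiguration has many more fields, so the field names, the scenario name, and the distro value are all assumptions made for illustration.

```go
package main

import "fmt"

// NodeBootstrappingConfiguration is a one-field stand-in for the real
// struct (assumption: only the OS distro is modeled here).
type NodeBootstrappingConfiguration struct {
	Distro string
}

// Scenario is a hypothetical, pared-down version of the struct defined in
// scenario/types.go.
type Scenario struct {
	Name                   string
	Description            string
	BootstrapConfigMutator func(*NodeBootstrappingConfiguration)
}

// exampleDistro is the private function a new scenario file would define.
func exampleDistro() *Scenario {
	return &Scenario{
		Name:        "example-distro",
		Description: "bootstraps a node using a made-up example distro",
		BootstrapConfigMutator: func(nbc *NodeBootstrappingConfiguration) {
			nbc.Distro = "example-distro-vhd"
		},
	}
}

// scenarios returns every registered scenario; a new scenario is added by
// appending a call to its constructor here.
func scenarios() []*Scenario {
	return []*Scenario{
		exampleDistro(),
	}
}

func main() {
	for _, s := range scenarios() {
		nbc := &NodeBootstrappingConfiguration{Distro: "base"}
		s.BootstrapConfigMutator(nbc) // mutate the base configuration
		fmt.Printf("%s: distro=%s\n", s.Name, nbc.Distro)
	}
}
```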
Each E2E scenario will generate its own logs after execution. Currently, these logs consist of:
- cluster-provision.log - CSE execution log, retrieved from /var/log/azure/aks/cluster-provision.log (collected in success and CSE failure cases)
- kubelet.log - the kubelet systemd unit's logs, retrieved by running journalctl -u kubelet on the VM after bootstrapping has finished (collected in success and CSE failure cases)
- vmssId.txt - a single-line text file containing the unique resource ID of the VMSS created by the respective scenario, mainly collected for the purposes of post-hoc resource deletion (collected in all cases where the VMSS is able to be created)
These logs will be uploaded in a bundle of the format:
└── scenario-logs
    └── <scenario>
        ├── cluster-provision.log
        ├── kubelet.log
        └── vmssId.txt
After a PR is created in AgentBaker's repo on GitHub, a pipeline calculating code coverage changes will automatically run.
We are utilizing coveralls to display the coverage report. The coverage report will be available in the PR's description. You can also view previous runs for the AgentBaker repo here.
We calculate code coverage for both unit tests and E2E tests.
To generate E2E coverage reports, we use the support for collecting coverage from integration tests that was introduced in Go 1.20.
The coverage report is generated by running AgentBaker's API server locally as a binary built with the -cover flag. E2E tests are then run against that binary.
The following packages are used during calculation of coverage for E2E tests:
- github.com/Azure/agentbaker/apiserver
- github.com/Azure/agentbaker/cmd
- github.com/Azure/agentbaker/cmd/starter
- github.com/Azure/agentbaker/pkg/agent
- github.com/Azure/agentbaker/pkg/agent/datamodel
- github.com/Azure/agentbaker/pkg/templates
You can generate an E2E coverage report while running the E2E tests locally. To do so, follow the steps below:
- Build the AgentBaker server binary with the -cover flag:
cd cmd
go build -cover -o baker -covermode count
- Create directory for coverage report files
mkdir -p covdatafiles
- Run the binary
GOCOVERDIR=covdatafiles ./baker start &
- Run the E2E tests locally
/bin/bash e2e/e2e-local.sh
- Stop the binary - once the tests finish executing, you have to stop the binary with exit code 0 to generate the report. See the docs here.
kill $(pgrep baker)
- Display the coverage report within the terminal
go tool covdata percent -i=./cmd/covdatafiles