Add CI test jobs for other operating systems #38

EliahKagan · 2023-08-02T00:51:39Z

This expands the CI test matrix so that in addition to testing all three Python versions we support on Ubuntu, it also tests them on macOS and on Windows. This expands the number of jobs from three to nine.

I strongly believe it is worthwhile to test on multiple operating systems. However, some new considerations arise when doing so. I am aware of three cases where I made choices that are not obvious and that it may be reasonable to make it a different way:

Artifact names and the `os` variable

The output artifact names are expanded accordingly, so that they do not clash with each other. A less important corresponding change is also made to the artifact name in the publishing workflow for consistency (just as it already had 3.9 in its name before, for consistency).

It would be more common and idiomatic to have the os matrix variable take on full names like ubuntu-latest rather than ubuntu. However, examining output would be cumbersome if latest appeared in all the artifact filenames. Using the shorter names in os and specifying the OS as ${{ matrix.os }}-latest avoids that without extra complexity.

Poetry installation

This changes how the test workflow installs Poetry, using pip instead of the install script.

As detailed in 392c36f, this works on all three OSes, without requiring the workflow to become brittle (e.g., by hard-coding paths) or significantly more complicated, and in a way that decisively avoids accidentally installing and testing the project with the same version of Python in each job (as happens with pipx on Windows GitHub Actions runners unless one is very careful).

None of these problems applied when we were only testing on Ubuntu, so the publishing workflow does not need to be changed. Installing poetry with pip is not ideal but I believe these considerations justify it.

Fail-fast

The default behavior of a GitHub Actions workflow with a matrix strategy is to fail fast: when any generated job fails, the other jobs generated from the same matrix (that were triggered by the same event) are canceled automatically.

I have kept this default. Having more jobs is what the default is meant for. CI is rate-limited, and when pushing many commits while code under test is temporarily not working--typically to a feature branch, including in the case of an unfinished PR--it is useful to avoid running extra test jobs. Test jobs are heavier than some other kinds of jobs because they have to install dependencies.

However, it is possible to have a situation that is broken for a specific combination of OS and Python version, where it is useful to see complete test results from the other jobs. It is also possible to make a major change and want to know everything it breaks. So failing fast has some disadvantages. But I think, overall, it makes sense. This can always be changed later, of course.

This expands the CI test matrix so that in addition to testing all three Python versions we support on Ubuntu, it also tests them on macOS and on Windows. This expands the number of jobs from three to nine. The output artifact names are expanded accordingly, so that they do not clash with each other. A less importand corresponding change is also made to the artifact name in the publishing workflow for consistency (just as it already had `3.9` in its name before, for consistency). It would be more common and idiomatic to have the `os` matrix variable take on full names like `ubuntu-latest` rather than `ubuntu`. However, examining output would be cumbersome if `latest` appeared in all the artifact filenames. Using the shorter names in `os` and specifying the OS as `${{ matrix.os }}-latest` avoids that without extra complexity. The setup-python action on Windows doesn't (or doesn't always) supply a `python3` executable, but it supplies a `python` executable on all OSes (at least for CPython, as we are using). So this changes the poetry installation command in the test workflow so it pipes the downloaded script to `python` instead of `python3`.

This changes how `poetry` is obtained in the test workflow, because on macOS the Poetry installation script puts it in a directory where it is not found. It's possible to hard-code or obtain the location and add it to `$PATH`, but this would make the workflow more complicated and harder to understand, or be brittle, or both. Installing it with `pip` in the global environment makes it work on all OSes without further configuration. It's moderately undesirable to do this, because generally it's better to avoid putting things in the global environment. However, because we are using a virtual environment to install the project and run the tests, sufficient isolation is still maintained in this case. The reason this uses pip instead of pipx, even though pipx is usually the ideal solution to such problems, is that pipx on a GitHub CI runner defaults to using the system Python (which it was installed with) even when setup-python has been used to get a different version. The effect is to install a project and run its tests in a way that looks good but actually uses the same version of Python in every generated job, unless one is extremely careful about how one runs pipx and/or poetry (which itself may increase workflow complexity). Although using `pip install poetry` may be justified for the above reasons, it is not ideal, so I have not made that change in the publishing workflow. Perhaps a better way that is not excessively complicated can be found in the future and used in both places.

Having the intended version in the name made sense before, but it is less useful now. - It is intuitive to look at a version number in a step name in the GitHub Actions web interface and assume that this guarantees that the version is really being used to run the tests. Some kinds of bugs can keep that from happening on Windows (detailed in 392c36f). In its current state, the test workflow does not suffer from any such bug, but if future modifications ever instroduce such a bug on a feature branch, having less potentially misleading restatement of the Python version can make it easier to notice the problem. The output of `pytest` always includes, near the top, an accurate statement of the Python version used to run `pytest`. - A separate reason for this change is that there are now two pieces of information that vary across generated jobs. Because one needs to know the operating system as well as the Python version, the slight added convenience of having the Python version in the step name is less significant. This change does not affect other parts of the web interface. The intended Python version number for the job is still given, together with the operating system, in the job name's parenthesized suffix.

EliahKagan added 3 commits August 2, 2023 00:38

EliahKagan force-pushed the os-jobs branch from a2828b0 to ac74ea2 Compare August 2, 2023 01:45

EliahKagan marked this pull request as ready for review August 2, 2023 01:53

EliahKagan mentioned this pull request Aug 2, 2023

Shorten workflow filenames #39

Merged

bazingagin merged commit b05a7bb into bazingagin:main Aug 3, 2023

EliahKagan deleted the os-jobs branch August 3, 2023 16:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add CI test jobs for other operating systems #38

Add CI test jobs for other operating systems #38

EliahKagan commented Aug 2, 2023 •

edited

Loading

Add CI test jobs for other operating systems #38

Add CI test jobs for other operating systems #38

Conversation

EliahKagan commented Aug 2, 2023 • edited Loading

Artifact names and the os variable

Poetry installation

Fail-fast

EliahKagan commented Aug 2, 2023 •

edited

Loading

Artifact names and the `os` variable