
Pytest coverage #417

Merged
freakboy3742 merged 7 commits into beeware:master from pytest_coverage on Jun 16, 2020
Conversation

@saroad2 (Member) commented Jun 7, 2020

Added pytest coverage to CI/CD process.

@saroad2 saroad2 requested a review from freakboy3742 June 7, 2020 21:00
@freakboy3742 (Member) left a comment

So - this all looks relatively straightforward in terms of what it is doing. My question is whether, in practice, it actually gives us any additional protection.

You've got this configured to fail at < 90% coverage (via fail_under in .coveragerc; a sketch follows the list below). I don't know what our current testing level actually is - but regardless, over time, one of two things will need to happen:

  • We end up requiring every contributor to update .coveragerc to keep the failure level just under the current coverage level. This is going to make merging PRs a nightmare, because 2 PRs that add code are going to affect the failure level in different ways. A PR that adds 10 lines of code is going to move the overall percentage differently than a PR that adds 100 lines of code. Merge PR1 before PR2, and the percentages will be all off, and we'll end up having to do complex merges just to keep the failure rate happy.

  • We keep the failure point "comfortably below" the current testing level. But then, at some point, someone is going to submit a PR that breaks that testing condition, and they're going to wear the full cost of the last N PRs that didn't have adequate testing.
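For reference, the configuration being discussed is just a few lines of .coveragerc. This is a minimal sketch of the general shape, assuming the usual coverage.py options; the exact file in this PR may differ, and branch = True is optional:

    [run]
    source = src/briefcase
    branch = True          # measure branch coverage, not just line coverage

    [report]
    fail_under = 90        # hard-fail the test run below this percentage
    show_missing = True    # print the uncovered line numbers per file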

Ok - so maybe we turn off the hard-coded failure percentage. At that point, we have coverage being computed - but it's not being surfaced anywhere it will actually be seen. So it ends up being an artefact that is generated, but never looked at by anyone, because it only appears in the build logs... which you never read unless there's a failure.

And this is ultimately why I'm skeptical about using coverage in CI: unless you have 100% code coverage, it's not an especially effective mechanism for ensuring code quality. Anything less than 100% means you have to maintain an ongoing ratchet to make sure coverage doesn't drop, and you have to make sure people aren't gaming the coverage math.

That's before we get to the question of whether line coverage (as opposed to branch coverage) is sufficient to ensure code quality. 100% line coverage is a necessary, but not sufficient property of adequate testing.

Don't get me wrong - coverage can be a really useful tool for checking if certain features are being tested at all. I'm just not sure the configuration you've described here is any more effective than a good code review process that rejects PRs that don't have adequate tests of new or modified features as part of the PR.

I'd be more inclined to support this if the coverage process provided feedback (even if it was purely advisory in the form of an automated PR comment) about lines that are added/modified by the PR that aren't in the coverage set. Do you have any thoughts about how we might be able to do that sort of integration?

@saroad2 (Member, Author) commented Jun 8, 2020

You can easily check the current coverage by taking a look at the Smoke test stage: just click on it, go to the Test step, and you'll see that for this PR the current coverage output is:

...
src/briefcase/platforms/linux/appimage.py     106      4    96%   56-57, 256-257
src/briefcase/platforms/linux/deb.py            0      0   100%
src/briefcase/platforms/linux/flatpak.py        0      0   100%
src/briefcase/platforms/linux/rpm.py            0      0   100%
src/briefcase/platforms/linux/snap.py           0      0   100%
src/briefcase/platforms/macOS/app.py           87     10    89%   65-66, 82-94, 132, 169-171
src/briefcase/platforms/macOS/dmg.py           65     34    48%   25, 28-30, 39, 60-64, 72-126
src/briefcase/platforms/macOS/homebrew.py       0      0   100%
src/briefcase/platforms/tvOS/xcode.py           0      0   100%
src/briefcase/platforms/watchOS/xcode.py        0      0   100%
src/briefcase/platforms/wearos/gradle.py        0      0   100%
src/briefcase/platforms/windows/msi.py        101     28    72%   32-67, 130-145, 180-181, 220-221, 241-242, 262-264
-------------------------------------------------------------------------
Required test coverage of 90.0% reached. Total coverage: 92.26%

As you can see, it gives you the total coverage and the actual lines that are not covered.
It's true that this is not as pretty as a PR comment, but it is workable, especially for maintainers like you and me: when we come to review a PR, we can open the Smoke Test stage and see what the coverage status is after the PR.
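The same report can also be reproduced locally. Assuming the standard pytest-cov plugin (which is what prints the "Required test coverage" line above), a command along these lines should work, though the exact source path may differ:

    python -m pytest --cov=src/briefcase --cov-report=term-missing

The term-missing report is what produces the per-file "Missing" column, and the fail_under threshold is picked up from .coveragerc.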

Another thing we could use is CodeCov, and specifically CodeCov-Action. I've never used it, but it seems like a very nice tool; I'll give it another look to see how it can be used. Note that if we want to use it for more than 5 users, it will cost us money.

Regarding that, the way I see it, there are two paths we could take from here:

  • Create a new PR that adds the missing tests to Briefcase. It might take a while, but in the end we'll get to 100% coverage. When we get there, we can set fail_under to 100, and then every new PR will have to add tests covering the lines it changes. I truly believe this is the right path, but I understand that not everybody thinks the same way about it as I do.
  • Decide on a minimum coverage for a release (for example, the 90% I set in this PR) and add the fail_under property only to the release GitHub Action (a sketch of such a step follows this list). That way, coverage won't fail any PR of any contributor, but it will fail us if we try to deploy a version that is not tested well enough: we constrain ourselves instead of contributors. I think this is the less optimal choice, but I prefer it over nothing.
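To make the second option concrete, the gate would live only in the release workflow. A hypothetical step, with the name and paths illustrative rather than taken from an actual workflow file:

    # Hypothetical step for the release workflow only; regular PR CI would
    # run pytest without the threshold.
    - name: Run tests with coverage gate
      run: python -m pytest --cov=src/briefcase --cov-fail-under=90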

What do you think? Do you think there's a third option that I missed? Please let me know and I'll be happy to take us there.

@freakboy3742 (Member) commented
I know that you can see the coverage by digging into the logs; my point is that unless there is actually a reason to look there, nobody is going to look there.

I agree that 100% coverage is the most desirable outcome; I guess I don't have a good feel for how far off 100% coverage we actually are. My guess would be that we're pretty close - the current codebase is fairly aggressively tested; but my concern is that getting the last 1-2% could prove prohibitively difficult. If it can be done, then turning on "must have 100% coverage" as a failure condition is a viable option, because every new line of code should be tested.

Making coverage a requirement for releases isn't a great option to my mind. A release should be something you can do at any time. If your day-to-day processes don't leave the code in a releasable state, then releasing becomes a cumbersome activity.

As for other options? I can't say I can think of any good ones.

So - my inclination would be to say that 100% coverage is what we should be aiming for, and to treat "the missing 10% of tests" as the prerequisite for merging this patch and committing to 100% coverage going forward.

@saroad2 (Member, Author) commented Jun 11, 2020

So, how do we want to proceed?
Do you want us to merge this PR and start filling in the missing 9% of coverage in the next commits, or do you want us to get to 100% in this PR and then add the 100% coverage requirement to the CI process?

By the way, I'm trying out CodeCov in my other project, Statue. I'll let you know how I feel about it once I'm done.

@freakboy3742 (Member) commented
My preference would be to get the remaining 9%, and then enable CI coverage. Essentially, if you turn on "fail on any missing coverage" in this PR, we can merge it when CI passes.

@saroad2 (Member, Author) commented Jun 14, 2020

So I did some more digging about CodeCov, and it seems like it does exactly what we want:

  • It is built on top of Python's coverage package and has very good integration with GitHub Actions and GitHub in general.
  • It posts a very nice coverage summary as a comment on the PR (for example, saroad2/statue#4).
  • It fails the check if the coverage percentage is reduced by the current PR (which means we can get rid of the fail_under property in .coveragerc).
  • It is configurable using a codecov.yml file (look here for more details; a minimal sketch follows this list).
  • It is free for open-source projects
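As a rough illustration of the third and fourth points, a minimal codecov.yml could look like the following. The option names are Codecov's, but the specific targets here are assumptions rather than a file from this PR:

    coverage:
      status:
        project:
          default:
            target: auto     # compare against the base commit's coverage
            threshold: 0%    # fail the check if overall coverage drops
        patch:
          default:
            target: 100%     # require the lines changed in the PR to be covered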

@freakboy3742, because I'm not the owner of the repo, only you can set up CodeCov for it. What do you think? I'm using it in my own repo and it works beautifully!

@codecov (bot) commented Jun 16, 2020

Codecov Report

❗ No coverage uploaded for pull request base (master@cd73b84). Click here to learn what that means.
The diff coverage is n/a.
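For context, that warning means no coverage report had been uploaded for the base commit yet; once CI on master uploads one, the comparison works. The upload itself is typically a single workflow step, roughly like this (the version pin is only an example):

    # Hypothetical workflow step; codecov-action picks up the coverage report
    # produced by the test run (e.g. via --cov-report=xml) and uploads it.
    - name: Upload coverage to Codecov
      uses: codecov/codecov-action@v1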

@freakboy3742 (Member) left a comment

Ok; with codecov.io it looks like we get what we need. We may need to fine-tune the reporting format, but for now this will do.

@freakboy3742 freakboy3742 merged commit f27b2fc into beeware:master Jun 16, 2020
@saroad2 saroad2 deleted the pytest_coverage branch June 16, 2020 09:38