feat(forge): add mutation test #10193

Draft
wants to merge 74 commits into
base: master

Conversation

simon-something

This is a draft PR, opened for visibility and to request feedback on the design and implementation.

As discussed in #478, this is an implementation of mutation testing for Foundry unit tests (I'd suggest closing the other PR, which uses Gambit).

Motivation

While coverage shows which lines are executed when running a test suite, mutation testing adds another level of assurance by assessing which lines are actually tested, i.e. whose behavior is checked by at least one assertion. To do so, mutations (i.e. random or constrained changes to the codebase) are introduced, the codebase is recompiled (keeping only the mutants that compile successfully), and the tests are run against this "mutated codebase". Ideally, the tests should fail against every mutant (i.e. no part of the code can be changed without being caught by at least one test).
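To make the difference between "executed" and "tested" concrete, here is a minimal sketch. It uses plain Rust as a stand-in for a Solidity contract and its tests; the function names and the `<=` to `<` mutation are illustrative assumptions, not taken from this PR.

```rust
/// Function under test: a withdrawal is allowed up to the full balance.
pub fn can_withdraw(balance: u64, amount: u64) -> bool {
    amount <= balance
}

/// A mutant a tool might generate: `<=` replaced by `<`.
pub fn can_withdraw_mutant(balance: u64, amount: u64) -> bool {
    amount < balance
}

/// Weak test: executes the comparison (so the line counts as covered)
/// but never checks the boundary `amount == balance`.
pub fn weak_test(f: fn(u64, u64) -> bool) -> bool {
    f(100, 10) && !f(100, 200)
}

/// Stronger test: same checks plus the boundary case.
pub fn strong_test(f: fn(u64, u64) -> bool) -> bool {
    weak_test(f) && f(100, 100)
}
```

Both tests give full line coverage of `can_withdraw`, but the mutant survives the weak test and is only killed by the one that checks the boundary.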

Solution

This implementation started from Certora's existing Gambit but has since diverged, keeping only some of the mutation naming. It leverages Solar to build an AST, find the mutation sites, and apply the mutations efficiently.

The flow is as follows:

  • triggered by --mutate flag of forge test
  • if a list of contracts is passed after mutate, these are the targets mutated
  • if no contract is provided, list all contracts in src/
  • for now, --mutate doesn't exclude the test filters (still not sure whether it should, though)
  • run all tests, using the cache (see below); if any test fails, stop here
  • do the following for each contract to mutate:
    • lex and parse it into an AST (using Solar)
    • visit this AST and, for each expression, generate every possible mutation (e.g. x = 4 will be mutated into x = 0 and x = -4)
    • for each mutant:
      • create a temp dir
      • copy the cache, out, test and src (excluding the target contract)
      • emit the solidity code of the mutant in src
      • (try to) compile it, using the cache and out (this uses foundry-compiler, to stay "compiler-agnostic" if/when Solar is used, even though it should then be refactored to avoid writing Solidity to disk and start from the mutated AST instead)
      • if it compiles successfully, run the tests and record the outcome
  • collect and display all outcomes, in terms of invalid, dead and surviving mutants <-- current status (for now, it's just some prints)
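The flow above can be sketched roughly as follows. This is a simplified model, not the code in this PR: `run_campaign`, the `Outcome` enum and the two closures (standing in for the compile step and the test run) are hypothetical names, and mutants are plain strings instead of files in a temp dir.

```rust
/// Possible results for a single mutant (names follow the PR description).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    /// The mutant did not compile.
    Invalid,
    /// At least one test failed against the mutant.
    Dead,
    /// Every test still passed: the mutation went unnoticed.
    Survived,
}

/// Hypothetical driver: `compiles` and `tests_pass` stand in for the real
/// compile and test steps, which work on a temp copy of the project.
pub fn run_campaign<C, T>(
    original: &str,
    mutants: &[&str],
    compiles: C,
    tests_pass: T,
) -> Result<Vec<Outcome>, String>
where
    C: Fn(&str) -> bool,
    T: Fn(&str) -> bool,
{
    // Baseline run: a failing suite would make every mutant look "dead".
    if !tests_pass(original) {
        return Err("test suite fails on the unmutated code".to_string());
    }

    Ok(mutants
        .iter()
        .map(|src| {
            if !compiles(src) {
                Outcome::Invalid
            } else if tests_pass(src) {
                Outcome::Survived
            } else {
                Outcome::Dead
            }
        })
        .collect())
}
```

The baseline check mirrors the "if one test fails, interrupt here" step: without it, a suite that is already red would report every mutant as dead.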

The mutations are stored in a mutator module, so they can easily be added or modified (even if this is kept as a temporary design, it will help with designing them imo).
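As an illustration of what such a module could look like, here is a minimal sketch built on a toy expression type. The `Mutator` trait, its method names and the `Expr` enum are assumptions made for this example; they are not the trait from this PR and not Solar's AST types.

```rust
/// Toy expression type standing in for an AST node (not Solar's AST).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Assign { target: String, value: i64 },
    Binary { lhs: String, op: char, rhs: String },
}

/// Hypothetical mutator interface: each mutator proposes zero or more
/// mutated copies of an expression.
pub trait Mutator {
    fn name(&self) -> &'static str;
    fn mutate(&self, expr: &Expr) -> Vec<Expr>;
}

/// `x = 4` becomes `x = 0` and `x = -4`, as in the example above.
pub struct AssignmentMutator;

impl Mutator for AssignmentMutator {
    fn name(&self) -> &'static str {
        "assignment"
    }

    fn mutate(&self, expr: &Expr) -> Vec<Expr> {
        match expr {
            // Zero is skipped: both mutations would equal the original.
            Expr::Assign { target, value } if *value != 0 => vec![
                Expr::Assign { target: target.clone(), value: 0 },
                Expr::Assign { target: target.clone(), value: value.wrapping_neg() },
            ],
            _ => Vec::new(),
        }
    }
}

/// Swaps `+` and `-` in binary expressions.
pub struct BinaryOpMutator;

impl Mutator for BinaryOpMutator {
    fn name(&self) -> &'static str {
        "binary_op"
    }

    fn mutate(&self, expr: &Expr) -> Vec<Expr> {
        match expr {
            Expr::Binary { lhs, op, rhs } if *op == '+' || *op == '-' => {
                let swapped = if *op == '+' { '-' } else { '+' };
                vec![Expr::Binary { lhs: lhs.clone(), op: swapped, rhs: rhs.clone() }]
            }
            _ => Vec::new(),
        }
    }
}

/// Runs every registered mutator and tags each mutant with its origin.
pub fn all_mutants(mutators: &[Box<dyn Mutator>], expr: &Expr) -> Vec<(&'static str, Expr)> {
    mutators
        .iter()
        .flat_map(|m| m.mutate(expr).into_iter().map(move |e| (m.name(), e)))
        .collect()
}
```

With this shape, adding a mutation means adding one more `impl Mutator` and registering it in the list passed to `all_mutants`.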

Besides the (many) todos in the code, some things to work on are:

  • tests+++, doc, squash, etc.
  • test shouldn't be copied into the temp folder (update the filter instead? I tried before, unsuccessfully)
  • add a way to resume a mutation campaign
  • implement duplicate-mutant detection (trivial cases at least: compare bytecode hashes?) - see the original issue
  • review other mutation strategies (vertigo and universalmutator?) or poke the ones not implemented by Gambit (https://github.com/Certora/gambit/blob/bf7ab3c91c47a10dcf272380b6406f0404f3b5d1/src/mutation.rs#L323 for instance)
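For the "compare bytecode hash" idea, a minimal sketch could look like the one below. It assumes the compiled bytecode of each mutant is available as raw bytes; `unique_mutants` and `bytecode_hash` are hypothetical helpers, and the standard library's `DefaultHasher` is used only to keep the example dependency-free (a real implementation would more likely use a cryptographic hash such as keccak256).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hashes raw bytecode. A 64-bit hash can collide, so this only suits a sketch.
fn bytecode_hash(bytecode: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytecode.hash(&mut hasher);
    hasher.finish()
}

/// Returns the indices of the mutants worth testing: those whose bytecode
/// differs from the original and from every mutant seen before them.
pub fn unique_mutants(original: &[u8], mutants: &[Vec<u8>]) -> Vec<usize> {
    let mut seen = HashSet::new();
    // A mutant compiling to the original bytecode cannot be killed by any test.
    seen.insert(bytecode_hash(original));

    mutants
        .iter()
        .enumerate()
        // `insert` returns false when the hash is already present.
        .filter(|(_, code)| seen.insert(bytecode_hash(code.as_slice())))
        .map(|(index, _)| index)
        .collect()
}
```

This only catches the trivial cases mentioned above: two mutants that behave identically but compile to different bytecode would still both be tested.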

PR Checklist

  • Added Tests
  • Added Documentation
  • Breaking changes

@simon-something
Author

simon-something commented Mar 28, 2025

@grandizzy new PR just dropped!

Todo list at the current commit:

  • make mutator more modular:
    -- create a more generic test for mutators
    -- config to add/exclude mutators (e.g. keeping the longest-running ones for CI only), I guess a list in the toml would do it
    -> Some mutators which are missing but would be nice to have: inlined Yul mutations, swap lines for external calls (from universalmutator)

  • general metrics:
    -- "standard" reporting (ie (in)valid, dead, survivors and the mutation sites)
    -- would be interesting to collect and output the "mutation coverage" too (ie which part of the src has or has not been mutated)
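A possible shape for these metrics is sketched below. The `Report` struct, the per-line granularity and the score definition (dead mutants divided by all mutants that compiled) are assumptions for illustration; the PR does not define them.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Invalid,
    Dead,
    Survived,
}

/// One mutant, identified here only by the source line it touched.
pub struct MutantResult {
    pub line: usize,
    pub outcome: Outcome,
}

#[derive(Debug, Default, PartialEq)]
pub struct Report {
    pub invalid: usize,
    pub dead: usize,
    pub survived: usize,
    /// Mutation sites whose mutant went unnoticed, sorted and deduplicated.
    pub surviving_lines: Vec<usize>,
}

impl Report {
    pub fn from_results(results: &[MutantResult]) -> Self {
        let mut report = Report::default();
        let mut surviving = BTreeSet::new();
        for result in results {
            match result.outcome {
                Outcome::Invalid => report.invalid += 1,
                Outcome::Dead => report.dead += 1,
                Outcome::Survived => {
                    report.survived += 1;
                    surviving.insert(result.line);
                }
            }
        }
        report.surviving_lines = surviving.into_iter().collect();
        report
    }

    /// Share of compiling mutants that were killed; `None` if none compiled.
    pub fn mutation_score(&self) -> Option<f64> {
        let valid = self.dead + self.survived;
        if valid == 0 {
            None
        } else {
            Some(self.dead as f64 / valid as f64)
        }
    }
}

/// "Mutation coverage": lines (1-based) for which no mutant was generated.
pub fn unmutated_lines(total_lines: usize, results: &[MutantResult]) -> Vec<usize> {
    let mutated: BTreeSet<usize> = results.iter().map(|r| r.line).collect();
    (1..=total_lines).filter(|line| !mutated.contains(line)).collect()
}
```

Invalid mutants are left out of the score on purpose, since they say nothing about the test suite.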

@chandrakananandi

Hi @simon-something sorry for my late response, I have been meaning to get back to you for some weeks but was swamped with other things. I am one of the developers of Gambit.

I am super stoked by this work and also the work that Sam Parsky did in the past!

You mentioned that you started with Gambit but then deviated from it. Could you share what led to that? Maybe some of your work can be integrated with Gambit? I would happily merge any PRs!

I would also be curious to know if you have suggestions on how to improve Gambit. I already know that one dimension in which Gambit can be improved is the quality of mutants. But are there other aspects that could benefit based on your usage? Maybe make it easier to integrate with Foundry?

I noticed you mentioned that you used Solar. We, in the past tried to use this library: https://crates.io/crates/solang-parser (you can see some of the work done by Ben Kushigian here: Certora/gambit#30) but I recall that it was not super stable at the time. How has your experience been using Solar?

@grandizzy
Collaborator

@grandizzy new PR just dropped!

thank you! checking, will provide feedback

@simon-something
Author

simon-something commented Apr 2, 2025

Hi @chandrakananandi! No worries, my initial question was mostly "is this still being worked on?", and I was able to get in touch with Sam, so all good!

You mentioned that you started with Gambit but then deviated from it. Could you share what led to that? Maybe some of your work can be integrated with Gambit? I would happily merge any PRs!

It's mostly a matter of Foundry integration: I'm reusing the previous build to speed up compile time and only compile incrementally (so, iic, "bare" solc wouldn't handle buildIds), and I'm using Solar's AST types, lexing/parsing and visitor to be ready to compile with it (ideally, the mutated AST wouldn't be written to a temp folder anymore) -> I think building it "inside" Foundry was easier!

I would also be curious to know if you have suggestions on how to improve Gambit. I already know that one dimension in which Gambit can be improved is the quality of mutants. But are there other aspects that could benefit based on your usage? Maybe make it easier to integrate with Foundry?

Maybe something which could be great for Gambit too (and another reason why I didn't reuse it) is having more flexibility in mutations -> even though mutation types aren't really meant to evolve a lot, having an easy way to add/remove them is great to build/study them imo. By this, I mean adding new mutations to the repo or having a custom list of mutators to use in a given run (I still need to implement the latter). I noticed some mutation types were commented out/not implemented in Gambit (for instance the "swap lines of an external call/reentrancy-like" one from universalmutator), so having more flexibility might help (I personally use a mutator trait https://github.com/simon-something/foundry/blob/c0ffeec20a10ce26638c344f1a68bc925ba5a46d/crates/forge/src/mutation/mutators/mod.rs#L15 and tried really hard to keep the tests welcoming for anyone who wants to play around with mutations https://github.com/simon-something/foundry/tree/master/crates/forge/src/mutation/mutators/tests). Maybe this is something worth considering for Gambit (as it simplifies playing with mutators)? Down to help (personally/@defi-wonderland has no free bandwidth for now;)!
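A custom list of mutators for a given run could be modelled roughly like this. Everything here is hypothetical: the `MutatorConfig` struct, its `include`/`exclude` fields and the mutator names are made up for the example and do not reflect an existing Foundry or Gambit option.

```rust
/// Hypothetical per-run selection, e.g. deserialized from a config file.
#[derive(Debug, Default)]
pub struct MutatorConfig {
    /// If non-empty, only these mutators run.
    pub include: Vec<String>,
    /// Mutators to skip; applied after `include`.
    pub exclude: Vec<String>,
}

/// Keeps the available mutator names allowed by the config, in their original order.
pub fn select_mutators<'a>(available: &[&'a str], config: &MutatorConfig) -> Vec<&'a str> {
    available
        .iter()
        .copied()
        .filter(|name| {
            config.include.is_empty() || config.include.iter().any(|n| n.as_str() == *name)
        })
        .filter(|name| !config.exclude.iter().any(|n| n.as_str() == *name))
        .collect()
}
```

Letting `exclude` win over `include` keeps the rule simple: a fast local run can drop the slow mutators while CI runs with an empty config and gets everything.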

I share the "improve mutant quality" todo too (same for hunting duplicates) :D

I noticed you mentioned that you used Solar. We, in the past tried to use this library: https://crates.io/crates/solang-parser (you can see some of the work done by Ben Kushigian here: Certora/gambit#30) but I recall that it was not super stable at the time. How has your experience been using Solar?

Solar is really nice to integrate (at least for lexing/parsing/visiting) - iirc, solang uses some kind of query? Solar, on the other hand, has a macro to generate a visitor, so it's straightforward to add logic there. Did you consider integrating it into Gambit?
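To show why a visitor makes it easy to hook logic in, here is a hand-written sketch over a toy AST. It does not use Solar's API or reproduce its macro-generated trait; `Expr`, `Visit` and `MutationSites` are assumed names for this example only.

```rust
/// Toy AST (not Solar's): enough to nest expressions.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Lit(i64),
    Ident(String),
    Binary(Box<Expr>, char, Box<Expr>),
    Assign(String, Box<Expr>),
}

/// Visitor with a default traversal; implementors override `visit_expr`
/// and call `walk_expr` to keep descending into child nodes.
pub trait Visit {
    fn visit_expr(&mut self, expr: &Expr) {
        self.walk_expr(expr);
    }

    fn walk_expr(&mut self, expr: &Expr) {
        match expr {
            Expr::Binary(lhs, _, rhs) => {
                self.visit_expr(lhs);
                self.visit_expr(rhs);
            }
            Expr::Assign(_, value) => self.visit_expr(value),
            Expr::Lit(_) | Expr::Ident(_) => {}
        }
    }
}

/// Collects a description of every node a mutator could target.
#[derive(Default)]
pub struct MutationSites {
    pub sites: Vec<String>,
}

impl Visit for MutationSites {
    fn visit_expr(&mut self, expr: &Expr) {
        match expr {
            Expr::Lit(n) => self.sites.push(format!("literal {n}")),
            Expr::Binary(_, op, _) => self.sites.push(format!("operator {op}")),
            _ => {}
        }
        // Continue into the children so nested sites are found too.
        self.walk_expr(expr);
    }
}
```

The traversal lives in one place (`walk_expr`), so a new kind of mutation site only needs one more arm in `visit_expr`.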
