Want to capture similar labels that don't affect statistics negatively #131

Open
wagoodman opened this issue Sep 13, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@wagoodman
Contributor

Today the same match can be found in more than one way. When labeling these matches there tends to be one match that is "most correct" and others that are "technically correct". For instance, suppose one version of grype does not consider the RPM epoch but a future version does: the label for the match that includes the epoch is more correct, but that doesn't mean the original label without the epoch is wrong.

The problem is that unless each of these matches is labeled as either TP or FP, quality gates will start failing over time due to a high number of indeterminate matches. An indeterminate match is a match that:

  • is unlabeled
  • has been explicitly labeled as unclear
  • has multiple labels that conflict (both TP and FP labels, for example)
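
To make the definition above concrete, here is a minimal sketch of how a match could be classified as indeterminate from its labels. The `Label` enum and `is_indeterminate` function are hypothetical names for illustration, not the project's actual API, and treating any unclear label as indeterminate (even alongside a TP or FP label) is an assumption.

```python
from enum import Enum


class Label(Enum):
    # Hypothetical label kinds, named after the terms used in this issue.
    TP = "TP"
    FP = "FP"
    UNCLEAR = "unclear"


def is_indeterminate(labels):
    """Return True if the labels on a match leave its status undecided."""
    kinds = set(labels)
    if not kinds:
        return True  # unlabeled
    if Label.UNCLEAR in kinds:
        return True  # explicitly labeled as unclear (assumed to win over other labels)
    # conflicting labels: both TP and FP are present
    return Label.TP in kinds and Label.FP in kinds
```

With this sketch, `is_indeterminate([])` and `is_indeterminate([Label.TP, Label.FP])` are both true, while a match labeled only TP is not indeterminate.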

There are a couple paths forward here:

  1. Don't count unclear labels as indeterminate. The advantage is that gating can still pass. If the behavior regresses and the paired TP match is not made (but the unclear match is), the F1 score goes down, which is ultimately the desired behavior.
  2. Add the concept of a TP alias label, where a label can stand in as a TP for another label but does not count as a FN if its match is missing. The advantage is that the F1 score stays the same in either matching scenario: whether the match found is "better" or "not as good", as long as at least one of them is found, nothing is counted against you.
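
The difference between the two options can be sketched with a small F1 calculation. This is only an illustration under assumptions: `f1_with_aliases`, the label IDs, and the idea of modeling each expected finding as a set of interchangeable label IDs are all hypothetical and not part of any existing implementation.

```python
def f1_with_aliases(alias_groups, found, false_positives=0):
    """Compute F1 where each group is one expected finding plus its TP aliases.

    A group counts as a single TP if any of its label IDs was matched and as
    a single FN only if none of them were. Label IDs in `found` that belong
    to no group are ignored (neither TP nor FP).
    """
    tp = sum(1 for group in alias_groups if group & found)
    fn = len(alias_groups) - tp
    denominator = 2 * tp + false_positives + fn
    return 2 * tp / denominator if denominator else 0.0


# Option 2: the match without the epoch is an alias of the match with it.
aliased = [{"with-epoch", "without-epoch"}, {"other"}]
# Option 1: the match without the epoch is labeled unclear and simply ignored.
unaliased = [{"with-epoch"}, {"other"}]

# Regression scenario: only the "not as good" match is found.
found = {"without-epoch", "other"}
score_option_2 = f1_with_aliases(aliased, found)    # unchanged by the regression
score_option_1 = f1_with_aliases(unaliased, found)  # drops: "with-epoch" is a FN
```

In this sketch option 2 keeps the score at 1.0 whichever of the two matches is found, while option 1 lets the score fall when the preferred TP match disappears.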
@wagoodman added the enhancement label Sep 13, 2023