Add annotations for evidence on package locations #1723

Merged
wagoodman merged 12 commits into main from add-location-annotations
Apr 13, 2023


wagoodman
Contributor

@wagoodman wagoodman commented Apr 7, 2023

Adds the concept of location annotations, which allow arbitrary key-value pairs to be attached to source.Location objects. This PR also includes the first use of them, in the dpkg cataloger:

[
  {
    "id": "3e9282034226b93f",
    "name": "adduser",
    "version": "3.118",
    "type": "deb",
    "foundBy": "dpkgdb-cataloger",
    "locations": [
      {
        "path": "/usr/share/doc/adduser/copyright",
        "layerID": "sha256:ec09eb83ea031896df916feb3a61cefba9facf449c8a55d88667927538dca2b4",
        "annotations": {
          "evidence": "supporting"
        }
      },
      {
        "path": "/var/lib/dpkg/info/adduser.conffiles",
        "layerID": "sha256:ec09eb83ea031896df916feb3a61cefba9facf449c8a55d88667927538dca2b4",
        "annotations": {
          "evidence": "supporting"
        }
      },
      {
        "path": "/var/lib/dpkg/info/adduser.md5sums",
        "layerID": "sha256:ec09eb83ea031896df916feb3a61cefba9facf449c8a55d88667927538dca2b4",
        "annotations": {
          "evidence": "supporting"
        }
      },
      {
        "path": "/var/lib/dpkg/status",
        "layerID": "sha256:ec09eb83ea031896df916feb3a61cefba9facf449c8a55d88667927538dca2b4",
        "annotations": {
          "evidence": "primary"
        }
      }
    ],
    "licenses": [
      "GPL-2"
    ],
...
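
For consumers of this output, the evidence annotation can be used to separate the file that established the package from the files that merely support it. The following is a minimal sketch, not syft code: the `location` struct and the `pathsWithEvidence` helper are hypothetical and mirror only the fields visible in the excerpt above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// location mirrors only the fields shown in the JSON excerpt (hypothetical type).
type location struct {
	Path        string            `json:"path"`
	LayerID     string            `json:"layerID"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

// pathsWithEvidence returns the paths whose "evidence" annotation equals kind.
// Locations without annotations never match, since reading a nil map yields "".
func pathsWithEvidence(locations []location, kind string) []string {
	var paths []string
	for _, l := range locations {
		if l.Annotations["evidence"] == kind {
			paths = append(paths, l.Path)
		}
	}
	return paths
}

func main() {
	raw := `[
	  {"path": "/usr/share/doc/adduser/copyright", "annotations": {"evidence": "supporting"}},
	  {"path": "/var/lib/dpkg/status", "annotations": {"evidence": "primary"}}
	]`
	var locations []location
	if err := json.Unmarshal([]byte(raw), &locations); err != nil {
		panic(err)
	}
	fmt.Println(pathsWithEvidence(locations, "primary"))
}
```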

Notes:

  • source.LocationSet requires source.Location to be hashable so that it can be used as a key in a map. That is no longer possible once annotations, which are themselves maps, are attached directly to the Location struct. For this reason I've split the position information and the metadata within the Location struct into separate structs and added those as embeddings.
  • The source.LocationSet merges metadata for any two locations whose position information is the same, essentially merging the locations. This seems like the right call relative to the package struct, but it does not handle duplicate keys with differing values gracefully; that case logs a warning.
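
The two notes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the type and field names are hypothetical, and keeping the first value on a conflicting key is an assumption (the notes only say that a warning is logged).

```go
package main

import (
	"fmt"
	"log"
)

// coordinates holds only comparable fields, so it can serve as a map key.
type coordinates struct {
	RealPath     string
	FileSystemID string
}

// location embeds the hashable position data next to the non-hashable annotations map.
type location struct {
	coordinates
	Annotations map[string]string
}

// locationSet keys its entries by coordinates only.
type locationSet struct {
	items map[coordinates]location
}

func newLocationSet() *locationSet {
	return &locationSet{items: make(map[coordinates]location)}
}

// Add stores l, merging its annotations into any entry with the same coordinates.
// Assumption: on a duplicate key with a different value, the existing value wins
// and a warning is logged.
func (s *locationSet) Add(l location) {
	merged := location{coordinates: l.coordinates, Annotations: map[string]string{}}
	if existing, ok := s.items[l.coordinates]; ok {
		for k, v := range existing.Annotations {
			merged.Annotations[k] = v
		}
	}
	for k, v := range l.Annotations {
		if old, ok := merged.Annotations[k]; ok && old != v {
			log.Printf("warning: conflicting annotation %q for %s: keeping %q, dropping %q",
				k, l.RealPath, old, v)
			continue
		}
		merged.Annotations[k] = v
	}
	s.items[l.coordinates] = merged
}

func main() {
	c := coordinates{RealPath: "/var/lib/dpkg/status", FileSystemID: "layer-1"}
	s := newLocationSet()
	s.Add(location{coordinates: c, Annotations: map[string]string{"evidence": "primary"}})
	s.Add(location{coordinates: c, Annotations: map[string]string{"evidence": "supporting"}})
	fmt.Println(len(s.items), s.items[c].Annotations["evidence"])
}
```

Keying the set by the embedded coordinates struct is what keeps the map usable even though the full location contains a map field.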

@wagoodman wagoodman requested a review from a team April 7, 2023 15:13
@wagoodman wagoodman added the enhancement New feature or request label Apr 7, 2023
@github-actions

github-actions bot commented Apr 7, 2023

Benchmark Test Results

Benchmark results from the latest changes vs base branch
goos: linux
goarch: amd64
pkg: github.com/anchore/syft/test/integration
cpu: Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
                                                          │ ./.tmp/benchmark-7081b95.txt │
                                                          │            sec/op            │
ImagePackageCatalogers/alpmdb-cataloger-2                                   11.53m ± 27%
ImagePackageCatalogers/ruby-gemspec-cataloger-2                             835.9µ ±  1%
ImagePackageCatalogers/python-package-cataloger-2                           3.006m ±  2%
ImagePackageCatalogers/php-composer-installed-cataloger-2                   681.1µ ±  2%
ImagePackageCatalogers/javascript-package-cataloger-2                       350.2µ ±  1%
ImagePackageCatalogers/dpkgdb-cataloger-2                                   497.5µ ±  1%
ImagePackageCatalogers/rpm-db-cataloger-2                                   478.6µ ±  1%
ImagePackageCatalogers/java-cataloger-2                                     11.02m ±  1%
ImagePackageCatalogers/graalvm-native-image-cataloger-2                     7.900µ ±  3%
ImagePackageCatalogers/apkdb-cataloger-2                                    544.5µ ±  1%
ImagePackageCatalogers/go-module-binary-cataloger-2                         17.94µ ±  2%
ImagePackageCatalogers/dotnet-deps-cataloger-2                              949.6µ ±  1%
ImagePackageCatalogers/portage-cataloger-2                                  333.0µ ±  1%
ImagePackageCatalogers/nix-store-cataloger-2                                225.7µ ±  2%
ImagePackageCatalogers/sbom-cataloger-2                                     107.8µ ±  1%
ImagePackageCatalogers/binary-cataloger-2                                   185.4µ ±  0%
geomean                                                                     440.3µ

                                                          │ ./.tmp/benchmark-7081b95.txt │
                                                          │             B/op             │
ImagePackageCatalogers/alpmdb-cataloger-2                                   5.063Mi ± 0%
ImagePackageCatalogers/ruby-gemspec-cataloger-2                             124.7Ki ± 0%
ImagePackageCatalogers/python-package-cataloger-2                           950.6Ki ± 0%
ImagePackageCatalogers/php-composer-installed-cataloger-2                   156.4Ki ± 0%
ImagePackageCatalogers/javascript-package-cataloger-2                       91.34Ki ± 0%
ImagePackageCatalogers/dpkgdb-cataloger-2                                   145.9Ki ± 0%
ImagePackageCatalogers/rpm-db-cataloger-2                                   170.8Ki ± 0%
ImagePackageCatalogers/java-cataloger-2                                     2.758Mi ± 0%
ImagePackageCatalogers/graalvm-native-image-cataloger-2                     1.555Ki ± 0%
ImagePackageCatalogers/apkdb-cataloger-2                                    129.7Ki ± 0%
ImagePackageCatalogers/go-module-binary-cataloger-2                         3.133Ki ± 0%
ImagePackageCatalogers/dotnet-deps-cataloger-2                              315.6Ki ± 0%
ImagePackageCatalogers/portage-cataloger-2                                  78.32Ki ± 0%
ImagePackageCatalogers/nix-store-cataloger-2                                40.72Ki ± 0%
ImagePackageCatalogers/sbom-cataloger-2                                     13.58Ki ± 0%
ImagePackageCatalogers/binary-cataloger-2                                   29.94Ki ± 0%
geomean                                                                     103.0Ki

                                                          │ ./.tmp/benchmark-7081b95.txt │
                                                          │          allocs/op           │
ImagePackageCatalogers/alpmdb-cataloger-2                                    86.72k ± 0%
ImagePackageCatalogers/ruby-gemspec-cataloger-2                              2.053k ± 0%
ImagePackageCatalogers/python-package-cataloger-2                            15.50k ± 0%
ImagePackageCatalogers/php-composer-installed-cataloger-2                    3.460k ± 0%
ImagePackageCatalogers/javascript-package-cataloger-2                        1.207k ± 0%
ImagePackageCatalogers/dpkgdb-cataloger-2                                    2.652k ± 0%
ImagePackageCatalogers/rpm-db-cataloger-2                                    3.761k ± 0%
ImagePackageCatalogers/java-cataloger-2                                      38.98k ± 0%
ImagePackageCatalogers/graalvm-native-image-cataloger-2                       40.00 ± 0%
ImagePackageCatalogers/apkdb-cataloger-2                                     3.439k ± 0%
ImagePackageCatalogers/go-module-binary-cataloger-2                           101.0 ± 0%
ImagePackageCatalogers/dotnet-deps-cataloger-2                               5.013k ± 0%
ImagePackageCatalogers/portage-cataloger-2                                   1.545k ± 0%
ImagePackageCatalogers/nix-store-cataloger-2                                  762.0 ± 0%
ImagePackageCatalogers/sbom-cataloger-2                                       392.0 ± 0%
ImagePackageCatalogers/binary-cataloger-2                                     872.0 ± 0%
geomean                                                                      2.083k

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
@wagoodman wagoodman changed the title Add location annotations Add annotations for evidence on package locations Apr 12, 2023
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
@wagoodman wagoodman force-pushed the add-location-annotations branch from 7b3870b to bf30d08 Compare April 12, 2023 20:20
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
@wagoodman
Contributor Author

JSON schema diff for reviewers:

# ❯ diff schema/json/schema-7.1.1.json schema/json/schema-7.1.2.json
817,818c817,823
<         "virtualPath": {
<           "type": "string"
---
>         "annotations": {
>           "patternProperties": {
>             ".*": {
>               "type": "string"
>             }
>           },
>           "type": "object"
946c951
<             "$ref": "#/$defs/Coordinates"
---
>             "$ref": "#/$defs/Location"
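
In Go terms, the new "annotations" schema entry describes a JSON object whose values are all strings, i.e. something that decodes into a map[string]string. As a rough sketch for anyone checking documents by hand, the helper below (a hypothetical name, not part of syft) tests that shape using only encoding/json; it is not a substitute for validating against the full schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validAnnotations reports whether raw is a JSON object whose values are all strings,
// which is the shape the "annotations" schema entry describes.
func validAnnotations(raw []byte) bool {
	var obj map[string]interface{}
	// A JSON null decodes without error but leaves obj nil, so reject that too.
	if err := json.Unmarshal(raw, &obj); err != nil || obj == nil {
		return false
	}
	for _, v := range obj {
		if _, ok := v.(string); !ok {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validAnnotations([]byte(`{"evidence": "primary"}`)))
	fmt.Println(validAnnotations([]byte(`{"evidence": 1}`)))
}
```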

@wagoodman wagoodman marked this pull request as ready for review April 12, 2023 20:26
Contributor

@kzantow kzantow left a comment

Overall 👍 just a few comments before ✅

"github.com/anchore/syft/syft/source"
)

func newPackage(classifier classifier, location source.Location, matchMetadata map[string]string) []pkg.Package {
Contributor

I think this should continue to be named singlePackage, as it's returning a slice, not just a package. If it were returning a pkg.Package, it should definitely be named newPackage, but I wanted it to be very clear that it wasn't returning more than one.

Contributor Author

You're right, this is a little confusing. I was trying to bring the package constructor naming and file organization in line with the other catalogers. I think having something named singlePackage that returns []pkg.Package is confusing and seems to be more of a convenience for the caller. To make this clearer on all fronts, I think changing the signature to newPackage(...) *pkg.Package makes the most sense; the caller would then wrap the result in a slice itself to fulfill its obligations as a parser function.
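
The proposed shape could look roughly like the sketch below. It is a simplified illustration, not the change that landed: `Package` stands in for pkg.Package, the classifier and source.Location parameters are omitted, and returning nil when no version was matched is an assumption made for the example.

```go
package main

import "fmt"

// Package is a stand-in for pkg.Package with only the fields needed here.
type Package struct {
	Name    string
	Version string
}

// newPackage builds at most one package. Assumption for this sketch: it returns
// nil when the match metadata carries no version.
func newPackage(matchMetadata map[string]string) *Package {
	version, ok := matchMetadata["version"]
	if !ok {
		return nil
	}
	return &Package{Name: matchMetadata["name"], Version: version}
}

// parse plays the role of the parser function, whose contract is to return a slice.
// Wrapping the single result happens here rather than inside the constructor.
func parse(matchMetadata map[string]string) []Package {
	p := newPackage(matchMetadata)
	if p == nil {
		return nil
	}
	return []Package{*p}
}

func main() {
	fmt.Println(parse(map[string]string{"name": "busybox", "version": "1.36.0"}))
}
```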

syft/pkg/cataloger/elixir/parse_mix_lock_test.go (outdated, resolved)
syft/pkg/cataloger/erlang/parse_rebar_lock_test.go (outdated, resolved)
Coordinates `cyclonedx:""` // Empty string here means there is no intermediate property name, e.g. syft:locations:0:path without "coordinates"
// note: it is IMPORTANT to ignore anything but the coordinates for a Location when considering the ID (hash value)
// since the coordinates are the minimally correct ID for a location (symlinks should not come into play)
VirtualPath string `hash:"ignore" json:"virtualPath,omitempty"` // The path to the file which may or may not have hardlinks / symlinks
ref file.Reference `hash:"ignore"` // The file reference relative to the stereoscope.FileCatalog that has more information about this location.
VirtualPath string `hash:"ignore" json:"-"` // The path to the file which may or may not have hardlinks / symlinks
Contributor

Why are we excluding VirtualPath from JSON output?

Contributor Author

Now that the source.Location is used instead of source.Coordinates in the json model, I updated the json struct tags to reflect the same data shape that is supported by the current schema. Coordinates don't convey the virtual path https://github.com/anchore/syft/blob/v0.77.0/syft/formats/syftjson/model/package.go#L29 so to be consistent this struct tag was changed to -.

We could change this, but I'd recommend that in a follow up PR.
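
For readers unfamiliar with the tag, a `json:"-"` struct tag makes encoding/json skip the field entirely, which is how the serialized shape can stay equal to what Coordinates produced. A small sketch with stand-in types follows; the `Coordinates` and `Location` definitions here are simplified assumptions (the "path" and "layerID" keys are taken from the JSON output earlier in this PR), not the actual syft structs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Coordinates is a simplified stand-in for source.Coordinates.
type Coordinates struct {
	RealPath     string `json:"path"`
	FileSystemID string `json:"layerID,omitempty"`
}

// Location is a simplified stand-in for source.Location.
type Location struct {
	Coordinates
	VirtualPath string            `json:"-"` // never written to JSON
	Annotations map[string]string `json:"annotations,omitempty"`
}

// encode marshals a Location and returns the JSON text.
func encode(l Location) string {
	b, err := json.Marshal(l)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	l := Location{
		Coordinates: Coordinates{RealPath: "/var/lib/dpkg/status", FileSystemID: "sha256:abc"},
		VirtualPath: "/var/lib/dpkg/status",
		Annotations: map[string]string{"evidence": "primary"},
	}
	// The output contains path, layerID, and annotations, but no virtualPath key.
	fmt.Println(encode(l))
}
```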

syft/source/location.go (outdated, resolved)
syft/source/location_set.go (outdated, resolved)
syft/source/location_set.go (outdated, resolved)
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
@wagoodman wagoodman merged commit 5d156b8 into main Apr 13, 2023
@wagoodman wagoodman deleted the add-location-annotations branch April 13, 2023 21:02
@spiffcs
Contributor

spiffcs commented Apr 13, 2023

This one looks good to me - no other comments to add besides what already exists on the PR. I've started incorporating the location changes into the license evidence PR so 👍 thanks for making this clear and easy to follow

spiffcs added a commit that referenced this pull request Apr 17, 2023
* main: (35 commits)
  Fix kernel cataloger test fixtures (#1742)
  feat: Support scanning license files in golang packages over the network (#1630)
  Add package-to-file location evidence relationships (#1698)
  Add Linux Kernel cataloger (#1694)
  Add annotations for evidence on package locations (#1723)
  add format make target (#1733)
  Update tests to not fail on Mac M1's. (#1730)
  chore(deps): update bootstrap tools to latest versions (#1728)
  Add support for nar files. (#1727)
  add highlevel details about catalogers (#1726)
  chore(deps): bump golang.org/x/net from 0.8.0 to 0.9.0 (#1722)
  chore(deps): update stereoscope to e95d60a265e384df29b7a139f5c5402d6ad72e06 (#1721)
  feat: gradle lockfile support (#1719)
  chore(deps): bump github.com/docker/docker (#1715)
  chore(deps): bump golang.org/x/mod from 0.9.0 to 0.10.0 (#1713)
  chore(deps): bump golang.org/x/term from 0.6.0 to 0.7.0 (#1714)
  chore(deps): bump github.com/spf13/cobra from 1.6.1 to 1.7.0 (#1716)
  chore(deps): bump peter-evans/create-pull-request from 4 to 5 (#1712)
  chore: update tools-golang to v0.5.0 (#1717)
  Add Nix cataloger (#1696)
  ...

Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024
* add location annotations + deb evidence annotations

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* rename LocationData struct and Annotation helper function

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* add failing integration test for evidence coverage

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* add evidence to alpm cataloger locations

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* change location annotation helper to return a location copy

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* add evidence to binary cataloger locations

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* updated remaining catalogers with location annotations

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* fix unit tests

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* fix linting

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* bump json schema

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* partial addressing of review comments

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* rename location.WithAnnotation

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

---------

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>