SyftCLIScanner: support SBOM generation with syft CLI #602

Merged
merged 1 commit into from
Sep 17, 2024

Conversation

arjun024
Member

@arjun024 arjun024 commented Sep 14, 2024

Packit currently supports SBOM generation with syft tooling by using syft's Go library. This has placed a significant maintenance burden on packit maintainers. This commit adds a mechanism for buildpack authors to use the syft CLI to generate SBOMs instead. The intention is that, once this is widely adopted, we can phase out the code that uses the syft Go library and thereby relieve the maintainers of this pain.

Until recently, syft did not allow consumers to specify the exact schema version of the SBOM media type they wanted generated (the tooling now supports passing a version for CycloneDX and SPDX, see anchore/syft#846 (comment)). So packit was forced to vendor in (copy) large chunks of upstream syft Go code in order to pin SBOM media type versions to the versions most consumers wanted to use. Every time a new version of syft came out, maintainers had to painfully update the vendored code to work with upstream syft components (e.g. #491).
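To illustrate the version pinning described above, here is a minimal sketch of how a media type with an optional version parameter could be translated into a syft output format. It assumes the `FORMAT@VERSION` syntax of syft's `--output` flag and media type strings of the form `application/vnd.cyclonedx+json;version=1.4`; the function name `syftOutputFormat` and the lookup table are hypothetical and not taken from this PR.

```go
package main

import (
	"fmt"
	"mime"
)

// syftFormat describes a syft output format. The table below is an
// illustrative assumption, not the mapping used by packit.
type syftFormat struct {
	name      string // format name passed to `syft --output`
	versioned bool   // whether a schema version may be appended as @VERSION
}

var syftFormats = map[string]syftFormat{
	"application/vnd.cyclonedx+json": {name: "cyclonedx-json", versioned: true},
	"application/spdx+json":          {name: "spdx-json", versioned: true},
	"application/vnd.syft+json":      {name: "syft-json", versioned: false},
}

// syftOutputFormat converts a media type such as
// "application/vnd.cyclonedx+json;version=1.4" into "cyclonedx-json@1.4".
// A version on a format that does not accept one is dropped (a choice made
// for this sketch only).
func syftOutputFormat(mediaType string) (string, error) {
	base, params, err := mime.ParseMediaType(mediaType)
	if err != nil {
		return "", fmt.Errorf("parsing media type %q: %w", mediaType, err)
	}
	format, ok := syftFormats[base]
	if !ok {
		return "", fmt.Errorf("unsupported SBOM media type %q", base)
	}
	if version := params["version"]; version != "" && format.versioned {
		return format.name + "@" + version, nil
	}
	return format.name, nil
}

func main() {
	for _, mediaType := range []string{
		"application/vnd.cyclonedx+json;version=1.4",
		"application/spdx+json",
	} {
		format, err := syftOutputFormat(mediaType)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(mediaType, "->", format)
	}
}
```

With a translation like this, the schema version is chosen by a CLI argument rather than by vendored syft code.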

Furthermore, using the syft CLI instead of the syft Go library is advantageous for several reasons. With the CLI, we can delegate the entire SBOM generation mechanism to syft. It should help buildpacks avoid CVEs that they are exposed to via the syft Go libraries. The CLI tool is well documented and widely used in the community, and the syft project appears to be developed with a CLI-first approach. The caveat is that buildpack authors who use this method should include the Paketo Syft buildpack in their build plan so that the CLI is available during the build phase.
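Because of that caveat, a buildpack may want to fail early with a helpful message when the CLI is missing. The following is a minimal sketch using only the standard library; the helper name `requireCLI` and the wording of the error are hypothetical and not part of packit.

```go
package main

import (
	"fmt"
	"os/exec"
)

// requireCLI resolves the named executable on PATH. When it is missing, the
// returned error hints at the likely cause in a buildpack build.
func requireCLI(name string) (string, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return "", fmt.Errorf("%s CLI not found on PATH (is the Paketo Syft buildpack in the build plan?): %w", name, err)
	}
	return path, nil
}

func main() {
	path, err := requireCLI("syft")
	if err != nil {
		fmt.Println("cannot generate SBOM:", err)
		return
	}
	fmt.Println("using syft at", path)
}
```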

Example usage:

// detect
// unless BP_DISABLE_BOM is true
requirements = append(requirements, packit.BuildPlanRequirement{
	Name: "syft",
	Metadata: map[string]interface{}{
		"build": true,
	},
})

// build
syftCLIScanner := sbomgen.NewSyftCLIScanner(
	pexec.NewExecutable("syft"),
	scribe.NewEmitter(os.Stdout),
)

// To scan a layer after installing a dependency
_ = syftCLIScanner.GenerateSBOM(myLayer.Path,
	context.Layers.Path,
	myLayer.Name,
	context.BuildpackInfo.SBOMFormats...,
)

// OR to scan the workspace dir after running a process
_ = syftCLIScanner.GenerateSBOM(context.WorkingDir,
	context.Layers.Path,
	myLayer.Name,
	context.BuildpackInfo.SBOMFormats...,
)
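
The "unless BP_DISABLE_BOM is true" condition in the detect step could be evaluated roughly as follows. This is a sketch under the assumption that the variable holds a boolean-like string; the helper name `sbomDisabled` is hypothetical, and the result would guard the `append` of the "syft" requirement.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// sbomDisabled reports whether the given BP_DISABLE_BOM value turns SBOM
// generation off. An unset (empty) value means SBOM generation stays on.
func sbomDisabled(value string) (bool, error) {
	if value == "" {
		return false, nil
	}
	disabled, err := strconv.ParseBool(value)
	if err != nil {
		return false, fmt.Errorf("parsing BP_DISABLE_BOM=%q: %w", value, err)
	}
	return disabled, nil
}

func main() {
	disabled, err := sbomDisabled(os.Getenv("BP_DISABLE_BOM"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if disabled {
		fmt.Println("SBOM disabled: do not require syft in the build plan")
		return
	}
	fmt.Println("SBOM enabled: require syft in the build plan")
}
```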
  • A new package, sbomgen, is created instead of adding the functionality to the existing sbom package because this lets buildpacks remove the pinned anchore/syft library, which was flagged by CVE scanners, from their go.mod.
  • I have not implemented the prettification of SBOMs that the code path using the syft Go library implements. It seems to add bloat to the app image and is not supported via the CLI. Consumers of SBOMs can easily prettify the SBOM JSON themselves.
  • In the code path that uses the syft Go library, license information is manually injected from buildpack.toml data into the SBOM. This is not available with the SyftCLIScanner. I couldn't find any reasoning for why this was done in the first place.
  • I have intentionally not reused code from methods that are entangled with the syft Go library, so that that codebase can easily be phased out in the near future. If/when we decide to remove the code using the syft Go library, the entire sbom package can be removed.
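
As a sketch of how a consumer might prettify an SBOM on their side, the standard library's `json.Indent` re-indents JSON without needing to know its schema. The function name `prettifyJSON` and the sample document are illustrative only.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// prettifyJSON re-indents a compact JSON document with two-space indentation.
// It does not interpret the SBOM; any valid JSON document is accepted.
func prettifyJSON(compact []byte) ([]byte, error) {
	var out bytes.Buffer
	if err := json.Indent(&out, compact, "", "  "); err != nil {
		return nil, fmt.Errorf("indenting SBOM JSON: %w", err)
	}
	return out.Bytes(), nil
}

func main() {
	// A tiny stand-in for an SBOM file read from disk.
	compact := []byte(`{"bomFormat":"CycloneDX","components":[{"name":"example"}]}`)
	pretty, err := prettifyJSON(compact)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(pretty))
}
```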

Checklist

  • I have viewed, signed, and submitted the Contributor License Agreement.
  • I have linked issue(s) that this PR should close using keywords or the Github UI (See docs)
  • I have added an integration test, if necessary.
  • I have reviewed the styleguide for guidance on my code quality.
  • I'm happy with the commit history on this PR (I have rebased/squashed as needed).

@arjun024 arjun024 requested a review from a team as a code owner September 14, 2024 09:14
@arjun024 arjun024 added the semver:minor A change requiring a minor version bump label Sep 14, 2024
@arjun024 arjun024 force-pushed the sbom-gen-with-syft-cli branch from 0ec15df to 67038cc Compare September 14, 2024 09:18
@arjun024 arjun024 marked this pull request as draft September 14, 2024 11:18
@loewenstein

loewenstein commented Sep 14, 2024

I haven't looked into the details, but I think the libpak-based buildpacks use https://github.com/paketo-buildpacks/syft and the build plan to get a syft CLI to generate SBOMs. Wouldn't it be a missed alignment opportunity between packit and libpak not to at least consider whether this could be an alternative path to the same goal, i.e. liberating packit maintainers from managing the Syft Go package versions and coding around calling it?

cc @paketo-buildpacks/steering-committee

P.S. Or is that the plan?

@arjun024 arjun024 force-pushed the sbom-gen-with-syft-cli branch from 67038cc to 623f538 Compare September 14, 2024 13:11
@arjun024 arjun024 marked this pull request as ready for review September 14, 2024 13:17
@arjun024
Member Author

@loewenstein This is a pretty small change, and I think it should be seen as a measure to help remove packit's syft Go code without forcing packit consumers to switch to a new library. Something I didn't mention above is that this makes it easy for packit-based buildpacks to do what libpak-based buildpacks currently do (in terms of SBOM generation) with very little pain. It's a small step towards aligning the philosophies of both libraries.

@sophiewigmore
Member

A new package sbomgen is created instead of adding the functionality to the existing sbom package because it helps buildpacks remove pinned anchore/syft lib from their go.mod which were flagged down by CVE scanners.

great idea

@arjun024 arjun024 merged commit 1ab5b00 into v2 Sep 17, 2024
7 checks passed
@arjun024 arjun024 deleted the sbom-gen-with-syft-cli branch September 17, 2024 17:55
// (Paketo RFCs 38 & 49).
func (s SyftCLIScanner) GenerateSBOM(scanDir, layersPath, layerName string, mediaTypes ...string) error {
	sbomWritePaths := make(map[string]string)
	args := []string{"scan", "--quiet"}

@arjun024, yes, scan is the way to go now!
paketo-buildpacks/libpak#351
