Discovery of SBOMs on the Rekor transparency log #1159

Closed
mdeicas opened this issue Aug 11, 2022 · 1 comment
mdeicas commented Aug 11, 2022

Motivation

It is not always possible to look inside executables and report accurate information on their contents and dependencies. This information is available when an executable is built, but there has been no general way to propagate it to later stages of the software supply chain.

With the development of the Sigstore supply chain security infrastructure, it is now possible to access information from the build time of artifacts. This issue and related PRs propose a way to incorporate this information into Syft.

This work is part of the broader effort to let Syft find SBOMs (#737) and to enable the use of external sources (#1115).

The rekor-cataloger

#1157 contributes two things: a package that searches the Rekor transparency log for information about the SBOMs of executables, and the rekor-cataloger, which is the integration point between that package and Syft.

Demo

To demo the rekor-cataloger, run Syft on an image containing binaries that have SBOMs on Rekor. One such image is https://hub.docker.com/r/mdeicas/sample-golang-prov.

syft packages sample-golang-prov.tar -o spdx-json --file spdx.json --catalogers all

This is the diff of the log output between runs of Syft with and without this PR:

+[0000] DEBUG cataloging with "rekor-cataloger"
+[0000] DEBUG rekor is being queried for 
+               Location: /sample-golang-prov 
+               SHA256: f2e59e0e82c6a1b2c18ceea1dcb739f680f50ad588759217fc564b6aa5234791
+[0000] DEBUG rekor entry 2790629 was retrieved
+[0000] DEBUG verification of rekor entry 2790629 complete
+[0000] DEBUG SBOM (798688 bytes) retrieved
+[0001] DEBUG rekor entry 2790625 was retrieved
+[0001] DEBUG verification of rekor entry 2790625 complete
+[0001] DEBUG error parsing or validating attestation associated with rekor entry 2790625: 
+               the attestation predicate type (https://slsa.dev/provenance/v0.2) is not the accepted type (google.com/sbom)
+[0001] DEBUG relationship created for SBOM found on rekor
+[0001]  WARN 
+                       [EXPERIMENTAL FEATURE: Rekor-cataloger] 
+                       
+                       This SBOM contains a relationship that references an external document. This 
+                       document is not present in the cataloged image or directory; rather it has 
+                       been found by searching the Rekor transparency log (https://www.sigstore.dev/).  
+                       
+                       Trusting this external document relationship requires trusting several entities: 
+                               - the user or CI/CD action that uploaded an entry to Rekor
+                               - Rekor transparency log
+                               - Fulcio CA
+
+                       The Rekor entry(s) that were used to create the external document relationship(s)
+                       are listed below by UUID. See https://github.com/sigstore/rekor for 
+                       information on how to query Rekor. 
+                               [362f8ecba72f432677c5a08384c08e85445632ae4078b94fff43651770e12eb1d3ca43e45fae3a15]
+
+[0001] DEBUG discovered 0 packages
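
For reference, the hash lookup shown in the log above can be approximated against Rekor's REST API. The following is a minimal sketch, not the code from #1157: it assumes the public instance at https://rekor.sigstore.dev and the `index/retrieve` and `log/entries` endpoints, and the function names are hypothetical. It only builds the request so that it can be inspected without network access.

```python
import json
import urllib.request

# Assumption: the public Rekor instance; a private deployment would differ.
REKOR_URL = "https://rekor.sigstore.dev"


def build_index_query(sha256_hex: str, base_url: str = REKOR_URL) -> urllib.request.Request:
    """Build (but do not send) a search-by-hash request for Rekor's index API."""
    body = json.dumps({"hash": f"sha256:{sha256_hex}"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/index/retrieve",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def entry_url(uuid: str, base_url: str = REKOR_URL) -> str:
    """URL from which a single log entry can be fetched by its UUID."""
    return f"{base_url}/api/v1/log/entries/{uuid}"
```

Passing the request to `urllib.request.urlopen` should return a JSON list of entry UUIDs, each of which can then be fetched with `entry_url`.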

How the rekor-cataloger works

When the rekor-cataloger finds an executable, it searches Rekor by the executable's hash. It retrieves and verifies the matching log entries and their associated SBOMs, then creates relationships. The SBOM information is obtained from an in-toto attestation (https://github.com/in-toto/attestation) associated with the Rekor entry. Here is an example:

{
  "_type": "https://in-toto.io/Statement/v0.1",
  "predicateType": "google.com/sbom",
  "subject": [
    {
      "name": "binary-linux-amd64",
      "digest": {
        "sha256": "f2e59e0e82c6a1b2c18ceea1dcb739f680f50ad588759217fc564b6aa5234791"
      }
    }
  ],
  "predicate": {
    "sboms": [
      {
        "format": "SPDX",
        "digest": {
          "sha256": "02948ad50464ee57fe237b09054c45b1bff6c7d18729eea1eb740d89d9563209"
        },
        "uri": "https://github.com/user/repo/releases/download/v1.3/binary.spdx"
      }
    ]
  }
}
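
As an illustration of how such a statement might be consumed, here is a minimal sketch. It is not the parser from #1157; `SbomRef` and `sbom_refs_for` are hypothetical names. It rejects statements whose predicate type is not `google.com/sbom` (the case reported in the demo log for entry 2790625), checks the subject digest, and returns the SBOM references.

```python
import json
from dataclasses import dataclass
from typing import List

ACCEPTED_PREDICATE_TYPE = "google.com/sbom"


@dataclass(frozen=True)
class SbomRef:
    """One entry of predicate.sboms (hypothetical helper type)."""
    format: str
    sha256: str
    uri: str


def sbom_refs_for(statement_json: str, binary_sha256: str) -> List[SbomRef]:
    """Return the SBOM references of a statement about the given binary."""
    statement = json.loads(statement_json)

    predicate_type = statement.get("predicateType")
    if predicate_type != ACCEPTED_PREDICATE_TYPE:
        raise ValueError(
            f"the attestation predicate type ({predicate_type}) "
            f"is not the accepted type ({ACCEPTED_PREDICATE_TYPE})"
        )

    # The statement must be about the binary that was searched for.
    subject_hashes = {
        s.get("digest", {}).get("sha256") for s in statement.get("subject", [])
    }
    if binary_sha256 not in subject_hashes:
        raise ValueError("attestation subject does not match the binary")

    return [
        SbomRef(format=s["format"], sha256=s["digest"]["sha256"], uri=s["uri"])
        for s in statement["predicate"]["sboms"]
    ]
```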

The SBOM that Syft outputs uses external reference relationships to refer to the SBOMs discovered by the rekor-cataloger. Merging the SBOMs was considered an optional follow-up feature and is still under investigation (#617).

The rekor package exports an ExternalRef type that represents information about an external SBOM. It is identifiable, so it can be placed into a Syft relationship that carries the information upstream. When mapping the Syft SBOM format to other formats, relationships with ExternalRefs are handled in accordance with each format’s specification. In SPDX, they appear in the external document references section in addition to being referenced in a relationship. Here is an example (edited):

...
"externalDocumentRefs": [
  {
   "externalDocumentId": "DocumentRef-24a791393ed162b5",
   "checksum": {
    "algorithm": "SHA1",
    "checksumValue": "eb141a8a026322e2ff6a1ec851af5268dfe59b20"
   },
   "spdxDocument": "http://www.example.com/binary.spdx"
  }
 ]
...
 "files": [
  {
   "SPDXID": "SPDXRef-9dc5bd9a21b3b63c",
   "comment": "layerID: sha256:be555362a16f0f6b27f194ed8fc0fd5b640a300f809eafe5799676a53bbcfc7b",
   "licenseConcluded": "NOASSERTION",
   "fileName": "/sample-golang-prov"
  }
 ]
...
 "relationships": [
  {
   "spdxElementId":"SPDXRef-9dc5bd9a21b3b63c",
   "relationshipType": "DESCRIBED_BY",
   "relatedSpdxElement": "DocumentRef-24a791393ed162b5"
  }
]
...
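
The connection between the two fragments above can be sketched as follows. This is an illustrative helper, not Syft's SPDX encoder: the function name and the way the `DocumentRef-` ID is derived from the checksum are assumptions. What it shows is that the relationship's `relatedSpdxElement` must match the `externalDocumentId`.

```python
import hashlib


def external_document_fragments(file_spdx_id: str, sbom_bytes: bytes, sbom_uri: str) -> dict:
    """Build the SPDX JSON fragments that link a file to an external SBOM."""
    sha1 = hashlib.sha1(sbom_bytes).hexdigest()
    # Hypothetical ID scheme: derived from the checksum. Syft's own derivation may differ.
    doc_ref_id = f"DocumentRef-{sha1[:16]}"
    return {
        "externalDocumentRefs": [
            {
                "externalDocumentId": doc_ref_id,
                "checksum": {"algorithm": "SHA1", "checksumValue": sha1},
                "spdxDocument": sbom_uri,
            }
        ],
        "relationships": [
            {
                "spdxElementId": file_spdx_id,
                "relationshipType": "DESCRIBED_BY",
                "relatedSpdxElement": doc_ref_id,
            }
        ],
    }
```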

The rekor package can only read log entries that are associated with in-toto attestations. The content of the SBOM that is referenced in the attestation must successfully be retrieved to continue execution, and only SPDX SBOMs can be read.

Managing external sources

The use of external sources is new to Syft, and they should be managed carefully: their use should be configurable, and it should be clear to users what has been used and how. Accordingly, #1158 introduces a new external sources configuration, an additional function that catalogers must implement, and a CLI flag to shut off the use of external sources. This approach assumes that external sources will only come into Syft through catalogers.
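
The general shape of this gating can be sketched as below. This is only an illustration of the idea, not the interface from #1158: every name here (`Cataloger`, `uses_external_sources`, `select_catalogers`, `golang-binary-cataloger` as a label) is hypothetical, and Syft itself is written in Go.

```python
from dataclasses import dataclass
from typing import List, Protocol


class Cataloger(Protocol):
    """Hypothetical cataloger interface with the extra method described above."""

    def name(self) -> str: ...

    def uses_external_sources(self) -> bool: ...


@dataclass
class SimpleCataloger:
    cataloger_name: str
    external: bool = False

    def name(self) -> str:
        return self.cataloger_name

    def uses_external_sources(self) -> bool:
        return self.external


def select_catalogers(catalogers: List[Cataloger], external_sources_enabled: bool) -> List[Cataloger]:
    """Drop catalogers that need external sources when the user has disabled them."""
    return [
        c for c in catalogers
        if external_sources_enabled or not c.uses_external_sources()
    ]
```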

Separate from that PR, rekor-cataloger logs a warning indicating what was used to create the output SBOM (see the log output above).

Verification of data

The use of external sources requires verification of the data that is found. Unless one of the checks outlined below fails, the rekor-cataloger currently accepts all Rekor entries that have certificates issued by Fulcio. In the future, the rekor-cataloger can be extended to limit accepted entries to ones that match specific identities.

To explain the verification actions that are taken, simplified depictions of the Rekor log entry and in-toto attestation data formats are shown here:

Attestation:
    subject:
        hash (this is the hash of the binary)
    predicate:
        sbom-hash  
        sbom-uri

Rekor log entry:
    timestamp 
    attestation-hash
    certificate

The rekor package retrieves the Rekor log entry, the associated in-toto attestation, and the SBOM. It performs verification to ensure that the retrieved data has not been tampered with. It verifies that:

  • the log entry has been signed by Rekor’s public key
  • the certificate chains back to a Fulcio root certificate
  • the log entry timestamp lies in the period of validity of the certificate
  • the attestation-hash equals the hash of the attestation that is obtained
  • the attestation.subject hash equals the hash of the binary that is being searched for
  • the attestation.predicate sbom-hash equals the hash of the sbom bytes retrieved from sbom-uri
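
The last four checks in the list above are plain consistency checks and can be sketched with the standard library. This is a minimal sketch with hypothetical names, not the rekor package's code. The first two checks (the Rekor signature and the certificate chain to a Fulcio root) require a cryptography library and are not shown.

```python
import hashlib
from datetime import datetime
from typing import List


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def consistency_failures(
    *,
    searched_binary_sha256: str,    # hash of the binary Syft found
    attestation_bytes: bytes,       # attestation as retrieved
    sbom_bytes: bytes,              # SBOM as retrieved from sbom-uri
    entry_attestation_sha256: str,  # attestation-hash in the log entry
    entry_timestamp: datetime,      # timestamp in the log entry
    cert_not_before: datetime,      # certificate validity period
    cert_not_after: datetime,
    subject_sha256: str,            # attestation.subject hash
    predicate_sbom_sha256: str,     # attestation.predicate sbom-hash
) -> List[str]:
    """Return a description of every failed check; an empty list means all passed."""
    failures = []
    if not (cert_not_before <= entry_timestamp <= cert_not_after):
        failures.append("log entry timestamp is outside the certificate validity period")
    if sha256_hex(attestation_bytes) != entry_attestation_sha256:
        failures.append("attestation does not match the hash in the log entry")
    if subject_sha256 != searched_binary_sha256:
        failures.append("attestation subject does not match the binary")
    if sha256_hex(sbom_bytes) != predicate_sbom_sha256:
        failures.append("SBOM does not match the hash in the attestation")
    return failures
```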

These steps ensure that the retrieved information, and the upstream external document reference that is produced, can be trusted if Rekor, Fulcio, and the certificate subject are trusted.

A current limitation of Rekor entries for in-toto attestations prevents verification of the certificate subject’s signature over the attestation (sigstore/rekor#582). Once this verification is possible, Rekor will not need to be trusted.

When a builder, such as the slsa-github-generator (https://github.com/slsa-framework/slsa-github-generator), generates the SBOM and uploads it to Rekor, a path from source code to SBOM is created. In this case, the only parties that must be trusted are the builder and Fulcio.

Surfacing packages versus surfacing binaries

Edit: I realized that Syft can create files, not just packages. Binaries can be represented using files, and the below doesn't apply anymore 😃.

External document references that the rekor-cataloger produces must be related to SBOM entries for executables as opposed to entries for the packages they contain (in-toto attestation subjects are executables, not packages). Currently, Syft only surfaces packages. Binaries that are found, but that cannot be looked inside of, do not appear in the SBOMs output by Syft.

This PR includes a temporary solution to allow the use of the rekor-cataloger for golang binaries. It involves a change (see commit titled “surface external relationships”) to the golang-binary-cataloger to create SBOM entries not only for the packages that executables contain, but also for the executables themselves. This allows the rekor-cataloger to create external reference relationships using the entries for golang executables.

Since no entries are created for binaries that are not golang-compiled, the results from the rekor-cataloger for them will not appear in output SBOMs. Another implication is that the rekor-cataloger cannot be run without the golang-binary cataloger, as rekor-cataloger does not itself create packages.

This also raises the larger question of whether Syft should only surface an executable when it can provide meaningful information for it. The current design limits the rekor-cataloger’s ability to report information in the output SBOM, and it also raises wider questions about how the completeness of SBOMs output by Syft is perceived. This topic is out of the scope of this issue.

Follow up work

  • Map relationships with ExternalRefs to the CycloneDX, SPDX TV, and Syft JSON formats (only SPDX JSON was implemented)
  • Verification of the signature over the attestation (blocked by sigstore/rekor#582, "in-toto records don't contain signatures")
  • Better linking between SBOMs:
    • Currently, the Syft output SBOM asserts that a binary is DESCRIBED_BY an external SBOM
    • Ideally, we want to link, using some equivalency relation, the entries in the two SBOMs for the same binary.
  • Ability to specify which MIME types the rekor cataloger runs on
  • Support decoding of any SBOM format for SBOMs that are found, not just SPDX TV
  • Add an SHA256 field to the external document reference (only SHA1 now)
spiffcs pushed a commit that referenced this issue Aug 24, 2022
This PR adds the ability to discover build-time SBOMs from binaries with the Rekor transparency log. 
It does this by creating external document references for them in SPDX JSON. 

Explained in more detail in syft issue #1159
spiffcs pushed a commit that referenced this issue Oct 21, 2022
This PR adds the ability to discover build-time SBOMs from binaries with the Rekor transparency log.
It does this by creating external document references for them in SPDX JSON.

Explained in more detail in syft issue #1159

Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
spiffcs pushed a commit that referenced this issue Oct 25, 2022
This PR adds the ability to discover build-time SBOMs from binaries with the Rekor transparency log.
It does this by creating external document references for them in SPDX JSON.

Explained in more detail in syft issue #1159

Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@kzantow kzantow added this to OSS Nov 14, 2022
@kzantow kzantow moved this to Backlog (Pulled Forward for Priority) in OSS Nov 17, 2022
popey commented Jun 20, 2024

Closing as this is superseded by #1291 🙏

@popey popey closed this as completed Jun 20, 2024
@github-project-automation github-project-automation bot moved this from Backlog to Done in OSS Jun 20, 2024
@wagoodman wagoodman closed this as not planned Jun 20, 2024