Support describing license properties and SPDX expression assertions #1577
Comments
Adjustments from an offline conversation:
This tweaks the structs a bit:
More adjustments from an offline conversation:
This tweaks the structs a bit more to distinguish licenses found during package discovery from those found during file discovery:
I am at a bit of a loss as to how What I tried to do with #1554 is replace It is possible that by this:

type Licenses struct {
	SPDXExpression string    // expression used to derive the below licenses
	Licenses       []License // licenses and their given metadata
}

you meant that the authoritative license structure is the It fits a bit with:

type License struct {
	Nodes     []SpdxLicense // licenses and their given metadata nodes
	Relations []Edge        // relation of license nodes for complex expressions
	*LicenseEvidence
}

but that separate listing of licenses and edges makes it hard to track them and likely to create errors. It might kind of fit with this later one:

type PackageLicense struct {
	ParsedExpression ParsedExpression
	URL              string   // external sources
	Location         Location // on disk declaration
}

type ParsedExpression struct {
	Value string
	Nodes []SpdxLicense
}

But it has the same problem. I do see the structure of the last option here (not sure I agree on the distinction between Either way, why would we not have something like this:

type Joiner string

const (
	AND Joiner = "AND"
	OR  Joiner = "OR"
)

type PackageLicense struct {
	ParsedExpression ParsedExpression
	Raw              string   // original string that was parsed to give this parsed expression
	URL              string   // external sources
	Location         Location // on disk declaration
}

type ParsedExpression struct {
	Compound []ParsedExpression
	Simple   []string
	Joiner
}

// Licenses lists all of the individual licenses, independent of their relationships
func (p PackageLicense) Licenses() []string {
}
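One way the `Licenses()` method from the comment above could be filled in is a depth-first walk over the recursive `ParsedExpression`. This is only a sketch: the `collect` helper, the de-duplication, the sorted output, and the sample expression string are assumptions made for illustration, and `URL` and `Location` are left out to keep it self-contained.

```go
package main

import (
	"fmt"
	"sort"
)

// Joiner names the operator that combines the members of an expression.
type Joiner string

const (
	AND Joiner = "AND"
	OR  Joiner = "OR"
)

// ParsedExpression is a recursive tree: Simple holds the license IDs at this
// level and Compound holds nested sub-expressions, all combined by Joiner.
type ParsedExpression struct {
	Compound []ParsedExpression
	Simple   []string
	Joiner
}

// PackageLicense follows the struct proposed in the comment, without the
// URL and Location fields (omitted here for brevity).
type PackageLicense struct {
	ParsedExpression ParsedExpression
	Raw              string // original string that was parsed
}

// collect is a hypothetical helper that records every license ID in the tree.
func collect(e ParsedExpression, seen map[string]struct{}) {
	for _, id := range e.Simple {
		seen[id] = struct{}{}
	}
	for _, sub := range e.Compound {
		collect(sub, seen)
	}
}

// Licenses lists all of the individual licenses, independent of their
// relationships. Duplicates are dropped and the result is sorted so that the
// output is deterministic (both are assumptions, not part of the proposal).
func (p PackageLicense) Licenses() []string {
	seen := map[string]struct{}{}
	collect(p.ParsedExpression, seen)
	ids := make([]string, 0, len(seen))
	for id := range seen {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	p := PackageLicense{
		Raw: "Apache-2.0 AND (MIT OR GPL-2.0-only)", // assumed sample input
		ParsedExpression: ParsedExpression{
			Joiner: AND,
			Simple: []string{"Apache-2.0"},
			Compound: []ParsedExpression{
				{Joiner: OR, Simple: []string{"MIT", "GPL-2.0-only"}},
			},
		},
	}
	fmt.Println(p.Licenses())
}
```

Keeping the operator on each node means the relationships stay attached to the licenses they join, which is the concern raised about listing nodes and edges separately.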
Syft License Revamp
Syft currently represents licenses as different data types depending on the section of the schema in which they appear:
Specifically, the package []string construct has proven to be a bit limited in how the data can be represented to a user interested in license compliance. Many packages now use SPDX license IDs to communicate FOSS license information. These identifiers are currently incompatible with how we represent licenses, given the complex nature of some of the constructs. Example:
NOTE FROM COMMUNITY MEET:
The above shows a case where the consumer of the software must comply with Apache-2.0 and one of the following: MIT or GPL-2.0-only.
The file is subject to both the Apache-2.0 license, and at the licensee’s choice either the MIT license or version 2.0 only of the GPL.
The licensee may choose between MIT and GPL-2.0.
Whichever they choose, they must comply with both that license and Apache-2.0.
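The reading described in the note can be made concrete by expanding an expression into the license combinations a licensee may pick from. The sketch below assumes the expression in question is `Apache-2.0 AND (MIT OR GPL-2.0-only)`; the `Expr` type and the `Choices` function are hypothetical names for illustration and are not part of syft.

```go
package main

import "fmt"

// Expr is a hypothetical expression node: either a leaf carrying a license
// ID, or an operator ("AND" or "OR") applied to its children.
type Expr struct {
	ID       string // set for leaf nodes only
	Op       string // "AND" or "OR" for non-leaf nodes
	Children []Expr
}

// Choices returns every combination of licenses that satisfies e.
// OR offers each child's combinations as alternatives; any other operator is
// treated as AND, which pairs every combination of one child with every
// combination of the next.
func Choices(e Expr) [][]string {
	if e.ID != "" {
		return [][]string{{e.ID}}
	}
	if e.Op == "OR" {
		var out [][]string
		for _, c := range e.Children {
			out = append(out, Choices(c)...)
		}
		return out
	}
	out := [][]string{{}}
	for _, c := range e.Children {
		var next [][]string
		for _, prefix := range out {
			for _, choice := range Choices(c) {
				// Copy the prefix so combinations do not share backing arrays.
				combined := append(append([]string{}, prefix...), choice...)
				next = append(next, combined)
			}
		}
		out = next
	}
	return out
}

func main() {
	// Assumed example: Apache-2.0 AND (MIT OR GPL-2.0-only)
	e := Expr{Op: "AND", Children: []Expr{
		{ID: "Apache-2.0"},
		{Op: "OR", Children: []Expr{{ID: "MIT"}, {ID: "GPL-2.0-only"}}},
	}}
	for _, combo := range Choices(e) {
		fmt.Println(combo)
	}
}
```

For the assumed expression this yields two combinations, each containing Apache-2.0 plus one of the two alternatives, which a flat `[]string` cannot express.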
Furthermore, syft's current license format is limited in representing the distinction between DECLARED and CONCLUDED licenses.
The SPDX format gives implementers the choice of whether a license belongs in the concluded license field or the declared license field:
Concluded
TODO: Update this description based on feedback from community meeting
Contain the license the SPDX document creator has concluded as governing the package or alternative values, if the governing license cannot be determined.
If the Concluded License is not the same as the Declared License (7.15), a written explanation should be provided in the Comments on License field (7.16). With respect to NOASSERTION, a written explanation in the Comments on License field (7.16) is preferred. If the Concluded License field is not present in a package, it implies an equivalent meaning to NOASSERTION.
Declared
List the licenses that have been declared by the authors of the package. Any license information that does not originate from the package authors, e.g. license information from a third-party repository, should not be included in this field.
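The two SPDX rules quoted above (an absent Concluded field is equivalent to NOASSERTION, and a Concluded value that differs from Declared should be explained in the comments field) can be sketched as follows. The `PackageLicenseInfo` type and its method names are hypothetical and only illustrate the distinction; they are not syft's or SPDX's data model.

```go
package main

import "fmt"

// NoAssertion is the SPDX value used when no conclusion is asserted.
const NoAssertion = "NOASSERTION"

// PackageLicenseInfo is a hypothetical holder for the three SPDX package
// fields discussed above.
type PackageLicenseInfo struct {
	Declared  string // what the package authors declared
	Concluded string // what the SBOM creator concluded; may be empty
	Comments  string // Comments on License
}

// EffectiveConcluded treats a missing Concluded field as NOASSERTION.
func (p PackageLicenseInfo) EffectiveConcluded() string {
	if p.Concluded == "" {
		return NoAssertion
	}
	return p.Concluded
}

// NeedsExplanation reports whether a written explanation is still missing:
// the effective concluded license differs from the declared one and the
// comments field is empty.
func (p PackageLicenseInfo) NeedsExplanation() bool {
	return p.EffectiveConcluded() != p.Declared && p.Comments == ""
}

func main() {
	p := PackageLicenseInfo{Declared: "MIT"}
	fmt.Println(p.EffectiveConcluded(), p.NeedsExplanation())
}
```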
Syft's approach
Syft should enhance the license representation from []string to []License in order to convey the above information more clearly. The following struct will be added in place of string to give downstream tooling more options for accurately reading how the license was determined during syft's run:
Here is a sample of the JSON representation of the above:
In the event a license is successfully concluded, the above uses the Google license classifier to assess the license packaged with the software. It provides the confidence level (how close a match was, given the location's contents compared to some source DB), the offset (how far into the file the match was found), and the extent (how long the match was).
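As an illustration of the kind of record described here, the following sketch pairs a license with optional classifier evidence and serializes it to JSON. Every type name, field name, JSON key, and the sample path is an assumption made for this example; none of it is taken from syft's schema or from the classifier's API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Evidence is a hypothetical record of a classifier match.
type Evidence struct {
	Confidence float64 `json:"confidence"` // how close the match was
	Offset     int     `json:"offset"`     // how far into the file the match starts
	Extent     int     `json:"extent"`     // how long the match is
}

// License is a hypothetical replacement for a bare license string.
type License struct {
	SPDXExpression string    `json:"spdxExpression"`
	Type           string    `json:"type"` // "declared" or "concluded"
	Location       string    `json:"location,omitempty"`
	Evidence       *Evidence `json:"evidence,omitempty"` // nil when nothing was classified
}

// encode renders a license as indented JSON.
func encode(l License) (string, error) {
	b, err := json.MarshalIndent(l, "", "  ")
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decode parses JSON produced by encode back into a License.
func decode(s string) (License, error) {
	var l License
	err := json.Unmarshal([]byte(s), &l)
	return l, err
}

func main() {
	l := License{
		SPDXExpression: "MIT",
		Type:           "concluded",
		Location:       "/usr/share/licenses/example/LICENSE", // made-up path
		Evidence:       &Evidence{Confidence: 0.98, Offset: 0, Extent: 1069},
	}
	out, err := encode(l)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Making the evidence a pointer with `omitempty` lets a declared license, which has no classifier match, serialize without an empty evidence object.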
Why is this needed:
This enhancement is needed so syft can better represent the intent of SPDX license expressions, show more data on where a concluded license was found, and give downstream tools that use SBOMs for license compliance more accuracy in assessing license contents against the policies they create.