From 9d40eedbd6bf96ba0a5dab21d25a0cfd2136a1f0 Mon Sep 17 00:00:00 2001
From: Adam Wu
Date: Tue, 31 Oct 2023 17:05:48 +0000
Subject: [PATCH] Add UTF-8 encoding option to the envelope document and proto.

Signed-off-by: Zhenyu (Adam) Wu
Co-authored-by: Mark Lodato
---
 envelope.md    | 35 +++++++++++++++++++++++++++--------
 envelope.proto | 11 ++++++++---
 2 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/envelope.md b/envelope.md
index 093b50b..6ba1641 100644
--- a/envelope.md
+++ b/envelope.md
@@ -2,7 +2,7 @@
 
 March 03, 2021
 
-Version 1.0.0
+Version 1.1.0
 
 This document describes the recommended data structure for storing DSSE
 signatures, which we call the "JSON Envelope". For the protocol/algorithm, see
@@ -16,9 +16,12 @@ to define the schema. JSON is the only recommended encoding.)
 The standard data structure for storing a signed message is a JSON message of
 the following form, called the "JSON envelope":
 
-```json
+```jsonc
 {
-  "payload": "",
+  // Exactly one of the following must be set:
+  "payload": "",
+  "payloadUtf8": "",
+  // End oneof
   "payloadType": "",
   "signatures": [{
     "keyid": "",
@@ -29,9 +32,22 @@ the following form, called the "JSON envelope":
 
 See [Protocol](protocol.md) for a definition of parameters and functions.
 
-Base64() is [Base64 encoding](https://tools.ietf.org/html/rfc4648), transforming
-a byte sequence to a unicode string. Either standard or URL-safe encoding is
-allowed.
+Exactly one of `payload` or `payloadUtf8` MUST be set:
+
+- `payload` supports arbitrary SERIALIZED_BODY.
+  [Base64Encode()](https://tools.ietf.org/html/rfc4648) transforms a byte
+  sequence to a Unicode string. Base64 has a fixed 33% space overhead but
+  supports payloads that are not necessarily valid UTF-8. Either standard or
+  URL-safe encoding is allowed.
+
+- `payloadUtf8` only supports valid
+  [UTF-8](https://tools.ietf.org/html/rfc3629) SERIALIZED_BODY. `Utf8Decode()`
+  converts that UTF-8 byte sequence to a Unicode string. Regular JSON string
+  escaping applies, but this is usually more compact and amenable to
+  compression than Base64.
+
+Note: The choice of `payload` vs `payloadUtf8` does not impact the
+[the signing or the signatures](protocol.md#signature-definition).
 
 ### Multiple signatures
 
@@ -54,8 +70,8 @@ envelopes with individual signatures.
 
 ### Parsing rules
 
-* The following fields are REQUIRED and MUST be set, even if empty: `payload`,
-  `payloadType`, `signature`, `signature.sig`.
+* The following fields are REQUIRED and MUST be set, even if empty:
+  exactly one of {`payload` or `payloadUtf8`}, `payloadType`, `signature`, `signature.sig`.
 * The following fields are OPTIONAL and MAY be unset: `signature.keyid`.
   An unset field MUST be treated the same as set-but-empty.
 * Producers, or future versions of the spec, MAY add additional fields.
@@ -75,5 +91,8 @@ At this point we do not standardize any other encoding. If a need arises, we
 may do so in the future.
 
 ## Change history
 
+* 1.1.0:
+  * Added support for UTF-8 encoded payload and `payloadUtf8` field.
+
 * 1.0.0: Initial version.
diff --git a/envelope.proto b/envelope.proto
index 83d7602..34d8df3 100644
--- a/envelope.proto
+++ b/envelope.proto
@@ -4,10 +4,15 @@ package io.intoto;
 
 // An authenticated message of arbitrary type.
 message Envelope {
-  // Message to be signed. (In JSON, this is encoded as base64.)
+  // Message to be signed.
   // REQUIRED.
-  bytes payload = 1;
-
+  oneof payload_encoding {
+    // Raw bytes. In JSON, this is encoded as base64.
+    bytes payload = 1;
+    // Unicode string, where the signed byte stream (SERIALIZED_BODY) is the UTF-8 encoding of `payloadUtf8`. In JSON, this is a regular string.
+    string payloadUtf8 = 4;
+  }
+
   // String unambiguously identifying how to interpret payload.
   // REQUIRED.
   string payloadType = 2;