Sequential (Streaming) media types and link to registry #4518

Open · wants to merge 2 commits into v3.2-dev
Conversation

handrews (Member)

This adds a link to the forthcoming media type registry (PR #4517), and also adds support for various sequential media types:

  • `application/json-seq`
  • `application/jsonl`
  • `application/x-ndjson`
  • `text/event-stream`

Given how various modeling and encoding techniques are scattered throughout the specification, the Media Types section seemed like the best place to add these, preceded by a link to the new Media Type Registry, which will essentially be a catalog of where to find the existing guidance for various media types.

Also paging @robertlagrant and @disintegrator

  • schema changes are included in this pull request
  • schema changes are needed for this pull request but not done yet
  • no schema changes are needed for this pull request

@handrews added the `media and encoding` label Mar 29, 2025
@handrews added this to the v3.2.0 milestone Mar 29, 2025
@handrews requested review from a team as code owners March 29, 2025 02:00
@handrews changed the title from "Sequential media types and link to registry" to "Sequential (Streaming) media types and link to registry" Mar 29, 2025
@handrews (Member Author)

BTW I don't know that under "Media Types" is the right place for the sequential media type requirements. I could see it going as a new subsection under "Data Types", maybe? Or next to it?

It really does not feel like it should go under the Schema Object, as it's more about what you do or don't use with the Schema Object rather than how the Schema Object works in general. Once you map from the sequential format to the JSON Schema data model, the Schema Object behaves normally.
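
As a hedged sketch of that mapping (the data here is hypothetical): a two-line JSONL document becomes a two-element array instance, and the Schema Object is then evaluated against that array as usual.

```yaml
# Hypothetical illustration only. The JSONL document
#
#   {"symbol": "ACME", "price": "100.25"}
#   {"symbol": "XYZW", "price": "42.00"}
#
# maps to this array in the JSON Schema data model, and the
# Schema Object behaves normally against it:
- symbol: ACME
  price: "100.25"
- symbol: XYZW
  price: "42.00"
```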

Several media types exist to transport a sequence of values, separated by some delimiter, either as a single document or as multiple documents representing chunks of a logical stream.
Depending on the media type, the values could either be in another existing format such as JSON, or in a custom format specific to the sequential media type.

Implementations MUST support modeling sequential media types with the [Schema Object](#schema-object) by treating the sequence as an array with the same items and ordering as the sequence.
duncanbeevers (Contributor)

This wording is confusing to me, and doesn't seem to reflect the requirement that the Schema Object modeling the sequence must itself be of type: array.

handrews (Member Author)

There is no requirement that the Schema Object include type: array, although it would be a good practice.

What we're talking about here is not so much what to put in the Schema Object, but what data structure to convert the document to in order to use the Schema Object with that document.

Implementations don't get that from the Schema Object, they get that from these requirements, so it would be an error on the part of the implementation to pass anything but an array here. Of course, it's good practice to put the type: array in, and if you have other tools that depend on the type keyword and aren't paying attention to the media type with which the Schema Object is used, then you have to do that. But there's no requirement for it to be in the Schema Object.
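
For concreteness, a minimal sketch of that good practice, assuming a hypothetical application/jsonl response whose lines are described by an assumed Record schema:

```yaml
content:
  application/jsonl:
    schema:
      type: array   # not required here, but helps tools that key off `type`
      items:
        $ref: '#/components/schemas/Record'   # hypothetical per-line schema
```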

handrews (Member Author)

@duncanbeevers my most recent commit (after a force-push that was a rebase of the unchanged original commit) added some clarification here, please see if that helps!

@handrews (Member Author)

The force-push just rebases the unchanged commit in order to get the syntax highlighting for text/event-stream.

In such use cases, either the client or server makes a decision to work with one or more elements in the sequence at a time, but this subsequence is not a complete array in the sense of normal JSON arrays.

OpenAPI Description authors are responsible for avoiding the use of JSON Schema keywords such as `prefixItems`, `minItems`, `maxItems`, `contains`, `minContains`, or `maxContains` that rely on a beginning (for relative positioning) or an ending (to determine if a threshold has been reached or a limit has been exceeded) when the sequence is intended to represent a subsequence of a larger stream.
If such keywords are used, their behavior remains well-defined but may be counter-intuitive for users who expect them to apply to the stream as a whole rather than to each subsequence as it is processed.
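
A hedged sketch of the positioning pitfall described above (schema names are hypothetical): `prefixItems` matches the start of each processed subsequence, not the start of the overall stream.

```yaml
content:
  application/jsonl:
    schema:
      type: array
      prefixItems:
        # Matches the first item of EACH processed chunk, which is
        # well-defined but probably not what an author who means
        # "the first record of the stream" intends.
        - $ref: '#/components/schemas/Header'
      items:
        $ref: '#/components/schemas/Record'
```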
@ThomasRooney commented Apr 2, 2025

I personally wonder if this trade-off in slight confusion is worth it. The modelling of JSONL/SSE in OpenAPI that I've seen has always been for an indefinite-length stream, and I feel it might be a bit confusing for OAS authors and tool vendors to represent those as a type: array.

An alternative modelling is to have the schema model purely the JSON within the stream, and to validate the schema against each entry.

```yaml
paths:
  /users/export:
    get:
      tags:
        - Users
      summary: Export user data in JSONL format
      description: >
        This endpoint returns user data in JSONL format, with each line containing a complete user record.
        This format is ideal for large datasets that need to be processed one record at a time.
      responses:
        '200':
          description: User data in JSONL format
          content:
            application/jsonl:
              schema:
                $ref: '#/components/schemas/User'
        '400':
          description: Invalid request
        '500':
          description: Internal server error
components:
  schemas:
    User:
      type: object
      required: [id, name, email]
      properties:
        id:
          type: string
          format: uuid
          description: Unique identifier for the user
        name:
          type: string
          description: User's full name
        email:
          type: string
          format: email
          description: User's email address
        age:
          type: integer
          description: User's age
        city:
          type: string
          description: User's city of residence
```

This approach has a few advantages for both JSONL and SSE. For JSONL, it:

  1. Matches the majority (I think all?) of examples I've come across in the wild from internal APIs.
  2. Is slightly simpler for tooling vendors to reason about.

Note

E.g. at Speakeasy, one of the client SDK generators, we convert the schema into a native type in each language, with application/jsonl indicating purely the serialization/deserialization layer and the wrapping of the operation into some kind of Stream<T> response (where T is the subschema) in an SDK method. I.e., since type: array isn't directly exposed to users of an SDK, going with the proposed modelling we'd need to "unwrap"/special-case schemas at the top level, as those impact the Stream rather than the JSON within the stream. It might become similarly "messy" to implement for other vendors such as API gateways and documentation vendors.

An alternative modelling that supports/indicates a finite length JSONL response (note: we haven't actually seen any of these APIs yet, but my variant proposal otherwise closes the door on them) could be to represent that information within a new entry under the media type object, perhaps by following the example set by the encoding object:

```yaml
paths:
  /users/export:
    get:
      tags:
        - Users
      summary: Export user data in JSONL format
      description: >
        This endpoint returns user data in JSONL format, with each line containing a complete user record.
        This format is ideal for large datasets that need to be processed one record at a time.
      responses:
        '200':
          description: User data in JSONL format
          content:
            application/jsonl:
              stream: # applicable for streaming media types only
                maxItems: 2
              schema:
                $ref: '#/components/schemas/User'
        '400':
          description: Invalid request
        '500':
          description: Internal server error
```

For SSE, there are also advantages. Consider the special fields data, id, and event defined by the text/event-stream media type. It's commonly modelled with something like this:

```yaml
paths:
  /stock-updates:
    get:
      tags:
        - ServerSentEvents
      summary: Subscribe to real-time stock market updates
      description: >
       This endpoint streams real-time stock updates to the client using server-sent events (SSE).
       The client must establish a persistent HTTP connection to receive updates.
      responses:
        '200':
          description: Stream of real-time stock updates
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/StockStream'
        '400':
          description: Invalid request
        '500':
          description: Internal server error
components:
  schemas:
    StockStream:
      type: object
      description: A server-sent event containing stock market update content
      required: [id, event, data]
      properties:
        id:
          type: string
          description: Unique identifier for the stock update event
        event:
          type: string
          const: stock_update
          description: Event type
        data:
          $ref: '#/components/schemas/StockUpdate'

    StockUpdate:
      type: object
      properties:
        symbol:
          type: string
          description: Stock ticker symbol
        price:
          type: string
          description: Current stock price
          example: "100.25"

By continuing to represent the stream this way, we could open the door to richer modelling of the top level properties to also fit into the "encoding" object.

E.g. consider the "sentinel" event; something that's become popularised by the AI/LLM APIs by sending [DONE] as the last SSE data chunk. By avoiding the wrapping of the stream in type: array, we could enable the description of these media-type-specific entries in a standardized way through encoding, which will gracefully degrade if a tooling vendor doesn't understand the syntax because it's highly localized rather than "tainting" the JSON schema in the response body:

```yaml
paths:
  /stock-updates:
    get:
      tags:
        - ServerSentEvents
      summary: Subscribe to real-time stock market updates
      description: >
       This endpoint streams real-time stock updates to the client using server-sent events (SSE).
       The client must establish a persistent HTTP connection to receive updates.
      responses:
        '200':
          description: Stream of real-time stock updates
          content:
            text/event-stream:
              encoding:
                event:
                  sentinel: '[DONE]'
              stream:
                maxItems: 10
              schema:
                $ref: '#/components/schemas/StockStream'
        '400':
          description: Invalid request
        '500':
          description: Internal server error
```

By modelling it as type: array, it feels to me like we'd close the door on additional modelling of the top level fields outside of JSON Schema or extensions associated with the media type.

handrews (Member Author)

@ThomasRooney first, let me apologize for not tagging you in the original PR comment; I knew I was missing someone!

I'm going to take a while to think through this further, and also tag @gregsdennis who asked about this direction on Slack.

For now, I'll just state a few important principles that are guiding me here:

  • We model media types, and not protocols implemented on top of media types. There's nothing wrong with modeling protocols, but it can't be done by repurposing the media type layer. It would need a new mechanism, and that's too big of a change for 3.2, which needs to ship by this summer. Really, that would be better as a companion specification as it is beyond the current scope of the OAS.
  • The challenge here is that there's nothing in any of the JSON media types that says that every entry MUST be in the same format. If that were the case, then yes, the natural modeling would be to just model the single entry type. But we need to work with the media types as written, not as would make them more convenient. text/event-stream will tend to be more uniform, but there's no guarantee that someone won't use it in an unexpected way.
  • I feel like you're focusing on the response use case, but there are request use cases where the JSONL being sent is closer to a normal document.
  • I'm not that fixated on prefixItems, and in fact I think a more common relevant use would be to use maxItems as a way to limit the chunk size (see the sketch after this list), although I do not know if that is ever actually done.
  • The Encoding Object is problematic for far too many reasons to get into here, and is due for a re-think in 3.3
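
As a hedged sketch of that possible maxItems use (hypothetical; as noted above, it is unclear whether anyone actually does this), a description could bound how many events are handled per processed chunk:

```yaml
content:
  text/event-stream:
    schema:
      type: array
      maxItems: 10   # caps each processed chunk, not the logical stream
      items:
        $ref: '#/components/schemas/StockStream'   # event schema from the example above
```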

Labels: enhancement, media and encoding