-
Another place a version could appear is in the lexicon NSID. I will add this in the next update; it is already in the working copy here: https://github.com/blebbit/lexicon/blob/main/notes/versioning.md
-
I am personally in favor of using the if you have an application listening to the firehose for specific events, you could add a filter for |
-
I'm a fan of having a special Consider the following schema where in an alternative universe the
{
"lexicon": 1,
"id": "community.lexicon.bookmarks.bookmark",
"defs": {
"main": {
"type": "record",
"revision": "2",
"description": "Record bookmarking a link to come back to later.",
"key": "tid",
"record": {
"type": "object",
"reserved": [
"post"
],
"required": [
"subject",
"createdAt"
],
"properties": {
"subject": {
"type": "string",
"format": "uri"
},
"createdAt": {
"type": "string",
"format": "datetime"
},
"tags": {
"type": "array",
"description": "Tags for content the bookmark may be related to, for example 'news' or 'funny videos'",
"items": {
"type": "string"
}
}
}
}
}
}
}
In use:
{
"$type": "community.lexicon.bookmarks.bookmark",
"$rev": "2",
"subject": "at://did:plc:xyz123/app.bsky.feed.post/abc456",
"createdAt": "2024-09-13T08:00:00.000Z"
}
This approach is minimal, but intentionally so. I think schema files don't need to convey the entire history; they should only show which fields are available, expected, and required. Version control is a solved problem, and I think it would be over-prescriptive to stuff the schema with additional context about what was added, removed, or changed in a lexicon and why. Having a minimal
-
It occurs to me that people will likely want to say "these lexicon all go together at this version", which means we need to associate different lexicon files / records with each other at specific versions, as a group. At some point it begins to look exactly like modules and dependencies in most language ecosystems.
-
Lexicon Versioning
Versioning is a useful mechanism for applications:
for themselves, for their dependencies, and for the payloads they process.
Lexicon are the schemas in ATProto,
and applications on the network could benefit from versioning them.
Note, examples are written in CUE for brevity.
Also, I don't know why the markdown rendering keeps the single newlines between lines that are not separated by blank lines. It has better formatting here: https://github.com/blebbit/lexicon/blob/main/notes/versioning.md This will also be the most up-to-date place as I work on this. (related: https://github.com/orgs/lexicon-community/discussions/32)
ATProto Today
The ATProto spec offers us the following.
Lexicon Files
from (https://atproto.com/specs/lexicon#lexicon-files):
lexicon (integer, required): indicates Lexicon language version. In this version, a fixed value of 1
id (string, required): the NSID of the Lexicon
revision (integer, optional): indicates the version of this Lexicon, if changes have occurred
Note, in practice, the revision field is not used. I'm not sure why.
Lexicon Evolution
from: (https://atproto.com/specs/lexicon#lexicon-evolution)
Lexicons are allowed to change over time, within some bounds to ensure both forwards and backwards compatibility. The basic principle is that all old data must still be valid under the updated Lexicon, and new data must be valid under the old Lexicon.
If larger breaking changes are necessary, a new Lexicon name must be used.
It can be ambiguous when a Lexicon has been published and becomes "set in stone". At a minimum, public adoption and implementation by a third party, even without explicit permission, indicates that the Lexicon has been released and should not break compatibility. A best practice is to clearly indicate in the Lexicon type name any experimental or development status. Eg, com.corp.experimental.newRecord.
Version Identifiers
There are various versioning schemes. Some examples,
in order of increasing flexibility:
revision, a monotonic int
apiVersion, a vX with an optional {alpha,beta}Y
We can use or represent versioned lexicon in several ways today.
Monotonic Int (using the ATProto Lexicon.revision)
version 1:
version 2:
It is unclear to me how one refers to a specific revision of a lexicon today
Name with Version Suffix
Bluesky has the following pattern in their own Lexicon.
(atproto/lexicons/app/bsky/actor/defs.json)
(ref: "app.bsky.actor.defs#savedFeedsPrefV2")
Kubernetes Style
Kubernetes uses v1 and v2alpha2 version segments for their apiVersion field.
This can be seen as an extension to what Bluesky has done themselves,
adding a maturity component to the end of the major version.
Such versions can already be used in the suffix scheme shown above.
Kubernetes also prefixes versions in apiVersion with a group name (roughly comparable to an NSID),
but I'm going to set that aside for this document because
we have similar information in the lexicon id.
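To make the vX / vXalphaY / vXbetaY scheme concrete, here is a minimal sketch in Python of parsing such version segments and ordering them. The names `KubeVersion`, `parse_version`, and `priority` are hypothetical, and the ordering (stable before beta before alpha, then higher numbers first) is an assumption modeled on how Kubernetes ranks versions, not something the ATProto spec defines.

```python
import re
from typing import NamedTuple, Optional

# Accepts "v1", "v2alpha2", "v10beta3"; anything else is rejected.
_VERSION_RE = re.compile(r"v([1-9][0-9]*)(?:(alpha|beta)([1-9][0-9]*))?")

# Higher rank means more mature; None stands for a stable version.
_MATURITY_RANK = {"alpha": 0, "beta": 1, None: 2}


class KubeVersion(NamedTuple):
    major: int
    maturity: Optional[str]  # "alpha", "beta", or None when stable
    minor: int               # 0 when there is no maturity suffix


def parse_version(text: str) -> KubeVersion:
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a vX[alpha|beta]Y version: {text!r}")
    major, maturity, minor = match.groups()
    return KubeVersion(int(major), maturity, int(minor) if minor else 0)


def priority(version: KubeVersion) -> tuple:
    """Sort key: maturity first, then major number, then pre-release number."""
    return (_MATURITY_RANK[version.maturity], version.major, version.minor)


names = ["v1", "v2alpha2", "v2beta1", "v2", "v1beta3"]
ordered = sorted(names, key=lambda n: priority(parse_version(n)), reverse=True)
print(ordered)  # ['v2', 'v1', 'v2beta1', 'v1beta3', 'v2alpha2']
```

Because the maturity marker is part of a fixed grammar, a consumer can decide mechanically whether it is willing to accept an alpha or beta definition.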
We could also set the version as the defs field names themselves
if we want to use independent Lexicon instead of the defs pattern.
We can then refer to a specific version using fragments,
where we gain an amount of separation between name and version.
Using main could be the equivalent of "latest" (which isn't a version).
Semver Style
This would work like the previous examples,
but with semver def names or suffixes,
assuming the charset needed is valid in the ATProto spec.
Discussion
Today, with no one using revisions,
we are essentially always using the "latest" version of a Lexicon.
If we publish a new version, consuming applications will start using it,
and can break because of external changes beyond their control.
We could declare this is the expected behavior and contract, but I think we can do better.
Application developers would benefit from having some amount of control
over the versions they use for dependencies beyond their control.
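One way an application could take that control is to pin the revision it was written against and skip anything newer. A minimal sketch in Python, assuming records carry a revision marker such as the `$rev` field proposed in one of the replies (this field is not part of the ATProto spec today); `PINNED_REVISIONS` and `supported_records` are hypothetical names.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical pins: the newest revision of each record type this app understands.
PINNED_REVISIONS: Dict[str, int] = {
    "community.lexicon.bookmarks.bookmark": 2,
}


def supported_records(records: Iterable[dict]) -> Iterator[dict]:
    """Yield records whose type is pinned and whose revision is not newer than the pin."""
    for record in records:
        pinned = PINNED_REVISIONS.get(record.get("$type"))
        if pinned is None:
            continue  # a record type this app does not handle
        # "$rev" is an assumed field; unmarked records are treated as revision 1.
        revision = int(record.get("$rev", 1))
        if revision <= pinned:
            yield record


events = [
    {"$type": "community.lexicon.bookmarks.bookmark", "$rev": "2", "subject": "https://example.com/a"},
    {"$type": "community.lexicon.bookmarks.bookmark", "$rev": "3", "subject": "https://example.com/b"},
    {"$type": "app.bsky.feed.post", "text": "hello"},
]
kept = [record["subject"] for record in supported_records(events)]
print(kept)  # ['https://example.com/a']
```

Skipping newer revisions is only one possible policy; an application could just as well log them or process them on a best-effort basis.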
Even Bluesky has found versioning useful for their own Lexicon,
as is evident from app.bsky.actor.defs#savedFeedsPrefV2.
The ATProto spec says we should not ship backwards incompatible changes,
but in practice this is unrealistic.
Indeed, Bluesky has shipped "breaking changes" themselves,
between #savedFeedsPref and #savedFeedsPrefV2.
Doing this is valid and allowed within the Lexicon spec
because you are only "adding new fields".
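The difference between a compatible addition and a breaking change can be checked mechanically. A minimal sketch in Python, assuming the evolution rules as I read them in the Lexicon spec (new fields must be optional, the set of required fields must not change, and types must not change). The function name and the example schemas are hypothetical, and only top-level properties are compared.

```python
from typing import List


def compatibility_problems(old: dict, new: dict) -> List[str]:
    """List changes between two object schemas that break old or new data."""
    problems = []
    old_required = set(old.get("required", []))
    new_required = set(new.get("required", []))
    for name in sorted(new_required - old_required):
        problems.append(f"{name}: newly required, old data may lack it")
    for name in sorted(old_required - new_required):
        problems.append(f"{name}: no longer required, new data may lack it")

    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for name in sorted(old_props.keys() & new_props.keys()):
        old_type = old_props[name].get("type")
        new_type = new_props[name].get("type")
        if old_type != new_type:
            problems.append(f"{name}: type changed from {old_type} to {new_type}")
    return problems


v1 = {
    "required": ["subject", "createdAt"],
    "properties": {
        "subject": {"type": "string"},
        "createdAt": {"type": "string"},
    },
}
# Adds an optional "tags" array: allowed.
v2 = {**v1, "properties": {**v1["properties"], "tags": {"type": "array"}}}
# Makes "tags" required: breaks records written before the field existed.
v3 = {**v2, "required": ["subject", "createdAt", "tags"]}

print(compatibility_problems(v1, v2))  # []
print(compatibility_problems(v2, v3))  # ['tags: newly required, old data may lack it']
```

A check like this says nothing about whether consumers actually read the new field, which is the concern raised by the questions that follow.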
Is the Bluesky application filling in both fields when a user updates
their preferences today? Are older app views that only understand v1
seeing those updates?
Monotonic int gives us the most basic versioning on the full lexicon,
while using fieldVX gives us this versioning within a lexicon, but still on
full defs, as is done in app.bsky.actor.defs#savedFeedsPrefV2.
When using the field level versioning of defs, omitting the lexicon revision
is probably the correct thing to do so you are always getting the most up to date
list of available versions. We are essentially publishing every version forever.
Both options lack the ability to express maturity like alpha|beta or major.minor.patch.
Kubernetes style is an extension of the fieldVX scheme and would give us maturity markers.
Semver is common and widely adopted, offering the greatest flexibility,
with both maturity and breaking change semantics (major.minor.patch-<extra>).
Where do we set the version?
We should also consider where the version is specified.
Ideally the version is separate from the record details,
as it is with the revision field on Lexicon.
The methods we see being used merge the name and version into a single string.
This complicates, for example, both the construction and decomposition of a ref
if you want to present a different view of a record depending on its version.
Without a clear delineation marker, the decomposition becomes even more difficult.
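The decomposition problem can be illustrated with a small sketch in Python. It assumes the only convention available is a trailing `V<number>` on the def name, as in `savedFeedsPrefV2`; `VersionedRef` and `split_ref` are hypothetical names, and `com.example.defs#appleTV4` is a made-up ref chosen to show the ambiguity.

```python
import re
from typing import NamedTuple, Optional


class VersionedRef(NamedTuple):
    nsid: str
    name: str
    version: Optional[int]  # None when no version suffix is found


# A def name followed by "V" and digits, e.g. "savedFeedsPrefV2".
_SUFFIX_RE = re.compile(r"(?P<name>.+?)V(?P<version>[0-9]+)")


def split_ref(ref: str) -> VersionedRef:
    """Split "nsid#defName" and guess a version from a trailing V<number>."""
    nsid, _, fragment = ref.partition("#")
    fragment = fragment or "main"  # a ref without a fragment points at the main def
    match = _SUFFIX_RE.fullmatch(fragment)
    if match:
        return VersionedRef(nsid, match.group("name"), int(match.group("version")))
    return VersionedRef(nsid, fragment, None)


print(split_ref("app.bsky.actor.defs#savedFeedsPrefV2"))
# VersionedRef(nsid='app.bsky.actor.defs', name='savedFeedsPref', version=2)
print(split_ref("app.bsky.actor.defs#savedFeedsPref"))
# VersionedRef(nsid='app.bsky.actor.defs', name='savedFeedsPref', version=None)

# Without a delimiter the guess can be wrong: a def that is really called
# "appleTV4" is split into the name "appleT" and version 4.
print(split_ref("com.example.defs#appleTV4"))
# VersionedRef(nsid='com.example.defs', name='appleT', version=4)
```

A dedicated version field, or at least a reserved separator, would remove the need for this kind of guessing.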
Having richer versioning as a standalone field
would require changing the spec, something I would support.
At this point, I prefer the vXbetaY (Kubernetes style).
Another consideration for version location is the depth or scope of versioning.
Are we versioning the full lexicon or definitions within them?
Should the practice of versioning Lexicon like Bluesky has be recommended against?
(with app.bsky.actor.defs#savedFeedsPrefV2 and "v1" intermixed with other defs)
Is the better practice to make them separate lexicon? (using the revision field,
which would be equivalent, at least in terms of information)
Other
@sdboyer also has some interesting ideas and insights around many interacting components
with lots of versioning of the objects and nested references.
Schemas should be able to evolve and we should also be able to express
how we move between versions directly in the schema system.
This is some pretty advanced stuff and is a good vision to keep in mind.
Even without all of this, there are complexities in a system with lots
of records, each having their own version, and referring to each other at various versions.
Sam can surely articulate these better than I can.
https://github.com/grafana/thema is the CUE project that implements these ideas.