Pruning Documentation #421

r5h · 2024-09-02T06:18:00Z

I read your section on pruning, and it left me with these questions:

1. Presumably there’s a guarantee that if we prune updates from a stale client, we’ll get the correct update after that client updates. If that is true, it would be great to have that written out explicitly (reading the current section leaves me quite worried about dropping updates).
2. Isn’t it simpler to just opt out of stale syncs? Like, if you try to sync to a server that has a newer schema version, just block syncing until the client is up to date.

a-type (Maintainer) · 2024-09-03T15:12:10Z

Thanks for the feedback! I always struggle to tell if I'm communicating things well in the docs.

Pruning and data loss

Yes, as long as a client which can still see the data (has not pruned it) updates to the new schema, the document will be fully migrated and restored on all peers. There remains one edge case, where no older client with that data ever upgrades and syncs again (they stay offline forever, or past the 'truancy' threshold). In that case, the document will remain pruned indefinitely. So I can't quite guarantee it, but these edge cases are extremely rare and remain recoverable. If the document is pruned entirely, it's as good as never created, but since the client which created it never came back, that's kind of ok in practice (no one else ever learned of its existence). If the document is partially pruned, you can still update it and write a new valid state to it.

One reason this is so rare is that any client running the old version has an opportunity to 'adopt' the pruned doc and upgrade it. This functionality does rely on PWA behavior and service workers caching the app's code, so it's kind of dependent on your app. But for most typical PWAs, the cached codebase (with an old schema) will load first, the new code will download in the background, and that new code won't run until the user closes the app or manually updates it. During that window where old code is running, the app has the opportunity to sync existing data and 'adopt' the pruned document into its database. Then, when the app is updated, it will migrate it.

I've spent a lot of time recently thinking about this edge case and whether there was a practical way to close it up, but ultimately I think this will be one flaw which remains unless Verdant is rewritten. The risk and the impact are not worth the complexity which would be required to close the loophole.

Opting out / server with new data

Pruning doesn't happen when an older-version client receives newer data, but rather when a newer-version client receives old (and unmigrated) data. Because migration can only happen once, when the new codebase is loaded (an unfortunate but probably permanent limitation), any data the client hadn't synced before it loaded up the new schema, and which is then synced after the migration, may have inconsistencies. This data can't just be ignored or blocked, since it's from an older schema version and indistinguishable from other older operations which are perfectly valid. We can only tell if something is wrong when the view of the data is constituted, and this is the point when pruning happens.
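To make that concrete, here is a toy model of the idea, not Verdant's actual implementation. Every name in it (`ToyOperation`, `ToySchema`, `buildView`) is a made-up assumption for illustration. It shows why old-version operations can't be filtered out by version alone: one of them is perfectly valid under the new schema and the other isn't, and that only becomes visible once the view is built.

```typescript
// Toy model only: these types and functions are illustrative assumptions,
// not Verdant's internals or API.
type ToyOperation = {
  schemaVersion: number; // schema version the writing client was running
  field: string;
  value: unknown;
};

type ToyFieldValidator = (value: unknown) => boolean;
type ToySchema = { version: number; fields: Record<string, ToyFieldValidator> };

// Hypothetical v2 schema: `title` is a string, `tags` must be an array.
const toySchemaV2: ToySchema = {
  version: 2,
  fields: {
    title: (v) => typeof v === "string",
    tags: (v) => Array.isArray(v),
  },
};

// Apply operations in order (last write wins), then validate each schema
// field of the resulting view. Fields that fail validation are pruned.
function buildView(ops: ToyOperation[], schema: ToySchema) {
  const raw: Record<string, unknown> = {};
  for (const op of ops) {
    raw[op.field] = op.value;
  }

  const view: Record<string, unknown> = {};
  const pruned: string[] = [];
  for (const [field, isValid] of Object.entries(schema.fields)) {
    if (isValid(raw[field])) {
      view[field] = raw[field];
    } else {
      pruned.push(field);
    }
  }
  return { view, pruned };
}

// Two v1 operations that arrive after this client already migrated to v2.
// Both carry the same old version; only the second is invalid in v2.
const lateOps: ToyOperation[] = [
  { schemaVersion: 1, field: "title", value: "Groceries" },
  { schemaVersion: 1, field: "tags", value: "food" }, // v1 stored a string
];

const lateResult = buildView(lateOps, toySchemaV2);
console.log(lateResult.pruned);
```

In this sketch the `title` write survives and only `tags` is pruned, which corresponds to the "partially pruned" case described earlier.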

As an aside, if a client with an older version syncs to a library which has operations from a new schema version it hasn't seen yet, it will just ignore those new operations and operate on the library as if it were still on its current version. It will emit an event called futureSeen, which it's recommended the app listen to so it can proactively nudge the user to update the app. This older client can interact with newer ones, and push operations (which will all be versioned to the older schema, and so "in the past" logically relative to its peers). But it will probably not see the same world as its peers, and it will not see their changes at all as they happen, since those changes will be new-versioned. Hence the event; the older client's user will almost definitely have a diminished experience, and many of the changes they make may be invalid in the new schema.
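A rough sketch of what listening for that event could look like. Only the event name `futureSeen` comes from the description above; the `subscribe` method shape, the `VersionEventSource` interface, and the `FakeClient` stand-in are assumptions made so the example runs on its own. Check the Verdant client docs for the real subscription call.

```typescript
// Assumption: the client exposes some subscribe-style method that returns
// an unsubscribe function. This interface is hypothetical.
interface VersionEventSource {
  subscribe(event: "futureSeen", handler: () => void): () => void;
}

// Show an "update available" prompt the first time futureSeen fires,
// and ignore repeats so the user isn't nagged on every sync.
function nudgeOnFutureSeen(
  source: VersionEventSource,
  showUpdatePrompt: () => void,
): () => void {
  let prompted = false;
  return source.subscribe("futureSeen", () => {
    if (prompted) return;
    prompted = true;
    showUpdatePrompt();
  });
}

// Stand-in for a real client, only so the sketch is self-contained.
class FakeClient implements VersionEventSource {
  private handlers: Array<() => void> = [];

  subscribe(_event: "futureSeen", handler: () => void): () => void {
    this.handlers.push(handler);
    return () => {
      this.handlers = this.handlers.filter((h) => h !== handler);
    };
  }

  // Simulates the client noticing operations from a newer schema version.
  emitFutureSeen(): void {
    this.handlers.forEach((h) => h());
  }
}

const fakeClient = new FakeClient();
let promptCount = 0;
const stopNudging = nudgeOnFutureSeen(fakeClient, () => {
  promptCount += 1; // a real app would show a banner or reload prompt here
});

fakeClient.emitFutureSeen();
fakeClient.emitFutureSeen(); // second event does not prompt again
```

The once-only guard is a design choice for the sketch; an app could just as well re-prompt after some interval.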

I hope this answers some questions. I'll try to pull out some ideas from this writeup and clarify the docs. I didn't know anyone was reading them! 😄


a-type Sep 3, 2024
Maintainer

BTW, as you can probably tell, Verdant doesn't really 'solve' migrations. There's always a risk of data loss if you let older-version clients continue pushing updates; Verdant just uses several tricks to minimize that risk as much as possible, and I think they're pretty successful.

The best way to ensure pruning doesn't affect your data integrity in any meaningful way is to design schemas to be backwards-compatible, even though Verdant doesn't require it. Nullable fields, retaining old fields, etc.: all of this is still possible in Verdant for the integrity-conscious. I prefer to live a bit more dangerously for the benefit of having a cleaner schema, which is also allowed.
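As an illustration of the backwards-compatible approach, here is a small sketch using plain validator functions. It does not use Verdant's schema definition API, and the item shape (`id`, `done`, `dueDate`, `status`) is invented for the example. The compatible v2 keeps the old `done` field and adds `dueDate` as optional/nullable, so data written by a v1 client after migration still validates. The "cleaner" v2 replaces `done` with `status`, so the same data would be a pruning candidate.

```typescript
// Hypothetical item data as it might arrive from any client version.
type RawItem = Record<string, unknown>;

// Backwards-compatible v2: retains `done`, adds `dueDate` as nullable.
// Data from v1 clients simply lacks `dueDate`, which is accepted.
function isValidItemV2Compatible(raw: RawItem): boolean {
  const due = raw["dueDate"];
  const dueOk = due === undefined || due === null || typeof due === "string";
  return (
    typeof raw["id"] === "string" &&
    typeof raw["done"] === "boolean" &&
    dueOk
  );
}

// "Cleaner" v2: drops `done` in favor of a required `status` string.
// Anything still written in the v1 shape fails this check.
function isValidItemV2Clean(raw: RawItem): boolean {
  return typeof raw["id"] === "string" && typeof raw["status"] === "string";
}

// Written by a client still on v1, synced after peers migrated to v2.
const writtenByOldClient: RawItem = { id: "a1", done: false };

console.log(isValidItemV2Compatible(writtenByOldClient));
console.log(isValidItemV2Clean(writtenByOldClient));
```

The trade-off is the one described above: the compatible schema carries legacy fields forever, while the clean one accepts some risk of pruning until old clients update.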

One aspect of the nature of web apps which is highly in our favor here is that they are served from the internet, which acts as an excellent update-delivery mechanism. Unlike locally installed software, it's much harder to stubbornly stay on an old version of a PWA, to the point where it's nearly impossible to do so for more than a few hours while still syncing data (i.e. still connected to the internet; i.e. will receive the new code and be forced to update to it on next launch).

Answer selected by r5h
