-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Add a version indicator to the various DMDs we create. #420
Comments
We already have a version in the snapshot metadata. |
I don't think we need an immediate version now. If we do need a second version in the future, we can add one using an "impossible" / invalid-snapshot-name like you suggest and that can be "version 2" essentially. ..or, we could just add additional information that way and "never" need a version for the directories themselves. Or we could use the Tahoe metadata for this. So closing this in favour of doing #395 and nothing else. |
We'll need to ensure that we notice a file with an "invalid" name, before doing any processing on the folder, so that we notice that we have an incompatible folder before doing anything with it. |
To expand on that, if we depend on just #395, we need to somehow ensure that the entry we use gets processed before anything else. Looking at the current code, the downloader appears to use a dictionary for files, which at least on py2.7 will be in a random order, and I don't think the uploader ever looks at the DMD, so we have no way to reliably notice a newer version of the DMD in either of the processes. |
That is addressed by #395
Yes, if we introduce a second version or anything else, we'd need to add code (for that plus probably other things). We don't need that code now (nor do we need a version marker now). |
We might only notice it after we made changes to either the DMD or the file system, which might cause issues. |
Are you saying it's impossible to add |
I'm saying it is impossible[1] to add it in such a way that the current version will notice that it is dealing with a later version early enough to avoid doing something with the DMD that is inappropriate because we've changed the format in some incompatible way. Now, I don't what incompatible changes we might make, or what could go wrong if an older client were to try to read or write such a version, but the cost of adding a version field now is low enough, that we should do it now, before we have an installed base to worry about. One thing to consider, is that if we need to depend on the the behaviour of old clients not breaking a newer incompatible format because they error out early, that it may get harder to do, the longer it is before we introduce an incompatible change, since there will be more versions we'll need to ensure trigger an error early enough. [1] Maybe not entirely impossible, but very much constrained by incidental implementation details. Have a look at the discussion here, on the hoops that were needed to be dealt with to be able to extend a protocol that didn't originally have it. |
Thinking about this a bit more, in terms of implementation, I think it may make sense to also make sense to carve out some space that the initial version will ignore, where we can add options or metadata for newer versions, that are not incompatible. (That is, options or metadata that can safely be ignored by a client that doesn't understand them). That would give us some space for compatible extensibility as well. |
All we have to do now is ignore "impossible" names like |
We'd need to error out on finding a name like |
I'm specifically considering the possibility of us wanting to make a change in the future, where the current uploader or downloader code would do something bad with a new version of the DMD, even after ignoring impossible names. I don't have a specific example, but I suspect we will encounter one, if the project is successful. |
Why? What's wrong with saying we ignore anything that starts with |
For a lot of changes, that may be enough. I am presuming that at some point, we will want to make a change where just ignoring those files will cause a severe issue, such as data loss or something. I don't have a specific example, but experience (both mine and others) suggests that not having a such mechanism is something that we will eventually regret. In some ways, ignoring things would be worse than erroring with something unexpected. At least you would get an error after some amount of time, rather than silently corrupting data or doing whatever bad thing that happens. |
I think that having an explicit schema for persistent data is extremely important. It is certainly possible to play tricks where the schema is only partially defined (like "@Version is an illegal name so we can always encroach on that part of the namespace in the future") but these are much more difficult to reason about in general and it only gets harder to do so as the software continues to change. Recording a version number in the data and defining the schema elsewhere (eg, in the implementation) is fine. Sometimes you get the first one for free by saying "if the [schema] version isn't explicitly defined, it's 0". Ideally the space in the persistent data for the version would still be protected by the implementation. I think So the minimum necessary to make magic-folder (let's call it) 2 safe with respect to magic-folder >2 is to check for One thing we do want to consider right now is whether some particular versioning scheme lets us atomically retrieve DMD contents and know their version. A scheme that involves two independent reads probably does not allow this. If I think this still means we can treat "lack of @Version" as meaning "[schema] version 0" and only do the implementation work to have magic-folder create a [schema] version 1 "@Version" when we eventually change the schema. |
Yes, checking and erroring (before reading or writing the rest of the DMD) on a |
There is the question of if we want to notice old tahoe-based magic-folders, and error on them, since they aren't compatible. Adding a file now would allow us to do that. But that isn't a hard requirement for future compatibility. |
Under what scenario would we encounter something like that? |
Somebody tries to migrate from tahoe-based magic-folder by adding the DMD as a participant or something? |
If we aren't already, we should be explicit that if you screw up while using the low-level API -- which is what "add participant" is -- then you own all the pieces. We don't need to guard against people doing weird stuff like that since it's pretty far outside what any user will be doing (actual users will add participants via "invites"). I think what @exarkun is suggesting (and I agree) is that we can say "a mutable directory with no If even that is insufficient in some weird pathological case nobody has even imagined as of yet then we can always fall back on the Snapshots versions to bootstrap something else. |
I agree that it isn't necessary, but it does seem like a nice quality-of-life feature, that we may not get to in time. I will note that, right now, the low-level API is what is documented in #381 as the way to recover, so, if that were to land, it doesn't seem too far outside what a user will be doing. Slightly more plausible, but with the same behaviour, is if gridsync doesn't differentiate between backups from old-magic-folders and new-magic-folders, it could mistakenly pass the DMD to the former to us. Again, let me re-iterate, I'm not suggesting that handling this particularly case is necessary, but it does seem like a plausible thing that could happen, that it would be nice to guard against. |
No, the user will be using GridSync to do the recovery flow. The document you're referencing is explicitly for "other developers integrating with magic-folder". Anyway, we're in agreement that there's nothing to be spec'd or implemented here? |
No. We need to implement checking for a |
I don't think that is sufficient, at least if we implement #395 by making the conversion functions raise an error. We would need to proactively check that there are no such files before reading or writing to a DMD. In any case, I think having an explicit check, as @exarkun explained, is easier to reason about, so we should implement this directly. |
I take @exarkun's comment to mean "nothing to do here now, because we can add a This future-proofs current clients because they will error upon seeing the invalid name (if, say, a new magic-folder client created a new DMD with I can't even imagine how that might happen, though, because we have no user-visible way to add a new participant (and when we do, it'll be via the magic-wormhole invite process so we could always just stop there instead of messing about with DMD contents). |
They don't error soon enough, with just #395 implemented, as demonstrated by the integration test in https://github.com/tp-la/magic-folder/commit/038fbdc4b56962bf33caf6d0e358edf59858d96a. We don't notice that there is a |
It could happen if somebody tries to recover a folder from a newer magic-folder with an old magic-folder. (That is, via the corresponding gridsync in both cases) |
I can imagine that at some point in the future, we'll want to change the structure of DMDs, or add extra non-snapshot data to them. We should add version metadata to them (perhaps a
@version
file, which won't conflict with any of the snapshot entries, see #395).We can then check that version before doing anything with a remote DMD.
It probably also makes sense to include the version in the snapshot metadata, for similar reasons.
The text was updated successfully, but these errors were encountered: