Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add repo.load #96

Closed
wants to merge 3 commits into from
Closed

Add repo.load #96

wants to merge 3 commits into from

Conversation

jessegrosjean
Copy link
Contributor

I'm not sure that it's the right design, but I'm playing with a bundle repo.load implementation to see what it looks like.

I think it could be used as a safer alternative to create(id). I also think it could be a way to reverse a repo.delete.

The general idea is a new repo operation that has create or update behavior. I think it could be a good fit for both loading external NSDocument stored data, and it also I "think" makes sense for reversing a delete.

Of course this pull isn't ready, but I figured I would make it so we could have some code to look and and place to attach discussion.

@@ -508,6 +508,61 @@ public final class Repo {
let resolved = try await resolveDocHandle(id: handle.id)
return resolved
}

public struct Bundle {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This already exists as DocHandle, so I'd prefer to use that instead of Bundle here

/// - If existing document with same ID is known to repo then this will merge bundle.doc with existing
/// - If deleted document with same ID is know to the repo then this method will replace that state with new document handle
/// - If repo don't know the ID then will behave like create
public func load(bundle: Bundle) async throws -> DocHandle {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is totally bikeshedding, but what do you think about the idea of this function being add or import instead? I half-wanted to keep load for the concept of loading through a StorageProvider without inquiring over the network to peers (if any are connected)

// immediatly availible. Any pending action (load, request, maybe other stuff) on
// that ID should be incorported into loaded document.
if documentIds().contains(bundle.id) {
let handle = try await find(id: bundle.id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

find() implies the following:

  • create an InternalDocHandle, initial state of .idle
  • attempt to load from any defined persistent storage location, if one is configured (.loading)
  • if that fails, ask for the ID from any connected network peers (.requesting)
  • throw Unavailable if nothing else works

The states for InternalDocHandle are defined at https://github.com/automerge/automerge-repo-swift/blob/main/Sources/AutomergeRepo/InternalDocHandle.swift#L7C5-L14C6

I think instead you'd want to access the list of internal doc handles (handles) by documentId. If one doesn't exist, add it. If it does exist, then do the logic to merge if there's a document already in place. If you're expecting a document to be in persistent storage already, and want to allow the whole "merge" thing, then currently find() IS the best place for that.

Kind of an open question of what you'd want to happen if a handle was there but in a state of .requesting or .loading. The state of InternalDocHandle is driven by what amounts of recursive calls to repo.resolveDocHandle(:id) - which takes advantage of the asynchronous flows to do the state sequence.

In practice, those states mean that a find() is in progress but hasn't yet finished (or maybe has, and returned an Unavailable expection), and you're concurrently (or later) appearing to invoking this function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized after writing this that I was making an assumption, perhaps incorrectly, that you wanted to perhaps check the storage provider without checking a network provider. The state sequence with the InternalDocHandle and the resolveDocHandle(:id) doesn't provide that path today. But it's lacking in several places, including the idea that the filesystem could also present events that indicate that files on disk have been updated and should be re-read (the idea of StorageProvider becoming an active contributor to possible updates, akin to NetworkProvider, rather than just passive)

It may be that what we're doing with states that, while replicated from the JS project, doesn't match well to iOS/macOS needs. and that might need a re-think in how we approach handling all these potentially different incoming events that mean merging, updating, storing, requesting, or sharing Automerge Documents.

let resolved = try await resolveDocHandle(id: handle.id)
return resolved
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think along with this, we should consider dropping create(id:doc) and create(id:data) - or deprecating those and pointing to this as a rename, but this early in the game I think we could just drop that API as renamed and be good, we're only at version 0.1.1, so it's a bump to 0.2.0 when we cut a release, but that makes a lot of sense to me (and is less overhead)

@jessegrosjean
Copy link
Contributor Author

jessegrosjean commented May 15, 2024

Before I get too deep into this, I'm happy to proceed with your continued feedback. I'm also happy for you to just take over with your own implementation. I think we agree on core needs. Let me know what you think is best.

Edit I just noticed your help wanted on network availability feed. Maybe we can trade. I can take that over and I'll leave load/import for you? I feel like proper implementation of this load/import command requires quite a bit of internal knowledge. Network availability feed seems easier to get my head around. I'm also OK continuing on this fork if that's what you'll rather me do, I just feel like I'm going to get it wrong a few times before it's right :)

@jessegrosjean
Copy link
Contributor Author

Oh, maybe you've already seen this, but I just found this Pre-defined ID thread:

https://discord.com/channels/1200006940210757672/1230453235207110666

Same exact problem we are trying to solve here.

@jessegrosjean jessegrosjean marked this pull request as draft May 15, 2024 13:18
@heckj
Copy link
Collaborator

heckj commented May 15, 2024

Sure, I can switch in and update these APIs - I definitely have the internal knowledge. I'd love to have you actively contributing to the project as well, and when I saw the PR I was super-happy to have you poking at it for your needs!

I'm good to proceed either way - if you want to avoid getting too deep in that state machine internals (and the crazy that is the document resolver), I can run with this. There's no obligation to "trade" or anything, so it's really your call. Let me know what would be easier for you, and we can make it work.

The one thing I wasn't sure of were the semantics you were wanting re: find or not - do you want it request from networking in this use-case-scenario, or just look in the local storage, if available?

(and yeah - thanks for the link, I was familiar with the thread. It's been a discussion in the project going back quite a ways)

@jessegrosjean
Copy link
Contributor Author

Sure, I can switch in and update these APIs - I definitely have the internal knowledge

Yeah, I would prefer this, and thanks :) I think it would require me to do a lot of reading before being confident about handle state changes.

There's no obligation to "trade" or anything, so it's really your call. Let me know what would be easier for you, and we can make it work.

Well I won't grab it today, going to try to focus on some work I "should" be doing, but I have my eye on that issue. I think it would be a good first one for me.

The one thing I wasn't sure of were the semantics you were wanting re: find or not - do you want it request from networking in this use-case-scenario, or just look in the local storage, if available?

I think this call should return without waiting on network, but with waiting to read storage (and merge if needed). With that said I'm not really sure about any of the details, but I think we both understand the high level problem in the same way. Externally stored Document/ID pair ... get into repo! Merge if it already exists. Details needed to implement that, I'm not entirely sure.

@heckj
Copy link
Collaborator

heckj commented May 15, 2024

@jessegrosjean - check out #99

@heckj
Copy link
Collaborator

heckj commented May 20, 2024

@jessegrosjean You okay with my closing this PR?

@jessegrosjean
Copy link
Contributor Author

Yes, the new import is what I was after.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants