Exposing archived versions as content-addressable hashes (i.e., content-addressable snapshots) #985

Open
1 of 4 tasks
okdistribute opened this issue Apr 27, 2018 · 2 comments

Comments

@okdistribute
Collaborator

okdistribute commented Apr 27, 2018

In a public archiving and publication case (e.g., a static PDF), you might want to reference a version of the data by a content-addressable hash. That way, even if someone loses the original dat keys, they can still get the data as long as it is on the network, and anyone who has the content of the archive can regenerate the same hash and re-share it.

This would also be an interesting approach to use for implementing a CDN-style interface.

Security note: perhaps there's also a way to hide the original content-addressable hash from the network to protect reader privacy, similar to how we use discovery keys today. Because this method would likely be reserved for archives that are deemed public and static, it might be nice to decouple the snapshot archive from the original dat keys that generated it, to preserve the privacy of the original dat where that's desirable.
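
To make the privacy idea concrete, here is a minimal sketch of deriving a separate lookup key from a content hash with a keyed hash, loosely modeled on the idea behind discovery keys. The function name `blinded_lookup_key` and the `b"snapshot"` context string are made up for illustration and are not part of any Dat API.

```python
import hashlib


def blinded_lookup_key(content_hash: bytes, context: bytes = b"snapshot") -> bytes:
    """Derive a key to announce on the network instead of the content hash.

    Hypothetical helper: keyed BLAKE2b with the content hash as the key and a
    fixed context string as the message.
    """
    return hashlib.blake2b(context, key=content_hash, digest_size=32).digest()


# Stand-in for a real archive's content hash (32 bytes).
content_hash = hashlib.blake2b(b"example archive", digest_size=32).digest()
lookup_key = blinded_lookup_key(content_hash)
print(lookup_key.hex())
```

A peer that already knows the content hash can recompute the lookup key and find the swarm, while an observer who only sees the lookup key cannot work backwards to the content hash. It does not hide anything from someone who already holds the hash.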

Feature ideas:

  • a method to get the content-addressable hash of an archive
  • swarm broadcasts that hash on the network
  • when listening for a 'live' dat key, be able to find these static versions from the swarm if anyone has them
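
As a rough illustration of the first idea, here is a sketch of a hash that depends only on the paths and contents of the files in an archive, so anyone holding the same content can regenerate it. The function name `content_address`, the choice of BLAKE2b, and the path/length encoding are all assumptions made for this example; this is not how Dat derives keys.

```python
import hashlib
from typing import Mapping


def content_address(files: Mapping[str, bytes]) -> str:
    """Return a hex digest determined only by file paths and file contents."""
    outer = hashlib.blake2b(digest_size=32)
    # Sort paths so the result does not depend on the order files were added.
    for path in sorted(files):
        name = path.encode("utf-8")
        file_digest = hashlib.blake2b(files[path], digest_size=32).digest()
        # Length-prefix the path so adjacent fields cannot run together.
        outer.update(len(name).to_bytes(8, "big"))
        outer.update(name)
        outer.update(file_digest)
    return outer.hexdigest()


a = content_address({"index.html": b"<h1>hi</h1>", "paper.pdf": b"%PDF-1.4"})
b = content_address({"paper.pdf": b"%PDF-1.4", "index.html": b"<h1>hi</h1>"})
assert a == b  # same content, same address, regardless of insertion order
```

Because nothing here involves a keypair, two people who have never talked to each other would arrive at the same address for the same set of files.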

I am reporting:

  • a bug or unexpected behavior
  • general feedback
  • feature request
  • security issue
@pfrazee

pfrazee commented Apr 27, 2018

A couple of thoughts I've had on this:

  • It used to be possible to create a dat archive which is content-addressed, and that would be a separate archive. You basically get all the files ahead of time, write them, and then a resulting hash becomes the key (I don't know what that hash was of; details, details). One method could be to bring that function back, and then a "snapshot" would basically be a totally separate archive that's hash-addressed, constructed from the state of the dynamic dat.
  • If needed, we can create identifiers that look like dat+snapshot://{key}/. Basically we can create addressing/protocol variants using the + in the scheme.
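
A small sketch of how a client might tell a `dat+snapshot://` identifier apart from a plain `dat://` one. The `DatAddress` type, the `parse_dat_url` name, and the assumption that a key is 64 hex characters are illustrative only; no such parser is specified in this thread.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class DatAddress(NamedTuple):
    variant: Optional[str]  # e.g. "snapshot", or None for a plain dat:// URL
    key: str
    path: str


# Assumption for this sketch: keys are 64 lowercase hex characters.
KEY_RE = re.compile(r"^[0-9a-f]{64}$")


def parse_dat_url(url: str) -> DatAddress:
    parts = urlsplit(url)
    # Split "dat+snapshot" into the base scheme and the variant after "+".
    base, plus, variant = parts.scheme.partition("+")
    if base != "dat" or (plus and not variant):
        raise ValueError(f"not a dat URL: {url!r}")
    key = parts.netloc.lower()
    if not KEY_RE.match(key):
        raise ValueError(f"unexpected key format in {url!r}")
    return DatAddress(variant or None, key, parts.path or "/")


example_key = "ab" * 32  # placeholder key, not a real archive
print(parse_dat_url(f"dat+snapshot://{example_key}/paper.pdf"))
```

Keeping the variant in the scheme means existing `dat://` links keep parsing the same way, and clients that do not understand a variant can reject it up front.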

@okdistribute
Collaborator Author

@pfrazee nice ideas!

Bret also mentioned that people were musing about this but wanted to wait for implementation until multiwriter/hyperdb lands on mainline, which makes sense.
