Open Grant Proposal: Rough Opal

Name of Project: Opal

Proposal Category: devtools-libraries

Proposer: @tabcat

(Optional) Technical Sponsor: Dietrich Ayala, @autonome

Do you agree to open source all work you do on behalf of this RFP and dual-license under MIT, APACHE2, or GPL licenses?: Yes

Project Description

Opal is a peer-to-peer, local-first database. Its focus will be on providing web applications with dynamic and collaborative states. The core technologies used are:

  • IPLD - data reference
  • Merkle-CRDTs - the replica data structure
  • IPFS, IPNS, and Libp2p - update advertisement and replication
  • IPFS and Filecoin - backup and reliable hosting of replicas

The project is written in TypeScript and compiled to JavaScript. It will be robust, maintainable, and easy for developers to use. This project continues work done under a grant for OrbitDB that the owners of OrbitDB did not accept. Opal is not a fork of OrbitDB; it is a complete rewrite focused on efficiently representing arbitrary states. Opal will not be interoperable with OrbitDB.

Merkle-CRDTs are still at the heart of the project. This data structure combines Merkle-DAGs and CRDTs. It provides causal order and de-duplication of operations, and it ensures strong eventual consistency.
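To make the de-duplication and merge properties concrete, here is a minimal sketch. It is not Opal's API: the names `Entry`, `Replica`, `createEntry`, `merge`, and `heads` are made up for illustration, and `fakeCid` is a non-cryptographic FNV-1a hash standing in for a real CID.

```typescript
// Illustrative only: none of these names come from Opal.
type Cid = string;

interface Entry {
  cid: Cid;        // content address derived from payload + next
  payload: string; // the operation being recorded
  next: Cid[];     // CIDs of the entries this one causally follows
}

// FNV-1a: a stand-in for a real cryptographic hash / CID (assumption).
function fakeCid(data: string): Cid {
  let h = 0x811c9dc5;
  for (let i = 0; i < data.length; i++) {
    h ^= data.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Identical content always produces the identical CID.
function createEntry(payload: string, next: Cid[]): Entry {
  const sorted = [...next].sort();
  return { cid: fakeCid(JSON.stringify([payload, sorted])), payload, next: sorted };
}

type Replica = Map<Cid, Entry>;

// Merge is a set union keyed by CID, so it is commutative, associative,
// and idempotent. Shared entries are stored once.
function merge(a: Replica, b: Replica): Replica {
  return new Map([...a, ...b]);
}

// Heads are the entries that no other entry references.
function heads(replica: Replica): Cid[] {
  const referenced = new Set<Cid>();
  for (const entry of replica.values()) {
    for (const cid of entry.next) referenced.add(cid);
  }
  return [...replica.keys()].filter((cid) => !referenced.has(cid)).sort();
}
```

Two replicas that both extend the same entry while offline end up with two heads after merging; the `next` links preserve which operations came before which.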

The project will provide these two abilities as part of this grant:

Developers will be able to model custom states, similar to using Redux reducers. They can supply a way to reduce and read a state. These states are computed by reading the replica's causal log of operations. This log can be appended online or offline and then merged/synced at a later point.
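As a rough illustration of the Redux-like model, the sketch below reduces a log of operations into a key-value state. The `Op`, `KvState`, `Reducer`, and `readState` names are hypothetical and not taken from Opal, and the log is assumed to be already linearized in an order that respects causality.

```typescript
// Hypothetical shapes for illustration; Opal's real interfaces may differ.
interface Op {
  type: "put" | "del";
  key: string;
  value?: string;
}

type KvState = Record<string, string>;

type Reducer<S, O> = (state: S, op: O) => S;

// A pure reducer: it returns a new state and never mutates its input.
const kvReducer: Reducer<KvState, Op> = (state, op) => {
  const next = { ...state };
  if (op.type === "put" && op.value !== undefined) next[op.key] = op.value;
  if (op.type === "del") delete next[op.key];
  return next;
};

// Reading a state means folding the reducer over the ordered log.
function readState<S, O>(log: O[], reducer: Reducer<S, O>, initial: S): S {
  return log.reduce<S>((state, op) => reducer(state, op), initial);
}
```

Because the reducer is pure, the same log always produces the same state, whether the operations were appended online or merged in later.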

Developers will be able to pin user replicas to reliable storage backends. Each device can pin its replicas as CAR files and update its IPNS record. This allows peers to replicate from nodes that have gone offline. Data persistence has been an issue for peer-to-peer databases, and persistent replication is a big step forward.

These two abilities make it possible to create compelling, edge-computed apps.

Value

The pattern described above combines Merkle-CRDTs with reliable hosting of the CRDT replicas. It centers on edge computers processing and updating data, while more general machines keep the data available. The applications this pattern suits best fall into media or communication, like most of Google's app suite.

Building software this way has unique characteristics and goes hand in hand with delay-tolerant network designs like IPFS. The user has control over their data with the choice to self-host. The local replica is the source of truth, referred to as local-first.

With Opal, developers can define the state reducer for a database. A state reducer processes the log into a readable state. Collaborative UI components can use Opal to derive their state. This collaboration can be between users and a user's devices.

A big issue with peer-to-peer databases like OrbitDB is that if no other devices are online, you can't replicate anything! Persistent replication is needed, and there are two ways to do it.

The first continues the pinning service idea: Opal instances are kept online with a pin-list of databases to replicate. These servers run replication algorithms that work over pubsub and IPFS. As long as those pinners are online, the data is available and can be replicated. This solution has some benefits, but it's better described as 'persisted replication' since pubsub messages are not persistent.

The second solution for persistent replication swaps pubsub for IPNS. Instead of advertising its latest known heads over pubsub, a node publishes them with IPNS. Then the IPNS records and IPFS data are pinned. Another advantage is that IPFS and IPNS are more general layers, so they don't require building and supporting specialized infrastructure.
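The IPNS approach can be sketched with an in-memory stand-in for a name service. Everything here is an assumption made for illustration: `InMemoryNames`, `advertise`, and `latestRoot` are hypothetical names, and the record is reduced to a value plus a sequence number. The value is assumed to be a single root CID from which the current heads can be found.

```typescript
// In-memory stand-in for IPNS; every name here is hypothetical.
interface NameRecord {
  value: string;    // assumed: root CID that leads to the current heads
  sequence: number; // higher sequence means a newer record
}

class InMemoryNames {
  private records = new Map<string, NameRecord>();

  // Accept a record only if it is newer than the one already stored.
  publish(name: string, record: NameRecord): boolean {
    const current = this.records.get(name);
    if (current !== undefined && current.sequence >= record.sequence) return false;
    this.records.set(name, record);
    return true;
  }

  resolve(name: string): NameRecord | undefined {
    return this.records.get(name);
  }
}

// Writer side: after appending to the log, point the name at the new root.
function advertise(names: InMemoryNames, name: string, rootCid: string): NameRecord {
  const previous = names.resolve(name);
  const record: NameRecord = {
    value: rootCid,
    sequence: previous === undefined ? 0 : previous.sequence + 1,
  };
  names.publish(name, record);
  return record;
}

// Reader side: needs only the name service, not the writer being online.
function latestRoot(names: InMemoryNames, name: string): string | undefined {
  const record = names.resolve(name);
  return record === undefined ? undefined : record.value;
}
```

Unlike a pubsub message, the record stays resolvable after the writer disconnects, which is the property that makes the replication persistent once the record and the data it points to are pinned.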

architecture diagram

Keeping Merkle-CRDT replicas available allows applications to retain functionality when peers are not online.


Doing this with OrbitDB would be very difficult: the work done for OrbitDB under a previous grant was not accepted, and OrbitDB has some tight coupling. Opal is much more modular by comparison, especially in replication.

Opal also includes incremental traversal of the Merkle-DAG in either direction. This type of traversal is not part of OrbitDB and is the most significant change from that previous grant work. Incremental traversal allows for database entries to be kept out of memory and streamed when needed by traversing a graph of CIDs.
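A minimal sketch of incremental traversal, assuming a hypothetical synchronous `load` function in place of an asynchronous IPFS fetch. It walks backwards from the heads only; the `Entry` shape and the `traverse` name are illustrative and not Opal's API.

```typescript
type Cid = string;

interface Entry {
  cid: Cid;
  payload: string;
  next: Cid[]; // CIDs of the entries this one causally follows
}

// Hypothetical block loader; a real one would fetch from IPFS asynchronously.
type Load = (cid: Cid) => Entry | undefined;

// Walk backwards from the heads, yielding one entry at a time.
// Only the visited CIDs and the pending queue stay in memory, not the entries.
function* traverse(heads: Cid[], load: Load): Generator<Entry> {
  const seen = new Set<Cid>();
  const queue = [...heads];
  while (queue.length > 0) {
    const cid = queue.shift() as Cid;
    if (seen.has(cid)) continue;
    seen.add(cid);
    const entry = load(cid);
    if (entry === undefined) continue; // block not available, skip it
    yield entry;
    queue.push(...entry.next);
  }
}
```

The generator loads an entry only when the consumer asks for the next one, so a reader can stop early without touching the rest of the graph. This breadth-first walk visits each reachable entry once but does not by itself guarantee a causal ordering of the yielded entries.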

The most challenging part of this grant will be building persistent replication. It involves uploading the replica to pinning services as it's updated. This work is new; it will use CAR files and involve updating IPNS records.

Deliverables

Opal and Zzzync are the deliverables. Opal is the Merkle-CRDT collaborative states piece, and Zzzync will be a replicator module for persistent replication.

The features to be delivered for each are in the following issues:

Opal Base Feature Set

Zzzync Development Plans

Opal-Spec 1.0-beta

There will also be a monthly status issue in the Opal repo. The monthly issues will track what is being worked on and completed. Here is September's Status.

Deliverables will also be tracked in the README of tabcat/rough-opal, a repo made for this grant.

Development Roadmap

(SEPT 2022) Opal Repo Init

  • begin opal design and spec
  • configure project repo
  • write unit tests
  • prep for adding features
    • build interfaces for manifest modules
    • rewrite manifest module registry
    • rework store module
    • rework classes to use Libp2p's startable interface
  • make databases locally persistent

(OCT 2022) Opal Replication and Perf

  • begin design and spec of live replicator
  • add live replicator (Libp2p pubsub + IPFS)
  • test replication and replicated states
  • write benchmarks
  • automate release with generated API docs and changelog
  • release draft spec for Opal and main modules
  • release alpha with expected public API changes

(NOV 2022) Zzzync Replicator

  • begin design and spec of persistent replicator
  • choose a design to build
  • configure project repo
  • write implementation (likely using web3.storage and w3name)
  • write unit tests
  • test interop with Opal
  • write benchmarks
  • automate release with generated API docs and changelog
  • release draft spec
  • release alpha with expected public API changes

(DEC 2022) Release

  • Heavy Testing
    • network simulated testing with testground
    • stress-test and benchmark replicators
    • check for replication bugs and perf improvements
  • Usage References
    • Opal and Zzzync automated (and nice-looking) API docs
    • Write base FAQ.md document for common user questions
    • Basic Tutorial document added to repo or blogged
    • NodeJS and Create React App examples
  • Release Opal and Zzzync 1.0-beta
    • completed protocol specification
    • typescript implementation (with public API locked until 1.0)

Total Budget Requested

| Budget | Duration | Payable As |
| --- | --- | --- |
| $30000 | SEPT-DEC 2022 | FILECOIN |

1 Full-time Engineer over 4 months at $45/hr

Payments preferred monthly in Filecoin

Maintenance and Upgrade Plans

This grant will build a foundation that will define the base features and keep the project hyper-maintainable over the years. After the project reaches this level of maintainability, the key is cultivating a user base. Acquiring users will require exposure while providing documentation, a helpful community chat, and a valuable tool with great developer experience.

Following release there will still be room for improvement:

Nearer future (~2.0):

  • active replication: implementing the replication algorithm described in Byzantine Eventual Consistency; involves pushing data missed by a peer's bloom filter. useful for applications that want less latency, like messaging.
  • encrypted Merkle-CRDT: using a group encryption algorithm like Key Agreement for Decentralized Secure Group Messaging, which should fit quite nicely.
  • dynamic access control: update access control lists without affecting operation history
  • efficient predecessor referencing: allow quicker traversal and replication of the Merkle-DAG by picking references smartly, thanks to science.
  • graphsync replicator: using graphsync to improve replicator performance.
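The active replication item above relies on a bloom filter to decide what to push. The sketch below shows only that general idea and is not the algorithm from the cited paper: a peer summarizes the CIDs it holds in a filter, and the sender pushes every local CID the filter does not contain. `BloomFilter` and `entriesToPush` are hypothetical names, and the seeded FNV-1a hash is a demo-quality choice.

```typescript
// Minimal bloom filter; sizes and hashing are demo choices (assumptions).
class BloomFilter {
  private bits: Uint8Array;
  private size: number;
  private hashes: number;

  constructor(size: number, hashes: number) {
    this.size = size;
    this.hashes = hashes;
    this.bits = new Uint8Array(size);
  }

  // Seeded FNV-1a, reduced to a bit position.
  private position(item: string, seed: number): number {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < item.length; i++) {
      h ^= item.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h % this.size;
  }

  add(item: string): void {
    for (let s = 0; s < this.hashes; s++) this.bits[this.position(item, s)] = 1;
  }

  // May return true for items never added, never false for items that were.
  has(item: string): boolean {
    for (let s = 0; s < this.hashes; s++) {
      if (this.bits[this.position(item, s)] === 0) return false;
    }
    return true;
  }
}

// Push every local CID that the peer's filter definitely does not contain.
function entriesToPush(localCids: string[], peerFilter: BloomFilter): string[] {
  return localCids.filter((cid) => !peerFilter.has(cid));
}
```

A bloom filter has no false negatives, so nothing the peer already holds is pushed again. False positives mean a few missing entries can be skipped, so a real protocol needs a follow-up step to catch them.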

Further future (~3.0):

  • dynamic topological sort: maintaining a topological sort of the DAG as entries get merged. not sure if it is possible to do it deterministically and will need to revisit. see A Dynamic Topological Sort Algorithm for Directed Acyclic Graphs.
  • finality gadgets: this would look at the best ways to migrate databases without too many side effects.
  • CBOR CRDT: a CBOR state where each field in the CBOR object is fully mergeable.
  • cross-log causality: use chained randomness beacons like drand to provide a universal causality. seems useful for some applications?
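For the dynamic topological sort item above, it may help to see what a deterministic order looks like when it is simply recomputed from scratch. The sketch below is Kahn's algorithm with ties broken by CID. It is not the dynamic algorithm from the referenced paper, and the `Entry` shape and `topoSort` name are hypothetical.

```typescript
type Cid = string;

interface Entry {
  cid: Cid;
  next: Cid[]; // CIDs of the entries this one causally follows
}

// Kahn's algorithm with ties broken by CID, so every replica holding the
// same DAG computes the same order. Recomputed in full on each call.
function topoSort(entries: Entry[]): Cid[] {
  const byCid = new Map<Cid, Entry>();
  for (const e of entries) byCid.set(e.cid, e);

  const pending = new Map<Cid, number>(); // predecessors not yet emitted
  const successors = new Map<Cid, Cid[]>();
  for (const e of byCid.values()) {
    // Ignore duplicate links and predecessors that are not in this set.
    const known = [...new Set(e.next)].filter((cid) => byCid.has(cid));
    pending.set(e.cid, known.length);
    for (const p of known) {
      successors.set(p, [...(successors.get(p) ?? []), e.cid]);
    }
  }

  const ready = [...byCid.keys()].filter((cid) => pending.get(cid) === 0).sort();
  const order: Cid[] = [];
  while (ready.length > 0) {
    const cid = ready.shift() as Cid;
    order.push(cid);
    for (const s of successors.get(cid) ?? []) {
      const left = (pending.get(s) as number) - 1;
      pending.set(s, left);
      if (left === 0) {
        ready.push(s);
        ready.sort(); // keep the smallest CID first for determinism
      }
    }
  }
  return order;
}
```

The open question in the roadmap item is whether this order can be maintained incrementally as entries are merged, instead of being recomputed like it is here.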

Team

Team Members

Daniel, @tabcat

Team Website

https://github.com/cypsela

Relevant Experience

I have been involved with OrbitDB since finding it in late 2018, shortly after diving into IPFS.

In March 2020, I started working on a collaborative filesystem on top of IPFS using OrbitDB. In July 2020, a partner and I leveraged this to build sailplane, a p2p Dropbox-like web app that made us finalists in the first HackFS.

In February 2021, I was contracted by equilibrium.co to maintain OrbitDB.

In November, we signed an open source grant from Protocol Labs to fund the first part of development for OrbitDB 1.0. During those six months, I kept the current version compatible with the latest js and go IPFS versions and worked on a protocol spec and implementation for 1.0.

That work ended up not being accepted by the owners of OrbitDB, so I'm continuing it with Opal.

In July 2022, I began thinking more about persistent replication methods for OrbitDB involving IPFS and IPNS pinning. While on a trip during HackFS 2022, I learned about web3.storage and w3name. After the trip, I hacked them together for a short demo project and won the IPFS/Filecoin first prize.

Team code repositories

Daniel:

  • zzzync: 2 day hack/concept using web3.storage, first prize at hackfs2022 [ref1]
  • sailplane: collaborative filesystem web app built with orbitdb and ipfs, finalist at hackfs2020 [ref1] [ref2]
  • orbit-db-fsstore: a custom orbitdb database representing a filesystem
  • orbitdb: community maintainer and former full-time maintainer

Additional Information

How did you learn about the Open Grants Program?

From previously working on an open source grant from Protocol Labs for OrbitDB.

Please provide the best email address for discussing the grant agreement and general next steps.

tabcat00@proton.me


For now, Opal is a temporary/code name until I find something better. (which may be never)

Always open to hearing naming ideas! 😁