This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

tidy up design
gdbelvin committed Mar 6, 2020
1 parent 3ca8f89 commit 78b2381
Showing 1 changed file with 54 additions and 47 deletions.
101 changes: 54 additions & 47 deletions docs/design2.md
@@ -2,44 +2,51 @@

## Objective

Key Transparency currently suffers from a design-space tradeoff in two dimensions between
speed and verification bandwidth. Each snapshot of the key-value Merkle tree creates an additional
point in time that clients must audit when inspecting their account history.
Using Key Transparency in this mode requires the presence of powerful auditors who can verify
that no information has been lost between all snapshots.

| Update Speed | Trust, then Verifiy | Verify First |
|--------------|---------------------|--------------|
| hr | Tractable Bandwidth | Unusable |
| min | | Intractable |
| s | | Intractable |


Key Transparency 2.0 is a new set of algorithms that resolves this fundamental tradeoff by sharing
the task of auditing between pairs of clients. As a result, all configurations result in efficiently
auditable data structures, without the presence of 3rd parties.

We expect that many applications will want to prioritize speed and reliability by using a trust, then verify mode, but because the client operations are very similar, users could have the option to wait a bit longer in order to have high confidence that they are using data from a snapshot that is widely available.

| Update Speed | Trust, then Verifiy | Verify First |
|--------------|---------------------|--------------|
| hr | Tractable Bandwidth | Unusable |
| min | Tractable Bandwidth | Opt-in |
| s | | Tractable |
Key Transparency currently suffers from a design-space trade-off in two dimensions between
low latency key updates and verification bandwidth; each snapshot of the
key-value Merkle tree creates an additional point in time that clients must
audit when inspecting their account history.
Using Key Transparency v0.4.0 and below requires the presence of powerful
auditors who can verify that no information has been lost between all snapshots.

| Update Latency | History Audit | Key Lookup |
|----------------|------------------|------------|
| hr | 45Kb/day | ~2Kb |
| min | 2Mb/day | ~2Kb |
| s | 150Mb/day | ~2Kb |


Key Transparency 2.0 is a new set of algorithms that removes this fundamental trade-off by sharing
the task of auditing between pairs of clients. As a result, all configurations have efficiently
auditable data structures that do not require the presence of third-party auditors.

We expect that many applications will want to prioritize lower latency and
reliability by using a *trust, then verify* mode whereby clients receive key
updates that they can later verify as being part of the global, consistent
state. However, this does not preclude individual clients that have a lower
tolerance for risk from waiting for full, globally consistent verification
before using fresh public keys from their peers.

| Update Latency | History Audit | Key Lookup |
|----------------|------------------|------------|
| hr | 18Kb O(1) | ~2Kb |
| min | 30Kb O(1) | ~2Kb |
| s | 40Kb O(1) | ~2Kb |


## Data Structures

### Gossip Network
The job of the gossip network is to ensure that there is a single, globally consistent lineage of log roots.
The job of the gossip network is to ensure that there is a single, globally consistent, lineage of log roots.
To keep ![n^2](https://render.githubusercontent.com/render/math?math=n%5E2) communication costs low, and to prevent Sybil attacks, clients use a small set of ~20 gossip nodes.

Each gossip node fetches the latest signed log root (SLRs) and verifies consistency with all previously seen roots.
Each gossip node fetches the latest signed log root (SLR) and verifies
consistency with all previously seen roots with a single consistency proof.
After verifying, the gossip node signs the log root.

Gossip nodes offer the following APIs:

1. Get Signed Root
1. Get Signed Root
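
The gossip node behaviour described above (fetch the latest SLR, check it against what was seen before, then sign) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `LogRoot`, `ConsistencyChecker`, `GossipNode`, and the byte encoding are all hypothetical names and choices, Ed25519 is an assumed signature scheme, and the consistency-proof verifier is left as a pluggable function. Because each accepted root is checked against the most recently accepted one, a single proof per update is enough to chain back to every earlier root.

```go
package main

import (
	"crypto/ed25519"
	"encoding/binary"
	"errors"
	"fmt"
)

// LogRoot is a hypothetical, simplified log root: a tree size and a root hash.
type LogRoot struct {
	Size uint64
	Hash []byte
}

// ConsistencyChecker is a hypothetical hook that verifies a consistency proof
// between an older and a newer root. A real node would plug in a Merkle
// consistency-proof verifier here.
type ConsistencyChecker func(older, newer LogRoot, proof [][]byte) error

// GossipNode remembers the latest root it has endorsed and co-signs new ones.
type GossipNode struct {
	pub    ed25519.PublicKey
	priv   ed25519.PrivateKey
	latest *LogRoot
	check  ConsistencyChecker
}

// NewGossipNode creates a node with a fresh Ed25519 key pair.
func NewGossipNode(check ConsistencyChecker) (*GossipNode, error) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		return nil, err
	}
	return &GossipNode{pub: pub, priv: priv, check: check}, nil
}

// encode is an assumed wire format: 8-byte big-endian size followed by the hash.
func encode(r LogRoot) []byte {
	buf := make([]byte, 8, 8+len(r.Hash))
	binary.BigEndian.PutUint64(buf, r.Size)
	return append(buf, r.Hash...)
}

// Observe verifies that next extends the last endorsed root, then signs it.
// The stored root is only replaced after verification succeeds.
func (g *GossipNode) Observe(next LogRoot, proof [][]byte) ([]byte, error) {
	if g.latest != nil {
		if next.Size < g.latest.Size {
			return nil, errors.New("log root is smaller than a previously seen root")
		}
		if err := g.check(*g.latest, next, proof); err != nil {
			return nil, fmt.Errorf("inconsistent log root: %w", err)
		}
	}
	g.latest = &next
	return ed25519.Sign(g.priv, encode(next)), nil
}

// Verify reports whether sig is this node's signature over r.
func (g *GossipNode) Verify(r LogRoot, sig []byte) bool {
	return ed25519.Verify(g.pub, encode(r), sig)
}

func main() {
	// acceptAll stands in for a real consistency-proof verifier.
	acceptAll := func(older, newer LogRoot, proof [][]byte) error { return nil }
	node, err := NewGossipNode(acceptAll)
	if err != nil {
		panic(err)
	}
	root := LogRoot{Size: 1, Hash: []byte("example-root-hash")}
	sig, err := node.Observe(root, nil)
	fmt.Println(err == nil, node.Verify(root, sig))
}
```

The "Get Signed Root" API would then simply return the latest root together with the signature produced by `Observe`.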

### Log of Map Roots

@@ -50,53 +57,53 @@ The log server contains a list of map roots representing sequential snapshots of
Log servers offer the following APIs:

1. Latest Signed Log Root
1. Get consistency proof between log size a and b.
1. Get consistency proof between log size a and b.
1. Get log item i with inclusion proof to log size a.

### Map Root Snapshots
### Map Root Snapshots
Each Map Root represents a snapshot of a key-value dictionary.
The map is implemented as a sparse merkle tree of fixed depth. This should be changed to a patricia-prefix-tri for better efficiency.
The map is implemented as a sparse Merkle tree of fixed depth.

* Indexes in the map are randomized and privacy protected via a Verifiable Random Function.
* Values in the map represent the full history of values that have ever been stored at this index. This is accomplished by storing the merkle root of a mini-log of these values.
* Values in the map represent the full history of values that have ever been stored at this index. This is accomplished by storing the Merkle root of a mini-log of these values.

The map offers the following APIs:

1. Get map value at index j with inclusion proof to snapshot a.

### Value History Log
These mini logs store not just the latest value, but also store every previous value in order.

These mini logs are what users query when looking up their own key history, and they are what their peers verify in order to ensure that no history has been lost.
They are what users query when looking up their own key history, and they are what their peers verify in order to ensure that no history has been lost.

The value log offers the following API:

1. Get latest value at snapshot z with inclusion proof.
2. Get consistency proof between snapshot roots y and z.
3. Get range of historical values between snaptots y and z.
3. Get range of historical values between snapshots y and z.
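
To make the idea of "the Merkle root of a mini-log" concrete, here is a small sketch that computes a root over an ordered list of values. It assumes the RFC 6962 tree-hashing rules (SHA-256, `0x00` leaf prefix, `0x01` interior prefix, split at the largest power of two below the leaf count); the document does not say which hashing scheme the mini-logs use, so treat that choice, and the names `MiniLogRoot`, `leafHash`, `nodeHash`, and `sameRoot`, as assumptions made for illustration.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// leafHash hashes a single stored value with the RFC 6962 leaf prefix.
func leafHash(v []byte) []byte {
	h := sha256.New()
	h.Write([]byte{0x00})
	h.Write(v)
	return h.Sum(nil)
}

// nodeHash combines two child hashes with the RFC 6962 interior-node prefix.
func nodeHash(left, right []byte) []byte {
	h := sha256.New()
	h.Write([]byte{0x01})
	h.Write(left)
	h.Write(right)
	return h.Sum(nil)
}

// MiniLogRoot returns the Merkle tree hash over values, in append order.
func MiniLogRoot(values [][]byte) []byte {
	switch n := len(values); n {
	case 0:
		sum := sha256.Sum256(nil) // hash of the empty string
		return sum[:]
	case 1:
		return leafHash(values[0])
	default:
		// k is the largest power of two strictly less than n.
		k := 1
		for k*2 < n {
			k *= 2
		}
		return nodeHash(MiniLogRoot(values[:k]), MiniLogRoot(values[k:]))
	}
}

// sameRoot compares two root hashes.
func sameRoot(a, b []byte) bool { return bytes.Equal(a, b) }

func main() {
	history := [][]byte{[]byte("key-v1"), []byte("key-v2"), []byte("key-v3")}
	// Appending a value changes the root, so each map snapshot commits to
	// the full history stored at that index.
	fmt.Printf("root after 2 values: %x\n", MiniLogRoot(history[:2]))
	fmt.Printf("root after 3 values: %x\n", MiniLogRoot(history))
}
```

Under this assumption, the value stored in the map at a given index would be the output of `MiniLogRoot` for that user's history at the snapshot in question.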

## Client Verification
## Client Verification

Key Transparency clients store
Key Transparency clients store:
1. The root of the log of map roots. This ensures that the client is using the same snapshots as the rest of the world.
1. The root of every mini-log they have queried. This ensures that snapshots represent append-only representations of the world.
1. The root of every (proven consistent) mini-log they have queried. This ensures that snapshots represent append-only representations of the world.

When querying, Key Transparency clients ask for proof that the current snapshot and value are consistent with previous values that the client has observed.
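
The client state described above might look something like the following sketch. It is illustrative only: `Root`, `ProofVerifier`, and `ClientState` are hypothetical names, the proof check is a pluggable function rather than a real Merkle verifier, and accepting the first root seen for an index without a proof is an assumption of this sketch rather than something the design specifies.

```go
package main

import (
	"errors"
	"fmt"
)

// Root is a hypothetical (size, hash) pair identifying a log or mini-log state.
type Root struct {
	Size uint64
	Hash string
}

// ProofVerifier is a hypothetical hook that checks a consistency proof
// showing that newer is an append-only extension of older.
type ProofVerifier func(older, newer Root) error

// ClientState holds the roots a client has verified so far.
type ClientState struct {
	miniLogs map[string]Root // keyed by map index
	verify   ProofVerifier
}

// NewClientState returns an empty state that uses verify for proof checks.
func NewClientState(verify ProofVerifier) *ClientState {
	return &ClientState{miniLogs: make(map[string]Root), verify: verify}
}

// UpdateMiniLog stores next for index only if it is consistent with the root
// previously stored for that index. The first root seen is accepted as-is.
func (c *ClientState) UpdateMiniLog(index string, next Root) error {
	if prev, seen := c.miniLogs[index]; seen {
		if next.Size < prev.Size {
			return errors.New("mini-log shrank: history may have been dropped")
		}
		if err := c.verify(prev, next); err != nil {
			return fmt.Errorf("mini-log not consistent with stored root: %w", err)
		}
	}
	c.miniLogs[index] = next
	return nil
}

// MiniLog returns the stored root for index, if any.
func (c *ClientState) MiniLog(index string) (Root, bool) {
	r, ok := c.miniLogs[index]
	return r, ok
}

func main() {
	// acceptAll stands in for a real consistency-proof verifier.
	acceptAll := func(older, newer Root) error { return nil }
	client := NewClientState(acceptAll)

	fmt.Println(client.UpdateMiniLog("alice", Root{Size: 2, Hash: "r2"}))
	fmt.Println(client.UpdateMiniLog("alice", Root{Size: 3, Hash: "r3"}))
	// A smaller mini-log is rejected and the stored root is left untouched.
	fmt.Println(client.UpdateMiniLog("alice", Root{Size: 1, Hash: "r1"}))
}
```

The root of the log of map roots could be tracked the same way, with a single stored `Root` instead of a map.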

## Efficiency Improvements
## Efficiency Innovations

Generating append-only proofs for large sparse merkle trees is not efficient. For N changes per snapshot in a map of size M over T snapshots is roughly O(T * N log M)
1. Instead of verifying that the entire map is an append-only operation from previous values, we isolate the work of verification to individual sub-trees. O (T * 1 log M)
1. Rather than verifying every single snapshot, we only verify the snapshots that the sender and receiver used. O(1 * 1 log M)
1. But the snapshots that the sender and reciever used are unknown, so we use a meet-in-the-middle algorithm to sample log T of them. O( log T * log M)
Generating append-only proofs for large sparse Merkle trees is not efficient.
With `N` changes per snapshot in a map of size `M`, an append-only proof spanning `T` snapshots contains roughly `O(T * N log M)` hashes.
1. Instead of verifying that the entire map is an append-only operation from previous values, we isolate the work of verification to individual sub-trees. `O(T * 1 log M)`
1. Rather than verifying every single snapshot, we only verify the snapshots that the sender and receiver used. `O(1 * 1 log M)`
1. But the snapshots that the sender and receiver used are unknown, so we use a meet-in-the-middle algorithm to sample log T of them. `O(log T * log M)`
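
To get a feel for how much each step shrinks the proof, the sketch below plugs example numbers into the four bounds. The values of `T`, `N`, and `M` are made up for illustration, constant factors are ignored, logarithms are taken base 2, and `ProofCosts` and `EstimateProofCosts` are hypothetical names; the results are order-of-magnitude hash counts, not measurements.

```go
package main

import (
	"fmt"
	"math"
)

// ProofCosts holds rough hash counts for each verification strategy.
type ProofCosts struct {
	FullMap       float64 // O(T * N log M): append-only proof over the whole map
	PerSubtree    float64 // O(T * 1 log M): one sub-tree, every snapshot
	UsedSnapshots float64 // O(1 * 1 log M): only the snapshots actually used
	MeetInMiddle  float64 // O(log T * log M): sample log T snapshots
}

// EstimateProofCosts evaluates the bounds for t snapshots, n changes per
// snapshot, and a map of size m.
func EstimateProofCosts(t, n, m float64) ProofCosts {
	logM := math.Log2(m)
	return ProofCosts{
		FullMap:       t * n * logM,
		PerSubtree:    t * logM,
		UsedSnapshots: logM,
		MeetInMiddle:  math.Log2(t) * logM,
	}
}

func main() {
	// Example: 1024 snapshots, 1024 changes each, a map of 2^30 entries.
	c := EstimateProofCosts(1024, 1024, 1<<30)
	fmt.Printf("full map:        %.0f hashes\n", c.FullMap)
	fmt.Printf("per sub-tree:    %.0f hashes\n", c.PerSubtree)
	fmt.Printf("used snapshots:  %.0f hashes\n", c.UsedSnapshots)
	fmt.Printf("meet-in-middle:  %.0f hashes\n", c.MeetInMiddle)
}
```

With these example inputs the estimates are 31,457,280, then 30,720, then 30, then 300 hashes, which shows that the meet-in-the-middle step costs a factor of log T over the ideal case while staying far below the per-snapshot approaches.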


# Work Plan

1. Can Trillian Logs be used as mini logs?
1. Can we get fast sequencing without master election?
1. Can we cap the maximum number of elements? Do we need to?
1. Store updates to mini-logs
1. Write a batching algorithm to accumulate updates accross many mini-logs.
1. Migrate to using Trillian Logs as mini-logs for user updates.
1. Switch from `int64` treeIDs to `[]byte` to support billions of users.
1. Sequence and sign mini-logs synchronously rather than relying on a separate process.
1. Write a batching algorithm to accumulate updates across many mini-logs.
1. Write new mini-log roots to the map instead of the current, direct value approach.
1. Update client verification algorithms.
