Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

EPIC: Node & key format #548

Closed
3 tasks done
tac0turtle opened this issue Sep 7, 2022 · 8 comments
Closed
3 tasks done

EPIC: Node & key format #548

tac0turtle opened this issue Sep 7, 2022 · 8 comments
Assignees
Labels

Comments

@tac0turtle
Copy link
Member

tac0turtle commented Sep 7, 2022

Currently IAVL does not take advantage of data locality on disk. The way IAVL store nodes on disk is with a hash that gets stored in a random location on disk and makes it so that IAVL needs to do a random search of disk in order to find the data. With the recent work from osmosis on fast node we saw a large performance improvement with a trade off on writes.

Secondly, there are two key formats, live keys and orphan keys referenced by node hash which also indexes the orphan nodes.

A next step would be to modify IAVL to structure data on disk in a logical manner to reduce random searches and reduce random compaction of disk data.

There are two things that can be done in order to take advantage of this.

Work Scope:

Phase 1:

  • modify the key format from a hash of the data to be write_version|node_path_in_tree.

Phase 2:

  • make every node store the version of its children.
  • remove orphan indexing and hashes

(Proposed by @ValarDragon)

@tac0turtle tac0turtle changed the title Node & key format EPIC: Node & key format Sep 7, 2022
@ValarDragon
Copy link
Contributor

ValarDragon commented Sep 7, 2022

Other steps:

  • To modify the key format, you have to update the node struct to add left_child_version and right_child_version in the nodes
  • Probably delete the root key format, and instead just make it a method that returns a key in your main node format space

Can be done later, but needed pre-integration for pruning:

  • Clarify versioning / historic delete API semantics from SDK
  • Make methods to prune (theres a couple approaches, if someones having trouble deriving them, can write up more)

@faddat
Copy link
Contributor

faddat commented Sep 9, 2022

modify the key format from a hash of the data to be write_version|node_path_in_tree.

where write_version is the left side and node_path_in_tree is the right side?

Or is it like, both sides have that?

@faddat faddat mentioned this issue Sep 9, 2022
@robert-zaremba
Copy link
Collaborator

NOTE: store/v2 solves that in a way similar to TurboGeth (erigon)

@robert-zaremba
Copy link
Collaborator

So this proposal will increase a node (and DB) size, right?

Few important questions

  • What is the definition of write_version?
  • Will that impact state proofs / merkle proofs? This change must be compatible with the existing merkle proof, otherwise we have trouble with IBC. My guess is that it won't, but want to double check.

@tac0turtle
Copy link
Member Author

NOTE: store/v2 solves that in a way similar to TurboGeth (erigon)

thank you, but as mentioned in a few places store/v2 work is unknown at this time for countless reasons.

@cool-develope
Copy link
Collaborator

cool-develope commented Sep 27, 2022

@marbar3778
It looks like we couldn't modify the key format as write_version|node_path_in_tree. Because when rotating the tree, the path would be changed.

@tac0turtle
Copy link
Member Author

this is expected, do you see a problem with this?

cc @ValarDragon

@catShaark
Copy link
Contributor

@marbar3778, I'm also working on this, can you assign me as well

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants
@robert-zaremba @ValarDragon @faddat @tac0turtle @catShaark @cool-develope and others