Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

feat: implement a dirty flag #74

Merged
merged 1 commit into from
Nov 27, 2020
Merged

feat: implement a dirty flag #74

merged 1 commit into from
Nov 27, 2020

Conversation

Stebalien
Copy link
Member

This patch introduces a dirty flag and uses it to avoid unnecessary writes.

  1. It avoids writes when creating new nodes. Instead, it just caches the new nodes and marks them dirty.
  2. It only flushes modified nodes instead of all cached nodes.
  3. Flush no longer clears the cache, just the dirty flags.

This patch introduces a dirty flag and uses it to avoid unnecessary writes.

1. It avoids writes when creating new nodes. Instead, it just caches the new
   nodes and marks them dirty.
2. It only flushes modified nodes instead of all cached nodes.
3. Flush no longer clears the cache, just the dirty flags.
@Stebalien
Copy link
Member Author

So, I was planning on rewriting this the same way I did the AMT, but that wasn't really necessary in this case. Simply fixing the unnecessary writes issue turned out to be much simpler.

This appears to reduce writes by at least 40% in the "fill" benchmarks.

Note: like with AMT v2, the root node will always be written on flush. Otherwise, we only write modified nodes.

Copy link
Contributor

@ZenGround0 ZenGround0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
One question, what is the motivation for no longer clearing caches from memory during flush?
--edit--
I now see the request for this in #72. It looks like it's a memory / time trade off and we care more about time than memory? In the filecoin state machine we don't do much with hamts after flushing so the choice doesn't seem strongly motivated for this use. Is there a motivation I'm missing?

@ZenGround0
Copy link
Contributor

ZenGround0 commented Nov 25, 2020

Oh also this is going to change gas accounting so we either 1) wait for me to release v3 in the near future before merging, or 2) go ahead and update this to v3 before merge yourself.

#76 is now merged so this can be rebased and merged.

@austinabell
Copy link

austinabell commented Nov 26, 2020

LGTM
One question, what is the motivation for no longer clearing caches from memory during flush?
--edit--
I now see the request for this in #72. It looks like it's a memory / time trade off and we care more about time than memory? In the filecoin state machine we don't do much with hamts after flushing so the choice doesn't seem strongly motivated for this use. Is there a motivation I'm missing?

Well, the memory would be dropped right after the structure is dropped (or whenever it's GCed) so I don't really see the difference in this case. It's not like the flushed hamt is kept in memory for any period of time really, and keeping a pointer to the node until it is doesn't seem like it incurs any cost, at the benefit of more generally usable and reusable implementation. Also, the state tree in between messages could reuse cached nodes, instead of reloading them a bunch of times, this seems to be to be the biggest benefit of this.

Also it would be a bit weird if read caches are kept, where dirty caches are dropped. That seems like unexpected behaviour to me

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants