Consider using succinct data structures for read-only trees #107
Comments
Rowan is such nerd catnip for me. https://doi.org/10.1145/2601073
The forest operations seem especially relevant to rust-analyzer/rowan use cases, since the main mutation use case is grafting subtrees. It's definitely worth looking into, and I'm going to keep thinking about this until I experiment some with it. I sense a sorbus 0.2 coming some time in the future 😅

An interesting constraint that Rowan has over pure academic trees is the actual data storage on each node (typically in a side table for succinct data structures, AIUI). Both interior nodes (for length offsets of children, for sublinear position search) and terminal nodes (for the actual text) want to themselves be dynamically sized, so size analysis is obviously more complicated, along with node deduplication.
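To make the side-table idea concrete, here is a minimal sketch of a balanced-parentheses tree, one common succinct encoding that spends two bits per node. Everything in it (`BpTree`, `Node`, `example_tree`, the method names) is hypothetical and not part of rowan or sorbus. The bits live in a plain `Vec<bool>` and every query is a linear scan; a real implementation would pack the bits into words and add rank/select and matching-parenthesis indexes to make the queries fast.

```rust
/// Hypothetical input shape used only to build the example.
struct Node<T> {
    payload: T,
    children: Vec<Node<T>>,
}

/// Hypothetical read-only tree stored as balanced parentheses.
/// `true` is an opening parenthesis, `false` a closing one.
/// A node is identified by the position of its opening parenthesis.
struct BpTree<T> {
    bits: Vec<bool>,  // 2 entries per node; a real version would bit-pack these
    payloads: Vec<T>, // side table, indexed by preorder rank
}

impl<T> BpTree<T> {
    fn build(root: Node<T>) -> Self {
        let mut tree = BpTree { bits: Vec::new(), payloads: Vec::new() };
        tree.push(root);
        tree
    }

    fn push(&mut self, node: Node<T>) {
        self.bits.push(true);
        self.payloads.push(node.payload);
        for child in node.children {
            self.push(child);
        }
        self.bits.push(false);
    }

    /// Number of opening parentheses strictly before `pos` (naive O(n) rank).
    fn rank(&self, pos: usize) -> usize {
        self.bits[..pos].iter().filter(|&&b| b).count()
    }

    /// Looks up the payload of the node at `pos` through the side table.
    fn payload(&self, pos: usize) -> &T {
        &self.payloads[self.rank(pos)]
    }

    /// Position of the closing parenthesis matching the opening one at `pos`.
    fn find_close(&self, pos: usize) -> usize {
        assert!(self.bits[pos], "pos must point at an opening parenthesis");
        let mut depth = 0usize;
        for (i, &b) in self.bits.iter().enumerate().skip(pos) {
            if b { depth += 1 } else { depth -= 1 }
            if depth == 0 {
                return i;
            }
        }
        unreachable!("unbalanced parentheses")
    }

    fn first_child(&self, pos: usize) -> Option<usize> {
        if self.bits[pos + 1] { Some(pos + 1) } else { None }
    }

    fn next_sibling(&self, pos: usize) -> Option<usize> {
        let next = self.find_close(pos) + 1;
        if next < self.bits.len() && self.bits[next] { Some(next) } else { None }
    }

    /// Nearest enclosing opening parenthesis to the left of `pos`.
    fn parent(&self, pos: usize) -> Option<usize> {
        let mut depth = 0usize;
        for i in (0..pos).rev() {
            if !self.bits[i] {
                depth += 1; // skip over a complete earlier subtree
            } else if depth == 0 {
                return Some(i);
            } else {
                depth -= 1;
            }
        }
        None
    }
}

/// Builds `file(fn(name, body), struct)`, encoded as `((()())())`.
fn example_tree() -> BpTree<&'static str> {
    let leaf = |payload| Node { payload, children: Vec::new() };
    BpTree::build(Node {
        payload: "file",
        children: vec![
            Node { payload: "fn", children: vec![leaf("name"), leaf("body")] },
            leaf("struct"),
        ],
    })
}
```

Note that the payloads here are fixed-size `T` values; the dynamically sized interior and terminal nodes described above would need an extra level of indirection or an offset table, which this sketch does not attempt.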
One property of Rowan that I'm fairly sure succinct trees would lose is the independence of nodes. Currently, if you have [...]

Is this restriction problematic for what rust-analyzer wants? Node payloads would still be able to be cached and deduplicated separately from the trees themselves.

If we're okay giving up partial sub-ownership of a green tree, we can reduce memory usage even without a succinct tree by using indices into an arena rather than pointers (as well as unlocking parent pointers in the green tree, if we want those).
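As a rough illustration of the arena alternative, the sketch below stores nodes in a single `Vec` and links them with 32-bit indices, which also makes a parent link cheap to keep. The names (`NodeId`, `NodeData`, `Arena`) and the fields are assumptions made up for this example and do not reflect rowan's actual green tree layout.

```rust
/// Hypothetical index type: 4 bytes, versus an 8-byte pointer on 64-bit targets.
/// A real version could wrap `NonZeroU32` so `Option<NodeId>` stays small too.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct NodeId(u32);

/// Hypothetical per-node record; `kind` and `text_len` stand in for real payloads.
struct NodeData {
    kind: u16,
    text_len: u32,
    parent: Option<NodeId>,
    first_child: Option<NodeId>,
    next_sibling: Option<NodeId>,
}

/// All nodes of one tree live in one allocation and are freed together,
/// which is exactly the loss of per-subtree ownership discussed above.
struct Arena {
    nodes: Vec<NodeData>,
}

impl Arena {
    fn new() -> Self {
        Arena { nodes: Vec::new() }
    }

    fn alloc(&mut self, kind: u16, text_len: u32) -> NodeId {
        let id = NodeId(self.nodes.len() as u32);
        self.nodes.push(NodeData {
            kind,
            text_len,
            parent: None,
            first_child: None,
            next_sibling: None,
        });
        id
    }

    fn append_child(&mut self, parent: NodeId, child: NodeId) {
        self.nodes[child.0 as usize].parent = Some(parent);
        let first = self.nodes[parent.0 as usize].first_child;
        match first {
            None => self.nodes[parent.0 as usize].first_child = Some(child),
            Some(mut last) => {
                // Walk the sibling chain to its end, then link the new child.
                while let Some(next) = self.nodes[last.0 as usize].next_sibling {
                    last = next;
                }
                self.nodes[last.0 as usize].next_sibling = Some(child);
            }
        }
    }

    fn parent(&self, id: NodeId) -> Option<NodeId> {
        self.nodes[id.0 as usize].parent
    }

    fn children(&self, parent: NodeId) -> impl Iterator<Item = NodeId> + '_ {
        std::iter::successors(self.nodes[parent.0 as usize].first_child, move |id| {
            self.nodes[id.0 as usize].next_sibling
        })
    }
}
```

The trade-off is that a `NodeId` is only meaningful together with its `Arena`, so a subtree can no longer be held or shared on its own.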
I've recently been reading the succinct data structures (SD) literature, and I am wondering if it might be applicable to rowan. The tagline of SD is to represent data using approximately as few bits as entropy allows, while keeping all the operations fast.
In particular, for trees, we generally use `2 * n * sizeof<usize>` bytes (parent/children pointers). SD allows using roughly `2n` bits, while still allowing for efficient parent/child queries. This works due to the following tricks:

It would be interesting to see if it makes sense to use something like this for `rowan`.

Some reasons to do this:
Some reasons not to do this: