Memory consumption explodes with large number of nodes. 40MB of data becomes ~16GB in MST #1683
Comments
Can't seem to really make a codesandbox for this, because codesandbox stops functioning around 10k nodes |
@techknowfile Does the problem show at 10k nodes, or only when you get into higher numbers? Maybe it's enough to demonstrate the issue with fewer nodes? |
@stewartlord CodeSandbox failed before I could show anything meaningful, so I made a minimal reproduction in a github repo, instead. https://github.com/techknowfile/mobx-vs-mst (requires using Chrome as browser) I decided to try using MobX instead of Mobx-State-Tree, but first needed to confirm that I wasn't going to have the same issue as MST. I noted that: Moreover, I can add about 2.9 MILLION MobX objects before crashing the browser, and can add them very quickly. I was honestly surprised to see that even MobX has a 10x memory footprint over plain javascript. |
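A minimal sketch of how a per-object footprint like the one described above could be estimated. It assumes Node.js rather than Chrome (the linked repo is browser-based), and `estimateBytesPerItem` is a hypothetical helper, not something taken from the repo. The garbage collector may run during the loop, so the result is only a rough estimate; running Node with `--expose-gc` and calling `global.gc()` before each reading should give steadier numbers.

```javascript
// Hypothetical helper (not from the linked repo): rough heap cost per created item.
function estimateBytesPerItem(createItem, count) {
  const before = process.memoryUsage().heapUsed;
  const items = new Array(count);
  for (let i = 0; i < count; i++) {
    items[i] = createItem(i);
  }
  const after = process.memoryUsage().heapUsed;
  // `items` is returned so the caller can keep the objects alive for further checks.
  return { items, bytesPerItem: (after - before) / count };
}

// Baseline: plain JavaScript objects shaped like the thread's { id, value } items.
const plain = estimateBytesPerItem((i) => ({ id: String(i), value: i }), 100000);
console.log(`plain objects: ~${plain.bytesPerItem.toFixed(1)} bytes each`);
```

To compare libraries, the same helper could be called with a factory that builds a MobX observable or an MST node instead of a plain object.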
Thanks for sharing your numbers @techknowfile! I don't know enough about MobX or MST internals to speculate as to why, but it's helpful just to know the cost of having many entries and that MobX is more efficient at this than MST. Maybe one of the maintainers knows where it is being spent? |
@techknowfile very interesting. About the mobx-state-tree part, I believe you misunderstand the usage of create. https://github.com/techknowfile/mobx-vs-mst/blob/main/src/mst-models/RootStore.js#L44 addMetricItem(json) {
//let item = MetricItem.create({ id: json.id, value: json.value});
self.metricItems.set(json.id, { id: json.id, value: json.value});
}, You can try this. create is only used to create the root model; if you call it for every item, it will not take advantage of MST's structure sharing. |
@HaveF is this documented anywhere? Can you elaborate on the ‘share structure’? |
@HaveF Thank you, but that doesn't change anything in terms of this issue specifically (I was only using that in the sandbox to reproduce, but in my actual application mst-gql is handling the creation) |
@stewartlord I forget where I saw it. I remember seeing similar wording. @techknowfile Maybe you can use a function like getSnapshot? |
@HaveF I can't use mst-gql, as it will automatically load the data into the mst and therefore explode the memory |
I just looked at the mst-gql docs. Maybe you need to set the right
I don't think this is true. You can see that mst-gql's merge function calls .create() for every model instance |
Besides your 10k nodes repo, do you have any other I don't know how mst-gql's merge works, but it seems a method which is only used in mst-gql with other graphql clients. |
I experience the same problem, in fact, I even prepared a small codesandbox example https://p968x.csb.app/. |
@weglov I'm currently having to use a hacky solution, where I'm using both pure MobX with a structure that imitates mobx-state-tree and mobx-state-tree itself. Using MobX for the objects that I have 100,000's of, and mst/mst-gql for other parts. I replicated the mst-gql merge function in my MobX root store to be able to parse json responses into the normalized store, and then resolve references that go from MobX objects to mobx-state-tree nodes. I find that even the 10x increase in footprint when loading into mobx is also a bit excessive, and has me wondering if I should ditch these libraries entirely and roll my own... but I love how they work. And the more I dig into the internals, the more impressed I become. |
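As a rough illustration of the kind of normalizing merge described above, here is a sketch in plain JavaScript. It is not mst-gql's implementation: `createStore`, `merge`, and the `byType` layout are hypothetical names, and the sketch assumes every entity in the JSON carries `__typename` and `id` fields.

```javascript
// Hypothetical normalized store: one Map of id -> entity per type name.
function createStore() {
  return { byType: new Map() };
}

// Recursively walks a JSON response, stores each entity once per (type, id),
// and returns the canonical instance so nested references point at shared objects.
function merge(store, data) {
  if (Array.isArray(data)) return data.map((entry) => merge(store, entry));
  if (data === null || typeof data !== "object") return data;

  // Merge children first so nested entities are normalized too.
  const merged = {};
  for (const [key, value] of Object.entries(data)) {
    merged[key] = merge(store, value);
  }

  const { __typename: typename, id } = merged;
  if (typename === undefined || id === undefined) return merged; // not an entity

  if (!store.byType.has(typename)) store.byType.set(typename, new Map());
  const bucket = store.byType.get(typename);
  const existing = bucket.get(id);
  if (existing) {
    Object.assign(existing, merged); // update in place, keeping object identity
    return existing;
  }
  bucket.set(id, merged);
  return merged;
}
```

Merging the same `(type, id)` twice updates the stored object instead of creating a second one, which is the property a normalized root store relies on.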
There is a problem with memory consumption in MST. I did some research in the past.
Making all the refactorings mentioned above will also lead to great It's hard to judge without measurements, but I assume that we could reduce the MST memory footprint to about x1.2-x1.5 of pure MobX, which seems like a reasonable tradeoff for the benefits. Unfortunately, the amount of work to be done is quite large, tightly coupled, and spread over ~60-80% of the codebase. There is also a risk that the performance gains will not be as positive as assumed. Additionally, even if everything goes as planned, there are several breaking changes.
I'd prefer 3.a (it has a drawback of poor node "exports" support by tooling atm) or 3.b - but it'll lead to a significant package size increase for the period when the new and old code coexist (which is not a concern for tree-shaking-aware developers). And can also cause some misunderstanding about Would be great to hear @jamonholmgren thoughts. |
Thank you so much for the thorough response, @k-g-a! When I realized the memory issue initially, I first created a hacky class-based solution for MobX that replicated the normalized RootStore structure of mst-gql for the object types that I had a lot of (hundreds of thousands), and then hackily combined my mst root store to my MobXRootStore. This was done primarily to confirm that pure MobX would solve our problem. As stated above, it did of course resolve the memory issue. However, I also need to run my MobX data structure on the backend, and needed to get rid of this MST + MobX Frankenstein. Really liking the way that MST built "types" instead of classes (in part because of some issues I encountered with class inheritance), I then created an oversimplified javascript implementation of mst's ComplexType builder (don't know what you actually call this design pattern), where I'm then leveraging the mst-gql-inspired .merge() function to hydrate my MobX Root Store. So, sans-TS, sans-lifecycle, sans-identifier cache, sans-snapshot support... just instances of ComplexType containing {properties, initializers} that are used to define an "ObjectNode"... the memory explosion issue is back! And it has me thinking that maybe one of the major problems with MST has been somehow replicated into my code, which is a tiny subset of what MST is. I'd love to upload a git repo example or codesandbox with my implementation if you'd be interested in glancing through it to see where the memory could possibly be coming from (I am using the "self" closure solution to avoid binding "this"...). I also have a ton of questions regarding how MST and MobX work that I've encountered while trying to reverse engineer it, and would be very thankful to speak to one of the few people that seem to have a good grasp on the internals and underlying theory. 
I'd be interested in trying to go the self-less route (in the off chance that the memory is actually worse than 20%), though frankly I didn't realize that .prototype even existed for per-type object nodes... though my understanding of javascript isn't super strong |
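For readers unfamiliar with the distinction between the "self" closure approach and prototype-based methods mentioned above, here is a small sketch in plain JavaScript. It only illustrates general language behaviour; the function names are hypothetical, and it makes no claim about how much of MST's footprint this accounts for.

```javascript
// Style 1: methods are created inside a closure over `self`, once per instance.
function createCounterWithClosure(initial) {
  const self = { count: initial };
  self.increment = () => {
    self.count += 1; // no `this` binding needed, but every instance owns a new function
  };
  return self;
}

// Style 2: methods live on one shared prototype object and rely on `this`.
const counterPrototype = {
  increment() {
    this.count += 1;
  },
};

function createCounterWithPrototype(initial) {
  const node = Object.create(counterPrototype);
  node.count = initial;
  return node;
}

const a1 = createCounterWithClosure(0);
const a2 = createCounterWithClosure(0);
const b1 = createCounterWithPrototype(0);
const b2 = createCounterWithPrototype(0);

console.log(a1.increment === a2.increment); // false: one function object per instance
console.log(b1.increment === b2.increment); // true: a single shared function
```

With hundreds of thousands of instances and several actions or views each, the first style allocates a function object per method per instance, while the second allocates each method once per type.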
Yep, git repo example would be great to look at. |
Repo: https://github.com/techknowfile/mobx-cube-example Here's a simple example using my MobX store implementation. If you You'll notice that my implementation of props is naive and a little broken. I'm using a props initializer which first creates empty props and makes them observable, and then as part of the ObjectNode constructor the "snapshot" is parsed into the properties. The whole thing is a little silly rn, but I couldn't find where mobx-state-tree was setting the properties to the ObjectNode instances. Anyway, instead of the data being loadable via store.create(SNAPSHOT), all data is loaded into the RootStore using store.merge(json). Edit: I also added a "Bar" type that doesn't have any actions or views. The memory footprint is smaller, but still much larger than pure mobx. |
@techknowfile , thanks for the repo, it's a great starting point for the investigation! |
@k-g-a Done! That repo now has very simple models implemented with (1) my mst-like solution, and (2) a class-based mobx solution. If you add views/actions to them, you'll notice the memory delta becomes even more significant. Another thing to note is how very fast it is to add models to the class-based store, and how slow it is to add to the mst-like store (exactly the same issues we faced when comparing mst itself to this class-based solution in the previously posted repo). |
It sounds like the best option. For MST users, migration looks like change default
|
Is there some code that I can adapt in user land? Currently creating thousands of nodes, e.g. imagine copying and pasting a spreadsheet, and thinking of performant solutions for this since I keep all nodes in a map with normalized ids. |
Hey folks - I don't have enough context here to help move this issue forward, but I am going to tag it, since there's obviously desire and room for some major improvements, but the path forward looks somewhat challenging. I don't expect we'll get around to this any time soon, but I do hope we can find a satisfactory path forward in the coming months/year. |
Hey! I encountered the same issue where I need to combine |
Bug report
Memory consumption explodes with large number of nodes in MST.
2MB of data -> ~600MB in MST
40MB of data -> ~16GB in MST
Sandbox link or minimal reproduction code
Github Repo
CodeSandbox fails long before being able to show anything meaningful. Clone repo > yarn install > yarn start
Describe the expected behavior
I'm not sure what the expected overhead of mobx-state-tree is. I've found several posts that mention that mobx-state-tree was never intended to handle hundreds of thousands of nodes, and also see that a couple developers have improved upon the performance for their own use cases. Even still, I'm surprised that 40MB of data could possibly take up 16GB when processed into MST.
Describe the observed behavior
MST able to scale to hold hundreds of thousands of nodes without what appears like a memory leak
I'll continue to update this bug report with a reproduction of the issue over the next few days, but wanted to get this posted to get the discussion started.