
Reduce memory usage when parsing JSON #15

Closed
evestera opened this issue Nov 11, 2019 · 1 comment

Comments


evestera commented Nov 11, 2019

Currently, JSON parsing is done with e.g. serde_json::de::from_reader(File::open(path)?) into a serde_json::Value. This is fine for small and medium-sized JSON documents, but it performs far too many allocations to be efficient once the JSON grows beyond a few hundred MB (not a fault of serde, just the reality of parsing unknown JSON into memory). I see at least two alternatives here:

  • Rewrite the inference code to work with streaming JSON data, rather than parsing the entire structure into memory at once. (Proof of concept written. Does the trick. Just a big refactoring to do in the actual crate.)
  • Replace the use of serde_json::Value with a mostly compatible type that uses a string pool, since repeated object keys account for a very large share of the memory used.
@evestera (Owner, Author)

Fixed by f0be58e
