
Reduce memory usage when parsing JSON #15

Closed
evestera opened this issue Nov 11, 2019 · 1 comment

Comments


evestera commented Nov 11, 2019

Currently, JSON parsing is done with e.g. serde_json::de::from_reader(File::open(path)?) into a serde_json::Value. This is fine for small and medium-sized JSON documents, but it performs far too many allocations to be efficient once the JSON grows beyond a few hundred MB (not a fault of serde, just the reality of parsing unknown JSON into memory). I see at least two alternatives here:

  • Rewrite the inference code to work with streaming JSON data, rather than parsing the entire structure into memory at once. (Proof of concept written. Does the trick. Just a big refactoring to do in the actual crate.)
  • Replace the use of serde_json::Value with a mostly compatible type that uses a string pool, since repeated object keys account for a very large share of the memory used.
@evestera (Owner, Author)

Fixed by f0be58e
