
Store raw image data as LZ4 compressed blobs #69

Merged · 2 commits from compress-image-data into Inlyne-Project:main · Apr 7, 2023

Conversation

@CosmicHorrorDev (Collaborator)

Resolves #45

(Well not quite, but it's a considerable improvement for now)

This PR changes the raw image data to be stored as LZ4 Frame Format blobs. LZ4 because it has very fast compression/decompression speeds, and the frame format because lz4_flex's block format stores the output in a Vec allocated for the maximum possible output size, which leads to a pretty large memory spike. Granted, Linux (and Mac?) typically doesn't allocate zeroed pages until they're actually used for something, but I figured the lower memory spike was worth the extra code complexity to handle streaming.
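For anyone unfamiliar with the frame API, here is roughly what the round-trip looks like (a minimal sketch assuming lz4_flex's frame module; the function names are illustrative, not the PR's actual code):

```rust
use std::io::{self, Read, Write};

use lz4_flex::frame::{FrameDecoder, FrameEncoder};

/// Compress raw RGBA8 bytes into an LZ4 frame blob.
fn compress_image(raw: &[u8]) -> io::Result<Vec<u8>> {
    let mut enc = FrameEncoder::new(Vec::new());
    enc.write_all(raw)?;
    // `finish` flushes any buffered block and writes the frame end mark,
    // returning the inner Vec<u8>.
    enc.finish()
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))
}

/// Stream the blob back out into raw bytes right before rendering.
fn decompress_image(blob: &[u8]) -> io::Result<Vec<u8>> {
    let mut raw = Vec::new();
    FrameDecoder::new(blob).read_to_end(&mut raw)?;
    Ok(raw)
}
```

Because FrameEncoder and FrameDecoder implement Write and Read, the data streams through block-by-block instead of being materialized into a single worst-case-sized buffer.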

Demo

[video: demo.mp4]

Profiles

main

[screenshot: memory profile on main]

This PR

[screenshot: memory profile with this PR]

Analysis

Above you can see that with this PR, scrolling over all of bun's README typically sits at a steady ~20 MiB of RAM with two spikes: one towards the beginning from loading the ~55 MiB of RGBA8 data before compressing it into an LZ4 blob, and a second towards the end when that blob is decompressed to get rendered.

The peak memory usage is a bit lower now since the raw image data is generally loaded and dropped at different times. The difference would be much larger on a file whose images are more uniformly sized, whereas bun's README has a single outlier.

@trimental (Collaborator)

Hello once again, sorry for the delay in checking this out. Brilliant work: everything looks smooth on my computer, and the compression brings great memory savings 🥳. Just two little nitpicks.

First, I think it would make more sense to log only the time for decompression, and to print both the time & data sizes when we compress. Compression takes longer, so it's the more important metric to log.
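Something along these lines, for instance (a sketch assuming the `log` crate, reusing the hypothetical `compress_image` helper from the description above):

```rust
use std::io;
use std::time::Instant;

// Hypothetical wrapper around the compression step that logs the
// duration and the before/after sizes.
fn compress_with_logging(raw_image: &[u8]) -> io::Result<Vec<u8>> {
    let start = Instant::now();
    let blob = compress_image(raw_image)?;
    log::debug!(
        "compressed image: {} B -> {} B in {:?}",
        raw_image.len(),
        blob.len(),
        start.elapsed()
    );
    Ok(blob)
}
```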

Secondly, I think we should preallocate the vector the same way that the lz4_flex crate does. They use a private module function :(. But it's such a simple function that we can just inlyne it into our code. Might save a few vector reallocations on large files.

```rust
let mut lz4_enc = FrameEncoder::with_frame_info(
    frame_info,
    Vec::with_capacity(16 + 4 + (image.len() as f64 * 1.1) as usize),
);
```

@CosmicHorrorDev (Collaborator, Author)

> First, I think it would make more sense to log only the time for decompression, and to print both the time & data sizes when we compress. Compression takes longer, so it's the more important metric to log.

That sounds good!

> Secondly, I think we should preallocate the vector the same way that the lz4_flex crate does.

That's the same technique that the lz4 block format uses (aka lz4_flex::block). I opted for streaming and eating a few reallocations since the compressed sizes are so small that it won't really end up resizing too many times anyway (it uses an internal buffer of 256 KiB with the current frame info settings).
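For reference, that block size is part of the frame info; a sketch of how those settings might look (assuming lz4_flex's FrameInfo builder, not copied from this PR):

```rust
use lz4_flex::frame::{BlockSize, FrameEncoder, FrameInfo};

// With 256 KiB blocks the encoder buffers at most 256 KiB internally,
// flushing each compressed block into the output Vec as it fills.
fn make_encoder() -> FrameEncoder<Vec<u8>> {
    let frame_info = FrameInfo::new().block_size(BlockSize::Max256KB);
    FrameEncoder::with_frame_info(frame_info, Vec::new())
}
```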

We could allocate everything up front and then shrink to fit the compressed size, but that would generally end up allocating around 50x the amount of memory we need in the end. I chatted with the crate author, and they said that allocating everything up front was only intended for small sizes (sub-8 MiB or so), and to use the frame format when dealing with anything larger.
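To put rough numbers on that (illustrative arithmetic, using the ~55 MiB figure from the analysis above): an up-front block-style allocation would reserve about 1.1 × 55 MiB ≈ 60 MiB, while a blob that compresses ~50x smaller shrinks to fit at roughly 1.2 MiB, so nearly all of that reservation would exist only to be freed again.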


I'll work on tweaking the logs later today. Thanks for the review as always :)

@trimental trimental merged commit 0c53b0d into Inlyne-Project:main Apr 7, 2023
@CosmicHorrorDev CosmicHorrorDev deleted the compress-image-data branch April 6, 2024 04:17
Development

Successfully merging this pull request may close this issue: Handle streaming decoding of images (#45)