
high allocation ratio #47

Open
yuvalgut opened this issue Apr 10, 2022 · 3 comments

Comments

@yuvalgut

I have a processing scenario where I read LZMA objects and need to decompress them.
While profiling with pprof, I could see that the lzma reader allocates a buffer for every message:
0 0% 0.0046% 433804685.70MB 96.50% github.com/ulikunitz/xz/lzma.NewReader (inline)
4122.21MB 0.00092% 0.0056% 433804685.70MB 96.50% github.com/ulikunitz/xz/lzma.ReaderConfig.NewReader
2414.61MB 0.00054% 0.0061% 432805222.15MB 96.28% github.com/ulikunitz/xz/lzma.newDecoderDict (inline)
432802807.54MB 96.28% 96.28% 432802807.54MB 96.28% github.com/ulikunitz/xz/lzma.newBuffer (inline)

Could we add an option to pool that buffer, or some other way to reuse a reader?
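For context, a rough sketch of my per-message path that produces the profile above (the decompressMessage helper and its name are illustrative, not my real code):

package main

import (
	"bytes"
	"io"

	"github.com/ulikunitz/xz/lzma"
)

// decompressMessage is a hypothetical per-message handler. Every call goes
// through lzma.NewReader, which allocates a fresh dictionary buffer, so the
// allocation shows up once per message in the profile above.
func decompressMessage(compressed []byte) ([]byte, error) {
	r, err := lzma.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	var out bytes.Buffer
	if _, err := io.Copy(&out, r); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}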

@ulikunitz
Owner

Why is that a problem? The buffer is allocated once per LZMA object and collected by the GC. You can control the size of the buffer when creating the LZMA object.
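A minimal sketch of that knob via lzma.ReaderConfig; the helper name and the 1 MiB DictCap value are only illustrative:

package main

import (
	"bytes"
	"io"

	"github.com/ulikunitz/xz/lzma"
)

// decompressWithDictCap shows the configuration mentioned above:
// ReaderConfig.DictCap sets the capacity of the dictionary buffer the reader
// allocates. The 1 MiB value is an example, not a recommendation.
func decompressWithDictCap(compressed io.Reader) ([]byte, error) {
	cfg := lzma.ReaderConfig{DictCap: 1 << 20}
	r, err := cfg.NewReader(compressed)
	if err != nil {
		return nil, err
	}
	var out bytes.Buffer
	if _, err := io.Copy(&out, r); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}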

@yuvalgut
Author

Hi, thanks for the response!
From the reader's simple test:
r, err := NewReader(xz)
if err != nil {
	t.Fatalf("NewReader error %s", err)
}
var buf bytes.Buffer
if _, err = io.Copy(&buf, r); err != nil {
	t.Fatalf("io.Copy error %s", err)
}
When r, err := NewReader(xz) is called, the dict buffer gets allocated.
Then we call io.Copy(&buf, r), which reads the uncompressed data into the 'client' buffer.
So at this point the dict buffer is already allocated. We could reuse it to decompress another LZMA stream, but there is no 'reset' option, so we have to recreate a reader with NewReader(xz), which allocates another dict buffer instead of reusing the one we already allocated and used.

Let me know if that makes sense.
Thanks again!
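As a point of comparison (this is compress/flate, not this package), the standard library already supports this kind of reuse through its Resetter interface; something analogous for the lzma reader is what I have in mind. A self-contained sketch, with made-up message contents:

package main

import (
	"bytes"
	"compress/flate"
	"io"
	"log"
)

func main() {
	// Two illustrative deflate-compressed messages; in the real workload
	// these would be the incoming LZMA payloads.
	first := deflate([]byte("first message"))
	second := deflate([]byte("second message"))

	zr := flate.NewReader(bytes.NewReader(first))
	if _, err := io.Copy(io.Discard, zr); err != nil {
		log.Fatal(err)
	}

	// Reset reuses the reader's internal buffers for the next message
	// instead of allocating new ones.
	if rs, ok := zr.(flate.Resetter); ok {
		if err := rs.Reset(bytes.NewReader(second), nil); err != nil {
			log.Fatal(err)
		}
		if _, err := io.Copy(io.Discard, zr); err != nil {
			log.Fatal(err)
		}
	}
}

// deflate compresses b with compress/flate for the example above.
func deflate(b []byte) []byte {
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, flate.DefaultCompression)
	if err != nil {
		log.Fatal(err)
	}
	w.Write(b)
	w.Close()
	return buf.Bytes()
}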

@ulikunitz
Owner

I'm currently reworking the LZMA package to support parallel and faster compression as well as faster decompression. I will look into Reset options.
