Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Revert opt outlining change #2778

Merged
merged 1 commit into from
Sep 15, 2021
Merged

Conversation

senhuang42
Copy link
Contributor

#2763 regressed performance of opt by about ~7%.

There are 5 inlined functions in opt. I've re-tested the performance with and without inlining of each one separately.

The only function that can be outlined without any obvious regression on gcc is ZSTD_getMatchPrice(). But even then, it saves a negligible 4KB of binary size and no discernible compile speed difference, and the speed changes are not obviously better.

benched on level 16 silesia.tar, gcc-11
All inlined: 5.90 MB/s
ZSTD_compressBlock_opt_generic outlined: 5.82 MB/s
ZSTD_BtGetAllMatches outlined: 5.79 MB/s
ZSTD_insertBtAndGetAllMatches outlined: 5.78 MB/s
ZSTD_updateTree_internal outlined: 5.82 MB/s
ZSTD_getMatchPrice outlined: 5.86 MB/s (equal performance for level 17-19 though, and on enwik7).

Seeing this, it doesn't seem worth it to have outlined these functions. Cumulatively, the effect of outlining all of them saves a lot on binary size and compile speed, but the speed trade-off is too severe.

@senhuang42 senhuang42 merged commit 9d2a45a into facebook:dev Sep 15, 2021
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants