Improve zstd_opt build speed and size #2898

Merged 1 commit on Dec 3, 2021

terrelln commented Dec 2, 2021

Use the same trick as we did for zstd_lazy in PR #2828:

  • Create one search function specialization for each (dictMode, mls).
  • Select the search function pointer at the top of the match finder.
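
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the zstd source: every name here (`dict_mode_e`, `search_generic`, `GEN_SEARCH_FN`, `select_search_fn`, `count_search_hits`) is hypothetical, the search body is a toy, and the assumption that `mls` is clamped to the range 3..6 is made only for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the zstd definitions. */
typedef enum { DICT_NONE = 0, DICT_EXT = 1, DICT_MATCH_STATE = 2 } dict_mode_e;
typedef size_t (*search_fn)(const unsigned char* ip, const unsigned char* iend);

/* Generic body. Each wrapper below calls it with constant dictMode and mls,
 * so the compiler can specialize it per wrapper (a real project would use a
 * force-inline attribute here). Toy logic: return 100*dictMode + mls when at
 * least mls bytes remain, else 0. */
static inline size_t search_generic(const unsigned char* ip,
                                    const unsigned char* iend,
                                    dict_mode_e dictMode, unsigned mls)
{
    if ((size_t)(iend - ip) < mls) return 0;
    return 100u * (size_t)dictMode + mls;
}

/* One specialization per (dictMode, mls) pair. */
#define GEN_SEARCH_FN(dictMode, mls)                                    \
    static size_t search_##dictMode##_##mls(const unsigned char* ip,    \
                                            const unsigned char* iend)  \
    {                                                                   \
        return search_generic(ip, iend, dictMode, mls);                 \
    }

#define GEN_SEARCH_FNS(dictMode)                          \
    GEN_SEARCH_FN(dictMode, 3) GEN_SEARCH_FN(dictMode, 4) \
    GEN_SEARCH_FN(dictMode, 5) GEN_SEARCH_FN(dictMode, 6)

GEN_SEARCH_FNS(DICT_NONE)
GEN_SEARCH_FNS(DICT_EXT)
GEN_SEARCH_FNS(DICT_MATCH_STATE)

#define SEARCH_ROW(dictMode)                          \
    { search_##dictMode##_3, search_##dictMode##_4,   \
      search_##dictMode##_5, search_##dictMode##_6 }

/* Table indexed by [dictMode][mls - 3]. */
static const search_fn k_search_fns[3][4] = {
    SEARCH_ROW(DICT_NONE), SEARCH_ROW(DICT_EXT), SEARCH_ROW(DICT_MATCH_STATE)
};

/* Pick the specialization; mls is clamped to [3, 6] (an assumption). */
static search_fn select_search_fn(dict_mode_e dictMode, unsigned mls)
{
    unsigned const clamped = mls < 3 ? 3 : (mls > 6 ? 6 : mls);
    assert((unsigned)dictMode <= (unsigned)DICT_MATCH_STATE);
    return k_search_fns[dictMode][clamped - 3];
}

/* Toy "match finder": the pointer is selected once at the top, and the hot
 * loop only makes indirect calls. Returns how many positions reported a hit. */
size_t count_search_hits(const unsigned char* src, size_t srcSize,
                         dict_mode_e dictMode, unsigned mls)
{
    search_fn const search = select_search_fn(dictMode, mls);
    size_t hits = 0;
    size_t pos;
    for (pos = 0; pos < srcSize; pos++) {
        if (search(src + pos, src + srcSize) != 0) hits++;
    }
    return hits;
}
```

The point of the design is that the caller is compiled once and pays one indirect call per search, instead of being duplicated for every (dictMode, mls) combination.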

Additionally, we no longer inline `ZSTD_compressBlock_opt_generic` into
every function, since `dictMode` is no longer used as a template
parameter. Instead, we create two specializations, for opt levels 0 and 2,
and each block compressor calls one of the two.
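
A rough sketch of that layering, under the same caveats: the names below (`compress_block_opt_generic`, `compress_block_opt0`, `compress_block_opt2`, and the four entry points) are hypothetical, and the generic body is a toy that only returns a tag showing which constants reached it. Only `optLevel` is fixed at compile time; `dictMode` is passed through as an ordinary runtime argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for illustration; not the zstd definition. */
typedef enum {
    OPT_DICT_NONE = 0, OPT_DICT_EXT = 1, OPT_DICT_MATCH_STATE = 2
} opt_dict_mode_e;

/* Generic body. Toy logic: returns 0 for empty input, otherwise
 * 100 + 10*optLevel + dictMode so the dispatch is observable. */
static inline size_t compress_block_opt_generic(const void* src, size_t srcSize,
                                                int optLevel,
                                                opt_dict_mode_e dictMode)
{
    assert(src != NULL || srcSize == 0);
    if (srcSize == 0) return 0;
    return 100u + (size_t)(10 * optLevel) + (size_t)dictMode;
}

/* The only two instantiations of the generic body: optLevel is a constant,
 * dictMode stays a runtime value. */
static size_t compress_block_opt0(const void* src, size_t srcSize,
                                  opt_dict_mode_e dictMode)
{
    return compress_block_opt_generic(src, srcSize, 0, dictMode);
}

static size_t compress_block_opt2(const void* src, size_t srcSize,
                                  opt_dict_mode_e dictMode)
{
    return compress_block_opt_generic(src, srcSize, 2, dictMode);
}

/* Entry points are thin wrappers that call one of the two specializations,
 * rather than each inlining its own copy of the generic body. */
size_t compress_block_btopt(const void* src, size_t srcSize)
{
    return compress_block_opt0(src, srcSize, OPT_DICT_NONE);
}

size_t compress_block_btopt_ext_dict(const void* src, size_t srcSize)
{
    return compress_block_opt0(src, srcSize, OPT_DICT_EXT);
}

size_t compress_block_btultra(const void* src, size_t srcSize)
{
    return compress_block_opt2(src, srcSize, OPT_DICT_NONE);
}

size_t compress_block_btultra_dict_match_state(const void* src, size_t srcSize)
{
    return compress_block_opt2(src, srcSize, OPT_DICT_MATCH_STATE);
}
```

With this shape the expensive generic body is compiled twice regardless of how many entry points exist, which is where most of the build time and size savings would come from.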

Lastly, remove the hack that disabled inlining for zstd_opt for the
Linux Kernel, as we've gotten most of the benefit already.

Compilation time sees a 4x or larger reduction (about 6x with sanitizers enabled):

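| Compiler | Flags                            | Dev Time (s) | PR Time (s) | Delta |
|----------|----------------------------------|--------------|-------------|-------|
| gcc      | -O3                              |         10.1 |         2.3 |  -77% |
| gcc      | -O3 -fsanitize=address,undefined |         61.1 |        10.2 |  -83% |
| clang    | -O3                              |          9.0 |         2.1 |  -76% |
| clang    | -O3 -fsanitize=address,undefined |         33.5 |         5.1 |  -84% |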

Build size is reduced by roughly 150 KB to 210 KB:

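| Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
|----------|------------------------|-----------------------|-------|
| gcc      |                1327476 |               1177108 |  -11% |
| clang    |                1378324 |               1167780 |  -15% |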

Speed loss is about 2% or less in all cases (the worst is -2.04% with gcc at level 18), and clang at level 16 is 2.34% faster:

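| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta  |
|----------|-------|------------------|-----------------|--------|
| gcc      |    16 |             4.78 |            4.72 | -1.25% |
| gcc      |    17 |             3.49 |            3.46 | -0.85% |
| gcc      |    18 |             2.92 |            2.86 | -2.04% |
| gcc      |    19 |             2.61 |            2.61 |  0.00% |
| clang    |    16 |             4.69 |            4.80 |  2.34% |
| clang    |    17 |             3.53 |            3.49 | -1.13% |
| clang    |    18 |             2.86 |            2.85 | -0.34% |
| clang    |    19 |             2.61 |            2.61 |  0.00% |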

Fixes Issue #2862.

Cyan4973 commented Dec 2, 2021

Great work! Nice build time and binary size savings!

terrelln merged commit 014bbb2 into facebook:dev on Dec 3, 2021