
Stagger Stepping in Negative Levels #2921

Merged (6 commits) on Dec 14, 2021

Conversation

@felixhandte (Contributor) commented Dec 10, 2021

#2749 returned the negative levels to pre-#1562 behavior, i.e., skipping `stepSize` positions every iteration. This PR emulates #1562-like stepping by applying the `stepSize` skip only every other position.

This PR addresses #2827 ("Compression ratio for --fast=2 and higher became significantly worse. Expected?").
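For illustration, here is a minimal sketch of the two stepping strategies. The names (`tryMatchAt`, `istart`, `ilimit`) are hypothetical, and the real loop in zstd's fast compressor also hashes, matches, and grows `stepSize` as it goes; this only shows which positions each strategy visits:

```c
#include <stddef.h>

/* Hypothetical stand-in for the real match-search work at a position. */
static void tryMatchAt(const unsigned char* ip) { (void)ip; }

/* Pre-#1562 / post-#2749 stepping: apply the skip on every iteration,
 * visiting positions 0, s, 2s, 3s, ... for stepSize == s. */
static void searchCoarse(const unsigned char* istart,
                         const unsigned char* ilimit,
                         size_t stepSize)
{
    const unsigned char* ip;
    for (ip = istart; ip < ilimit; ip += stepSize)
        tryMatchAt(ip);
}

/* This PR's staggered stepping: visit adjacent pairs of positions and
 * apply the skip only every other position, visiting
 * 0, 1, s, s+1, 2s, 2s+1, ... -- the "01__23__" pattern for s == 4. */
static void searchStaggered(const unsigned char* istart,
                            const unsigned char* ilimit,
                            size_t stepSize)
{
    const unsigned char* ip;
    for (ip = istart; ip + 1 < ilimit; ip += stepSize) {
        tryMatchAt(ip);
        tryMatchAt(ip + 1);
    }
}
```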

Benchmarks

silesia.tar on gcc-10:

[screenshot: benchmark results, 2021-12-13 17-14-03]

silesia.tar on clang-13:

[screenshot: benchmark results, 2021-12-13 17-14-15]

Status

I believe this PR is ready for merge.

This replicates the behavior of @terrelln's `ZSTD_fast` implementation. That is, it always looks at adjacent pairs of positions, and only applies the acceleration every other position. This produces more fine-grained acceleration.

This avoids an additional addition, at the cost of an additional variable.
@felixhandte (Author) commented Dec 10, 2021

Old benchmark: `silesia.tar` on GCC-10:

[screenshot: benchmark results, 2021-12-10 16-52-30]

The position updates are rewritten from `ip[N] = ip[N-1] + step` to
`ip[N] = ip[N-2] + step`. This lets us deal with the asymmetric spacing of
the gaps only once at setup, after which we only have to keep a single
`step` variable.

This seems to work quite well on GCC and Clang!
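A hedged sketch of that rewrite, with assumed names (`ip0`/`ip1` for the pair of search positions; this is not the exact zstd code):

```c
#include <stddef.h>

/* Illustrative only: names (ip0, ip1, step) are assumptions, and the
 * real loop searches for matches between these pointer updates. */
static void scanPairs(const unsigned char* istart,
                      const unsigned char* iend,
                      size_t step)
{
    /* The +1 asymmetry between the pair is established once, at setup. */
    const unsigned char* ip0 = istart;
    const unsigned char* ip1 = istart + 1;

    while (ip1 < iend) {
        /* ... match search at ip0 and ip1 would happen here ... */

        /* ip[N] = ip[N-2] + step: each position advances from the one
         * two back, so every update uses the same single `step`
         * (vs. ip[N] = ip[N-1] + step, which alternates +1 and +step). */
        ip0 += step;
        ip1 += step;
    }
}
```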
@felixhandte (Author) commented Dec 13, 2021

The new version shows the following performance:

gcc-10:

[screenshot: benchmark results, 2021-12-13 15-26-17]

clang-13:

[screenshot: benchmark results, 2021-12-13 15-26-35]

The templating adds ~10 KB to the library size.

I couldn't find a good way to spread `ip0` and `ip1` apart when we accelerate
due to incompressible inputs. (The methods I tried slowed things down quite a
bit.)

Since we aren't splaying `ip0` and `ip1` apart (which would give a `0_1_2_3_`
pattern, as opposed to the `01__23__` we were actually doing), it's a bit
ambitious to increment `step` by 2. Instead, let's increment it by 1, which
has the benefit of sliiightly improving compression. Speed remains pretty
much unchanged. A sketch of that choice follows.
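The name and trigger condition here are assumptions, not the actual code; this only illustrates the `+= 1` vs. `+= 2` choice:

```c
#include <stddef.h>

/* Hypothetical acceleration update for incompressible regions. Because
 * ip0/ip1 stay adjacent ("01__23__") rather than being splayed apart
 * ("0_1_2_3_"), growing the skip by 2 per event over-accelerates;
 * growing it by 1 keeps speed about the same and slightly helps ratio. */
static size_t accelerateStep(size_t step)
{
    return step + 1;   /* this PR: += 1 per acceleration event, not += 2 */
}
```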