zstd: Rewrite matchLen to make it inlineable #701

Merged
merged 1 commit into from
Nov 29, 2022

Contributor

@greatroar greatroar commented Nov 27, 2022

Another attempt to get the "best" encoder to run faster. This time, I added tests and did some sanity checks. The code is a bit shorter because I merged two identical functions into one. (There's a nearly identical matchLen function in the flate package; you might want to consider moving it to an internal package.)

name                                 old speed      new speed      delta
Encoder_EncodeAllXML-8                284MB/s ± 1%   283MB/s ± 1%  -0.28%  (p=0.004 n=19+20)
Encoder_EncodeAllSimple/fastest-8     111MB/s ± 0%   112MB/s ± 1%  +0.95%  (p=0.000 n=17+19)
Encoder_EncodeAllSimple/default-8    78.2MB/s ± 1%  77.8MB/s ± 0%  -0.47%  (p=0.000 n=20+19)
Encoder_EncodeAllSimple/better-8     65.6MB/s ± 1%  65.7MB/s ± 1%    ~     (p=0.189 n=20+20)
Encoder_EncodeAllSimple/best-8       11.1MB/s ± 2%  11.8MB/s ± 0%  +6.19%  (p=0.000 n=18+16)
Encoder_EncodeAllSimple4K/fastest-8   912MB/s ± 0%   912MB/s ± 1%    ~     (p=0.815 n=18+18)
Encoder_EncodeAllSimple4K/default-8  72.9MB/s ± 1%  74.1MB/s ± 1%  +1.68%  (p=0.000 n=20+17)
Encoder_EncodeAllSimple4K/better-8   60.5MB/s ± 1%  60.5MB/s ± 1%    ~     (p=0.767 n=20+18)
Encoder_EncodeAllSimple4K/best-8     8.53MB/s ± 2%  8.84MB/s ± 1%  +3.59%  (p=0.000 n=20+20)
Encoder_EncodeAllHTML-8               133MB/s ± 1%   132MB/s ± 1%  -0.62%  (p=0.000 n=20+20)
Encoder_EncodeAllTwain-8             84.8MB/s ± 1%  86.1MB/s ± 1%  +1.51%  (p=0.000 n=20+15)
Encoder_EncodeAllPi-8                62.6MB/s ± 1%  63.2MB/s ± 1%  +1.00%  (p=0.000 n=20+19)
Random4KEncodeAllFastest-8           2.50GB/s ± 1%  2.52GB/s ± 0%  +0.72%  (p=0.000 n=20+19)
Random10MBEncodeAllFastest-8         2.39GB/s ± 1%  2.48GB/s ± 5%    ~     (p=0.121 n=20+20)

name                                 old alloc/op   new alloc/op   delta
Encoder_EncodeAllXML-8                  0.00B          0.00B         ~     (all equal)
Encoder_EncodeAllSimple/fastest-8       2.75B ±27%     3.00B ± 0%    ~     (p=0.062 n=20+18)
Encoder_EncodeAllSimple/default-8       4.00B ± 0%     4.00B ± 0%    ~     (all equal)
Encoder_EncodeAllSimple/better-8        5.00B ± 0%     5.00B ± 0%    ~     (all equal)
Encoder_EncodeAllSimple/best-8          19.3B ± 4%     18.0B ± 0%  -6.74%  (p=0.000 n=20+16)
Encoder_EncodeAllSimple4K/fastest-8     0.00B          0.00B         ~     (all equal)
Encoder_EncodeAllSimple4K/default-8     0.00B          0.00B         ~     (all equal)
Encoder_EncodeAllSimple4K/better-8      0.00B          0.00B         ~     (all equal)
Encoder_EncodeAllSimple4K/best-8        2.00B ± 0%     2.00B ± 0%    ~     (all equal)
Encoder_EncodeAllHTML-8                 2.45B ±22%     2.50B ±20%    ~     (p=1.000 n=20+20)
Encoder_EncodeAllTwain-8                0.00B          0.00B         ~     (all equal)
Encoder_EncodeAllPi-8                   12.4B ± 5%     12.0B ± 0%  -3.23%  (p=0.002 n=20+18)
Random4KEncodeAllFastest-8              0.00B          0.00B         ~     (all equal)
Random10MBEncodeAllFastest-8           32.0kB ± 2%    30.9kB ± 6%    ~     (p=0.114 n=20+20)

fastBase.matchlen is also inlineable.
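
For readers unfamiliar with the technique, here is a minimal sketch of a word-at-a-time match-length function. It is an illustration written for this note, not the code merged in this PR: it compares eight bytes per step by XORing two little-endian words and locating the first differing byte with `bits.TrailingZeros64`, then finishes with a byte loop. The function body and the sample inputs are assumptions chosen for the example.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// matchLen returns the length of the common prefix of a and b.
// Illustrative sketch only; the real implementation in the zstd
// package may differ in its details and preconditions.
func matchLen(a, b []byte) (n int) {
	// Compare 8 bytes at a time while both slices have a full word left.
	for ; len(a) >= 8 && len(b) >= 8; a, b = a[8:], b[8:] {
		diff := binary.LittleEndian.Uint64(a) ^ binary.LittleEndian.Uint64(b)
		if diff != 0 {
			// With little-endian loads, the lowest set bit of diff lies in
			// the first differing byte; dividing by 8 gives its index.
			return n + (bits.TrailingZeros64(diff) >> 3)
		}
		n += 8
	}
	// Finish the remaining (fewer than 8) bytes one at a time.
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i] != b[i] {
			break
		}
		n++
	}
	return n
}

func main() {
	fmt.Println(matchLen([]byte("hello, world"), []byte("hello, there"))) // prints 7
}
```

Keeping the function this small is what gives the compiler a chance to inline it at its call sites. Whether a given version is actually inlined depends on the Go release and can be checked by building with `-gcflags=-m`, which reports the compiler's inlining decisions.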

Owner

@klauspost klauspost left a comment

LGTM

@klauspost
Owner

Great! Thanks - confirmed the speedup!

@klauspost klauspost merged commit b7c48cb into klauspost:master Nov 29, 2022
@greatroar greatroar deleted the matchlen branch November 29, 2022 16:49
greatroar added a commit to greatroar/compress that referenced this pull request Nov 29, 2022
Benchmark results on amd64 below. These do not take into account klauspost#701.
They're on Go 1.19; Go 1.20 produces slightly better asm for the old
code, but still produces terrible asm on 32-bit platforms.

See also golang/go#56954.

name                                 old speed      new speed       delta
Encoder_EncodeAllXML-8                283MB/s ± 1%    284MB/s ± 0%     ~     (p=0.026 n=30+20)
Encoder_EncodeAllSimple/fastest-8     111MB/s ± 0%    111MB/s ± 1%     ~     (p=0.011 n=28+20)
Encoder_EncodeAllSimple/default-8    78.4MB/s ± 1%   78.3MB/s ± 1%     ~     (p=0.572 n=30+19)
Encoder_EncodeAllSimple/better-8     65.9MB/s ± 1%   66.2MB/s ± 1%   +0.53%  (p=0.009 n=30+20)
Encoder_EncodeAllSimple/best-8       11.1MB/s ± 1%   11.6MB/s ± 3%   +4.42%  (p=0.000 n=27+28)
Encoder_EncodeAllSimple4K/fastest-8   911MB/s ± 1%    914MB/s ± 1%   +0.31%  (p=0.004 n=29+20)
Encoder_EncodeAllSimple4K/default-8  73.1MB/s ± 1%   73.6MB/s ± 1%   +0.67%  (p=0.000 n=29+20)
Encoder_EncodeAllSimple4K/better-8   60.5MB/s ± 1%   62.7MB/s ± 1%   +3.64%  (p=0.000 n=29+17)
Encoder_EncodeAllSimple4K/best-8     8.62MB/s ± 3%  10.11MB/s ± 1%  +17.24%  (p=0.000 n=30+27)
Encoder_EncodeAllHTML-8               133MB/s ± 1%    133MB/s ± 1%     ~     (p=0.101 n=30+19)
Encoder_EncodeAllTwain-8             84.8MB/s ± 1%   86.2MB/s ± 3%   +1.63%  (p=0.000 n=24+20)
Encoder_EncodeAllPi-8                62.6MB/s ± 1%   62.7MB/s ± 0%     ~     (p=0.102 n=30+20)
Random4KEncodeAllFastest-8           2.50GB/s ± 1%   2.50GB/s ± 1%     ~     (p=0.449 n=29+20)
Random10MBEncodeAllFastest-8         2.39GB/s ± 2%   2.52GB/s ± 6%   +5.23%  (p=0.000 n=27+20)

name                                 old alloc/op   new alloc/op    delta
Encoder_EncodeAllXML-8                  0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllSimple/fastest-8       2.73B ±27%      3.00B ± 0%     ~     (p=0.018 n=30+18)
Encoder_EncodeAllSimple/default-8       4.00B ± 0%      4.00B ± 0%     ~     (all equal)
Encoder_EncodeAllSimple/better-8        5.00B ± 0%      5.00B ± 0%     ~     (all equal)
Encoder_EncodeAllSimple/best-8          19.5B ± 3%      19.0B ± 0%   -2.40%  (p=0.000 n=30+24)
Encoder_EncodeAllSimple4K/fastest-8     0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllSimple4K/default-8     0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllSimple4K/better-8      0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllSimple4K/best-8        2.00B ± 0%      1.43B ±40%  -28.33%  (p=0.000 n=30+30)
Encoder_EncodeAllHTML-8                 2.37B ±27%      2.25B ±33%     ~     (p=0.398 n=30+20)
Encoder_EncodeAllTwain-8                0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllPi-8                   12.4B ± 5%      12.2B ± 6%     ~     (p=0.283 n=30+20)
Random4KEncodeAllFastest-8              0.00B           0.00B          ~     (all equal)
Random10MBEncodeAllFastest-8           31.9kB ± 2%     30.5kB ± 9%   -4.27%  (p=0.002 n=28+20)
klauspost added a commit that referenced this pull request Nov 30, 2022
Benchmark results on amd64 below. These do not take into account #701.
They're for Go 1.19; Go 1.20 produces slightly better asm for the old
code, but still produces pretty bad asm on 32-bit platforms.

See also golang/go#56954.

name                                 old speed      new speed       delta
Encoder_EncodeAllXML-8                283MB/s ± 1%    284MB/s ± 0%     ~     (p=0.026 n=30+20)
Encoder_EncodeAllSimple/fastest-8     111MB/s ± 0%    111MB/s ± 1%     ~     (p=0.011 n=28+20)
Encoder_EncodeAllSimple/default-8    78.4MB/s ± 1%   78.3MB/s ± 1%     ~     (p=0.572 n=30+19)
Encoder_EncodeAllSimple/better-8     65.9MB/s ± 1%   66.2MB/s ± 1%   +0.53%  (p=0.009 n=30+20)
Encoder_EncodeAllSimple/best-8       11.1MB/s ± 1%   11.6MB/s ± 3%   +4.42%  (p=0.000 n=27+28)
Encoder_EncodeAllSimple4K/fastest-8   911MB/s ± 1%    914MB/s ± 1%   +0.31%  (p=0.004 n=29+20)
Encoder_EncodeAllSimple4K/default-8  73.1MB/s ± 1%   73.6MB/s ± 1%   +0.67%  (p=0.000 n=29+20)
Encoder_EncodeAllSimple4K/better-8   60.5MB/s ± 1%   62.7MB/s ± 1%   +3.64%  (p=0.000 n=29+17)
Encoder_EncodeAllSimple4K/best-8     8.62MB/s ± 3%  10.11MB/s ± 1%  +17.24%  (p=0.000 n=30+27)
Encoder_EncodeAllHTML-8               133MB/s ± 1%    133MB/s ± 1%     ~     (p=0.101 n=30+19)
Encoder_EncodeAllTwain-8             84.8MB/s ± 1%   86.2MB/s ± 3%   +1.63%  (p=0.000 n=24+20)
Encoder_EncodeAllPi-8                62.6MB/s ± 1%   62.7MB/s ± 0%     ~     (p=0.102 n=30+20)
Random4KEncodeAllFastest-8           2.50GB/s ± 1%   2.50GB/s ± 1%     ~     (p=0.449 n=29+20)
Random10MBEncodeAllFastest-8         2.39GB/s ± 2%   2.52GB/s ± 6%   +5.23%  (p=0.000 n=27+20)

name                                 old alloc/op   new alloc/op    delta
Encoder_EncodeAllXML-8                  0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllSimple/fastest-8       2.73B ±27%      3.00B ± 0%     ~     (p=0.018 n=30+18)
Encoder_EncodeAllSimple/default-8       4.00B ± 0%      4.00B ± 0%     ~     (all equal)
Encoder_EncodeAllSimple/better-8        5.00B ± 0%      5.00B ± 0%     ~     (all equal)
Encoder_EncodeAllSimple/best-8          19.5B ± 3%      19.0B ± 0%   -2.40%  (p=0.000 n=30+24)
Encoder_EncodeAllSimple4K/fastest-8     0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllSimple4K/default-8     0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllSimple4K/better-8      0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllSimple4K/best-8        2.00B ± 0%      1.43B ±40%  -28.33%  (p=0.000 n=30+30)
Encoder_EncodeAllHTML-8                 2.37B ±27%      2.25B ±33%     ~     (p=0.398 n=30+20)
Encoder_EncodeAllTwain-8                0.00B           0.00B          ~     (all equal)
Encoder_EncodeAllPi-8                   12.4B ± 5%      12.2B ± 6%     ~     (p=0.283 n=30+20)
Random4KEncodeAllFastest-8              0.00B           0.00B          ~     (all equal)
Random10MBEncodeAllFastest-8           31.9kB ± 2%     30.5kB ± 9%   -4.27%  (p=0.002 n=28+20)

Co-authored-by: Klaus Post <klauspost@gmail.com>
kodiakhq bot referenced this pull request in cloudquery/cloudquery Feb 1, 2023
…7575)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/klauspost/compress](https://github.com/klauspost/compress) | indirect | patch | `v1.15.11` -> `v1.15.15` |

---

### Release Notes

<details>
<summary>klauspost/compress</summary>

### [`v1.15.15`](https://github.com/klauspost/compress/releases/tag/v1.15.15)

[Compare Source](https://github.com/klauspost/compress/compare/v1.15.14...v1.15.15)

#### What's Changed

-   zstd: Add delta encoding support by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/728](https://github.com/klauspost/compress/pull/728)
-   huff0: Reduce bounds checking by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/734](https://github.com/klauspost/compress/pull/734)
-   huff0: Assembler improvements by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/736](https://github.com/klauspost/compress/pull/736)
-   deflate: Improve level 7-9 by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/739](https://github.com/klauspost/compress/pull/739)
-   gzhttp: Add SuffixETag() and DropETag() options to prevent ETag collisions on compressed responses by [@willbicks](https://github.com/willbicks) in [https://github.com/klauspost/compress/pull/740](https://github.com/klauspost/compress/pull/740)
-   zstd: Don't allocate dataStorage when using byteBuf by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/741](https://github.com/klauspost/compress/pull/741)
-   huff0: Speed up compression of short blocks by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/744](https://github.com/klauspost/compress/pull/744)
-   zstd: Handle dicts by pointer, always by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/743](https://github.com/klauspost/compress/pull/743)
-   fse: Optimize compression by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/745](https://github.com/klauspost/compress/pull/745)
-   Retract v1.14.1-v.1.14.3 by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/742](https://github.com/klauspost/compress/pull/742)

#### New Contributors

-   [@willbicks](https://github.com/willbicks) made their first contribution in [https://github.com/klauspost/compress/pull/740](https://github.com/klauspost/compress/pull/740)

**Full Changelog**: klauspost/compress@v1.15.14...v1.15.15

### [`v1.15.14`](https://github.com/klauspost/compress/releases/tag/v1.15.14)

[Compare Source](https://github.com/klauspost/compress/compare/v1.15.13...v1.15.14)

#### What's Changed

-   flate: Improve speed in big stateless blocks. by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/718](https://github.com/klauspost/compress/pull/718)
-   zstd: Trigger BCE by switching on lengths by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/716](https://github.com/klauspost/compress/pull/716)
-   zstd: Shave some instructions off the amd64 asm by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/720](https://github.com/klauspost/compress/pull/720)
-   export NoGzipResponseWriter for custom ResponseWriter wrappers by [@harshavardhana](https://github.com/harshavardhana) in [https://github.com/klauspost/compress/pull/722](https://github.com/klauspost/compress/pull/722)
-   s2: Add example for indexing and existing stream by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/723](https://github.com/klauspost/compress/pull/723)
-   tests: Tweak fuzz tests by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/719](https://github.com/klauspost/compress/pull/719)

#### New Contributors

-   [@harshavardhana](https://github.com/harshavardhana) made their first contribution in [https://github.com/klauspost/compress/pull/722](https://github.com/klauspost/compress/pull/722)

**Full Changelog**: klauspost/compress@v1.15.13...v1.15.14

### [`v1.15.13`](https://github.com/klauspost/compress/releases/tag/v1.15.13)

[Compare Source](https://github.com/klauspost/compress/compare/v1.15.12...v1.15.13)

#### What's Changed

-   zstd: Add MaxEncodedSize to encoder by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/691](https://github.com/klauspost/compress/pull/691)
-   zstd: Improve "best" end search by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/693](https://github.com/klauspost/compress/pull/693)
-   zstd: Replace bytes.Equal with smaller comparisons by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/695](https://github.com/klauspost/compress/pull/695)
-   zstd: Faster CRC checking/skipping by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/696](https://github.com/klauspost/compress/pull/696)
-   zstd: Rewrite matchLen to make it inlineable by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/701](https://github.com/klauspost/compress/pull/701)
-   zstd: Write table clearing in a way that the compiler recognizes by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/702](https://github.com/klauspost/compress/pull/702)
-   zstd: Use individual reset threshold by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/703](https://github.com/klauspost/compress/pull/703)
-   huff0: Check for zeros earlier in Scratch.countSimple by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/704](https://github.com/klauspost/compress/pull/704)
-   zstd: Improve best compression's match selection by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/705](https://github.com/klauspost/compress/pull/705)
-   zstd: Select best match using selection trees by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/706](https://github.com/klauspost/compress/pull/706)
-   zstd: sync xxhash with final accepted patch upstream by [@lizthegrey](https://github.com/lizthegrey) in [https://github.com/klauspost/compress/pull/707](https://github.com/klauspost/compress/pull/707)
-   zstd: Import xxhash v2.2.0 by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/708](https://github.com/klauspost/compress/pull/708)

**Full Changelog**: klauspost/compress@v1.15.12...v1.15.13

### [`v1.15.12`](https://github.com/klauspost/compress/releases/tag/v1.15.12)

[Compare Source](https://github.com/klauspost/compress/compare/v1.15.11...v1.15.12)

#### What's Changed

-   zstd: Tweak decoder allocs. by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/680](https://github.com/klauspost/compress/pull/680)
-   gzhttp: Always delete `HeaderNoCompression` by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/683](https://github.com/klauspost/compress/pull/683)

**Full Changelog**: klauspost/compress@v1.15.11...v1.15.12

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 3am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
kodiakhq bot referenced this pull request in cloudquery/filetypes Mar 1, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/klauspost/compress](https://github.com/klauspost/compress) | indirect | minor | `v1.15.11` -> `v1.16.0` |

---

### ⚠ Dependency Lookup Warnings ⚠

Warnings were logged while processing this repo. Please check the Dependency Dashboard for more information.

---

### Release Notes

<details>
<summary>klauspost/compress</summary>

### [`v1.16.0`](https://github.com/klauspost/compress/releases/tag/v1.16.0)

[Compare Source](https://github.com/klauspost/compress/compare/v1.15.15...v1.16.0)

#### What's Changed

-   s2: Add Dictionary support by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/685](https://github.com/klauspost/compress/pull/685)
-   s2: Add Compression Size Estimate by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/752](https://github.com/klauspost/compress/pull/752)
-   s2: Add support for custom stream encoder by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/755](https://github.com/klauspost/compress/pull/755)
-   s2: Add LZ4 block converter by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/748](https://github.com/klauspost/compress/pull/748)
-   s2: Support io.ReaderAt in ReadSeeker by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/747](https://github.com/klauspost/compress/pull/747)
-   s2c/s2sx: Use concurrent decoding by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/746](https://github.com/klauspost/compress/pull/746)
-   tests: Upgrade to Go 1.20 by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/749](https://github.com/klauspost/compress/pull/749)
-   Update all (command) dependencies by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/758](https://github.com/klauspost/compress/pull/758)

**Full Changelog**: klauspost/compress@v1.15.15...v1.16.0


[Compare Source](https://github.com/klauspost/compress/compare/v1.15.12...v1.15.13)

#### What's Changed

-   zstd: Add MaxEncodedSize to encoder by [@&#8203;klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/691](https://github.com/klauspost/compress/pull/691)
-   zstd: Improve "best" end search by [@&#8203;klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/693](https://github.com/klauspost/compress/pull/693)
-   zstd: Replace bytes.Equal with smaller comparisons by [@&#8203;greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/695](https://github.com/klauspost/compress/pull/695)
-   zstd: Faster CRC checking/skipping by [@&#8203;greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/696](https://github.com/klauspost/compress/pull/696)
-   zstd: Rewrite matchLen to make it inlineable by [@&#8203;greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/701](https://github.com/klauspost/compress/pull/701)
-   zstd: Write table clearing in a way that the compiler recognizes by [@&#8203;greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/702](https://github.com/klauspost/compress/pull/702)
-   zstd: Use individual reset threshold by [@&#8203;klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/703](https://github.com/klauspost/compress/pull/703)
-   huff0: Check for zeros earlier in Scratch.countSimple by [@&#8203;greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/704](https://github.com/klauspost/compress/pull/704)
-   zstd: Improve best compression's match selection by [@&#8203;greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/705](https://github.com/klauspost/compress/pull/705)
-   zstd: Select best match using selection trees by [@&#8203;greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/706](https://github.com/klauspost/compress/pull/706)
-   zstd: sync xxhash with final accepted patch upstream by [@&#8203;lizthegrey](https://github.com/lizthegrey) in [https://github.com/klauspost/compress/pull/707](https://github.com/klauspost/compress/pull/707)
-   zstd: Import xxhash v2.2.0 by [@&#8203;greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/708](https://github.com/klauspost/compress/pull/708)

**Full Changelog**: klauspost/compress@v1.15.12...v1.15.13

### [`v1.15.12`](https://github.com/klauspost/compress/releases/tag/v1.15.12)

[Compare Source](https://github.com/klauspost/compress/compare/v1.15.11...v1.15.12)

#### What's Changed

-   zstd: Tweak decoder allocs. by [@&#8203;klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/680](https://github.com/klauspost/compress/pull/680)
-   gzhttp: Always delete `HeaderNoCompression` by [@&#8203;klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/683](https://github.com/klauspost/compress/pull/683)

**Full Changelog**: klauspost/compress@v1.15.11...v1.15.12

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 3am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).