Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

encoding: use AliasedUint32Array for encodeInto results #46658

Closed
wants to merge 5 commits into from

Conversation

joyeecheung
Copy link
Member

The first commits comes from #46620

benchmark: stablize encode benchmark

  • Increase the number of iteration to 1e6 to reduce flakes. 1e4
    can introduce flakes even when comparing the main branch
    against itself
  • Replace the 1024 * 32 length test with 1024 * 8 since it would
    otherwise take too long to complete.
  • Check the results of the encoding methods at the end.

src: move encoding bindings to a new binding

Move the bindings used by TextEncoder to a new binding for
more self-contained code.

encoding: use AliasedUint32Array for encodeInto results

Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.

                                                                               confidence improvement accuracy (*)   (**)  (***)
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=1024                           0.41 %       ±1.33% ±1.78% ±2.33%
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=16                    ***      3.66 %       ±1.35% ±1.79% ±2.33%
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=256                            0.29 %       ±0.88% ±1.17% ±1.53%
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=32                    ***      4.01 %       ±1.34% ±1.79% ±2.34%
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=8192                          -0.32 %       ±0.66% ±0.88% ±1.15%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=1024          *     -0.48 %       ±0.37% ±0.50% ±0.65%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=16          ***      5.39 %       ±2.49% ±3.34% ±4.41%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=256         ***      1.44 %       ±0.48% ±0.63% ±0.83%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=32          ***      4.67 %       ±1.01% ±1.34% ±1.74%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=8192                 0.15 %       ±0.24% ±0.32% ±0.42%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=1024        ***      1.29 %       ±0.49% ±0.66% ±0.86%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=16                   2.23 %       ±2.90% ±3.90% ±5.16%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=256         ***      1.10 %       ±0.48% ±0.64% ±0.84%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=32          ***      6.31 %       ±1.60% ±2.14% ±2.82%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=8192                 0.19 %       ±0.21% ±0.28% ±0.37%

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/gyp
  • @nodejs/http
  • @nodejs/http2
  • @nodejs/net
  • @nodejs/startup

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Feb 14, 2023
@anonrig anonrig added the performance Issues and PRs related to the performance of Node.js. label Feb 14, 2023
@nodejs-github-bot
Copy link
Collaborator

- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.
Move the bindings used by TextEncoder to a new binding for
more self-contained code.
Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.
@joyeecheung
Copy link
Member Author

Rebased to fix merge conflict

@nodejs-github-bot
Copy link
Collaborator

@joyeecheung
Copy link
Member Author

joyeecheung commented Mar 3, 2023

cc @nodejs/buffer @nodejs/cpp-reviewers can I have some reviews please? Thanks.

Copy link
Contributor

@RaisinTen RaisinTen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code change LGTM (% lint fixes) but I'm curious about this part:

Getting the buffer from a TypedArray created from the JS land
incurs a copy.

Where does the copy take place?

@joyeecheung
Copy link
Member Author

Where does the copy take place?

v8::ArrayBufferView::Buffer().

@joyeecheung
Copy link
Member Author

Fixed linter error (looks like the bot is not working again: https://ci.nodejs.org/job/node-test-pull-request/50249/)

@nodejs-github-bot
Copy link
Collaborator

joyeecheung added a commit that referenced this pull request Mar 7, 2023
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
joyeecheung added a commit that referenced this pull request Mar 7, 2023
Move the bindings used by TextEncoder to a new binding for
more self-contained code.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
joyeecheung added a commit that referenced this pull request Mar 7, 2023
Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
@joyeecheung
Copy link
Member Author

Landed in 3b0c047...e5933c8

@MoLow
Copy link
Member

MoLow commented Mar 8, 2023

this commit has landed without a passing CI/GH actions, introducing a lint error: #47007

@cjihrig
Copy link
Contributor

cjihrig commented Mar 8, 2023

Linter issue is addressed in #47003.

@joyeecheung
Copy link
Member Author

joyeecheung commented Mar 8, 2023

The CI was actually passing in https://ci.nodejs.org/job/node-test-pull-request/50258/, which included the fixup commit that should remove the unused using statement causing the lint error: 4ea7f2e. But somehow git was only able to apply that fixup commit partially during landing, and I didn't see any warnings?

targos pushed a commit that referenced this pull request Mar 13, 2023
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 13, 2023
Move the bindings used by TextEncoder to a new binding for
more self-contained code.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 13, 2023
Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 14, 2023
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 14, 2023
Move the bindings used by TextEncoder to a new binding for
more self-contained code.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 14, 2023
Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
@danielleadams
Copy link
Contributor

danielleadams commented Apr 3, 2023

@joyeecheung This breaks the build when landing in v18.x - do you mind opening a backport PR?

@danielleadams danielleadams added the backport-requested-v18.x PRs awaiting manual backport to the v18.x-staging branch. label Apr 3, 2023
CHECK_NOT_NULL(binding);
}

void BindingData::EncodeInto(const FunctionCallbackInfo<Value>& args) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity I exploring places where fast API could be applied, came across this function and PR, I experimented a bit it seems v8::FastApiTypedArray<uint8_t>& is indeed supported and it does get caught in fast call, but I am not sure how one could extract ->Buffer() from v8::FastApiTypedArray<uint8_t>&, is that even possible? just wondering do let me know your thoughts cc @anonrig @joyeecheung

Thank You!

targos pushed a commit that referenced this pull request Nov 10, 2023
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
sercher added a commit to sercher/graaljs that referenced this pull request Apr 25, 2024
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: nodejs/node#46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
sercher added a commit to sercher/graaljs that referenced this pull request Apr 25, 2024
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: nodejs/node#46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
backport-requested-v18.x PRs awaiting manual backport to the v18.x-staging branch. c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. performance Issues and PRs related to the performance of Node.js.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants