pulley: Implement 16x8 arithmetics #9904

Merged (4 commits) on Dec 28, 2024

Conversation

eagr
Contributor

@eagr eagr commented Dec 27, 2024

Helps #9783

@eagr eagr requested review from a team as code owners December 27, 2024 04:31
@eagr eagr requested review from cfallin and dicej and removed request for a team December 27, 2024 04:31
@github-actions github-actions bot added the labels cranelift (Issues related to the Cranelift code generator) and pulley (Issues related to the Pulley interpreter) Dec 27, 2024

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "cranelift", "pulley"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: pulley

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Member

@alexcrichton alexcrichton left a comment


Thanks for this!

let b = self.state[operands.src2].get_u16x8();
for (a, b) in a.iter_mut().zip(&b) {
    // rounding average
    *a = (*a & *b) + ((*a ^ *b) >> 1) + ((*a ^ *b) & 1);
Member


Could this perhaps be written as *a = (u32::from(a) + u32::from(b) + 1) / 2? Basically the more classical version of an average which explicitly takes advantage of wider-precision arithmetic to avoid overflow issues.
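For readers comparing the two forms, here is a minimal sketch, not the code that was merged. It assumes plain `u16` lanes and uses hypothetical helper names (`avg_round_bitwise`, `avg_round_widening`, `avg_round_u16x8`). The bitwise form relies on the identity a + b = 2(a & b) + (a ^ b), so it never leaves 16 bits; the widening form computes in `u32` and narrows the result back, which is safe because the average of two `u16` values always fits in a `u16`. Note that the expression as written in the comment would additionally need the lanes dereferenced and the result narrowed to compile.

```rust
/// Rounding average using only 16-bit operations, as in the line under review.
fn avg_round_bitwise(a: u16, b: u16) -> u16 {
    (a & b) + ((a ^ b) >> 1) + ((a ^ b) & 1)
}

/// Rounding average using wider arithmetic, as suggested in the review.
/// The sum is computed in u32, so `a + b + 1` cannot overflow.
fn avg_round_widening(a: u16, b: u16) -> u16 {
    // The result is at most u16::MAX, so the narrowing cast loses nothing.
    ((u32::from(a) + u32::from(b) + 1) / 2) as u16
}

/// Lane-wise application over eight u16 lanes (hypothetical helper,
/// standing in for the interpreter loop shown in the diff).
fn avg_round_u16x8(a: &mut [u16; 8], b: &[u16; 8]) {
    for (a, b) in a.iter_mut().zip(b) {
        *a = avg_round_widening(*a, *b);
    }
}
```

Both helpers should agree on every input pair; the widening version is arguably easier to verify by eye, which is the point of the suggestion.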

Contributor Author


instantly convinced :))

@alexcrichton alexcrichton added this pull request to the merge queue Dec 28, 2024
Merged via the queue into bytecodealliance:main with commit 01a43ed Dec 28, 2024
37 checks passed
Labels
cranelift (Issues related to the Cranelift code generator), pulley (Issues related to the Pulley interpreter)
2 participants