pulley: Implement 16x8 arithmetics #9904
Subscribe to Label Action
cc @fitzgen
This issue or pull request has been labeled: "cranelift", "pulley"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
Thanks for this!
pulley/src/interp.rs
Outdated
let b = self.state[operands.src2].get_u16x8();
for (a, b) in a.iter_mut().zip(&b) {
    // rounding average
    *a = (*a & *b) + ((*a ^ *b) >> 1) + ((*a ^ *b) & 1);
Could this perhaps be written as *a = (u32::from(a) + u32::from(b) + 1) / 2? Basically the more classical version of an average which explicitly takes advantage of wider-precision arithmetic to avoid overflow issues.
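For reference, a minimal sketch of the two formulations side by side. The function names and the lane-wise helper are hypothetical and are not part of Pulley's interpreter. The expression quoted in the comment leaves out the dereferences and the narrowing cast back to u16, which this sketch assumes are intended. For every pair of u16 inputs the rounded-up average fits in a u16, so the cast is lossless.

```rust
/// Rounding average using bit tricks, as in the outdated diff.
/// Relies on the identity a + b == 2 * (a & b) + (a ^ b).
fn avg_round_bitwise(a: u16, b: u16) -> u16 {
    (a & b) + ((a ^ b) >> 1) + ((a ^ b) & 1)
}

/// Rounding average using wider arithmetic, as suggested in the review.
/// The sum is computed in u32, so it cannot overflow; the result is
/// at most u16::MAX, so narrowing back to u16 loses nothing.
fn avg_round_widening(a: u16, b: u16) -> u16 {
    ((u32::from(a) + u32::from(b) + 1) / 2) as u16
}

/// Hypothetical lane-wise helper over eight u16 lanes, mirroring the
/// loop shape in the diff (not the actual interpreter code).
fn avg_round_u16x8(mut a: [u16; 8], b: [u16; 8]) -> [u16; 8] {
    for (a, b) in a.iter_mut().zip(&b) {
        *a = avg_round_widening(*a, *b);
    }
    a
}
```

Both functions round halves up, so they should agree on all inputs; the widening version simply makes the intent easier to read.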
instantly convinced :))
Helps #9783