-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Optimize copying large ranges of undefmask blocks #58556
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Conversation
r? @pnkfelix (rust_highfive has picked a reviewer for you, use r? to override) |
@bors try |
Optimize copying large ranges of undefmask blocks Hopefully fixes #58523
@rust-timer build ab1e694 |
Success: Queued ab1e694 with parent d215d95, comparison URL. |
☀️ Test successful - checks-travis |
Finished benchmarking try commit ab1e694 |
Local tests have shown that my stage 1 compiler needs 8 seconds for fn main() {
(&[0; 1 << 28]);
} while my stage 0 compiler needs 23 seconds. I'm not sure how to add such a test to the perf test suite without causing significant slowdown of the entire suite. For smaller arrays ( |
@bors try |
⌛ Trying commit 5aacd4c5c019169c0d8c77e39e1f78ac0931909d with merge 5f394731e058952a24cf21f80ba4b1bfef28f30f... |
@rust-timer build 5f394731e058952a24cf21f80ba4b1bfef28f30f |
Success: Queued 5f394731e058952a24cf21f80ba4b1bfef28f30f with parent f66e469, comparison URL. |
☀️ Test successful - checks-travis |
Finished benchmarking try commit 5f394731e058952a24cf21f80ba4b1bfef28f30f |
perf shows a small (less than 1.5%) but across the board improvement Once rust-lang/rustc-perf#349 is merged, we should see a big improvement, but I don't see a reason to wait for that. |
With rust-lang/rustc-perf#349 merged, is this ready to move forward? |
@bors try |
@bors ping |
😪 I'm awake I'm awake |
@bors try |
☀️ Try build successful - checks-travis |
Finished benchmarking try commit af4ce587f67d7256b7a647f1ad5b14e266bdb69a |
@pnkfelix this is ready for review and perf looks green |
pub fn new(size: Size) -> Self { | ||
pub const BLOCK_SIZE: u64 = 64; | ||
|
||
pub fn new(size: Size, state: bool) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I infer that state
here is interpreted as { true
=> defined, false
=> undefined }, (right?)
You might consider adding a comment above the header saying so. (My initial interpretation of "undef mask" was that if the bit is true, then it is undefined)
// across block boundaries | ||
if new_state { | ||
// set bita..64 to 1 | ||
self.blocks[blocka] |= u64::max_value() << bita; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(aside: i'm sort of amazed libstd doesn't have named methods for these operations; I would think turning big ranges of bits on or off within a uN
(or better still, arbitrary arrays or vectors of [uN]
), would be so common as to have higher-level methods than shifts and bitwise-or-masking.)
Discussed some implementation details with @oli-obk on zulip, namely about the motivation for the run-length encoded form of the undef-mask used in I'm satisfied that the optimization tends to help more than it hurts. |
@bors r+ |
📌 Commit 2a1eb1c has been approved by |
…felix Optimize copying large ranges of undefmask blocks Hopefully fixes rust-lang#58523
⌛ Testing commit 2a1eb1c with merge 8a042506b6c9abd9785883d508244a668250b4cb... |
💔 Test failed - checks-travis |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
@bors retry 3 hour timeout? |
Optimize copying large ranges of undefmask blocks Hopefully fixes #58523
☀️ Test successful - checks-travis, status-appveyor |
Hopefully fixes #58523