Unnecessary copy when constructing arrays from returned arrays? #62446
Comments
What you're looking for is commonly known as "named return value optimization". It's currently not implemented in Rust, but progress is being tracked in #32966. In the meantime, if you are willing to take an out-param ( |
LLVM isn't very good at optimizing away memcpys. I think it would be possible to eliminate one of the memcpys by extending call slot optimization to work with a destination that is a GEP of an alloca.
Together with a few other changes, https://reviews.llvm.org/D89623 will remove the last memcpy in g1 and g4. It also removes one of the memcpys in g3. g2 would require mutable noalias to be reenabled.
On beta we generate 1, 2, 2, 1 memcpys. On nightly we generate 1, 2, 1, 0 memcpys. With Not sure why the one memcpy in the first function is still there. It gets dropped when I run |
For
The problem is that we have a clobbering |
The memcpy in the first example should be eliminated by llvm/llvm-project@9080444 and llvm/llvm-project@2902bde (possibly the first one isn't needed), which side-step the phase ordering issue. MemCpyOpt happened to implement partial DSE for this particular case already.
After #87570 we're now down to 0, 0, 1, 0 memcpys.
I think this is about as fixed as it's going to get, but it may be worthwhile to add some codegen tests for it.
This has now regressed in the first 3 cases: https://godbolt.org/z/W6ExP7hdc
I believe the regression in g2 is actually a correctness fix in that we don't have consensus that the optimization would be valid (cc @RalfJung for another example where we need "spurious store" on |
What's the g2 you are referring to? Some comment above also mentions a g1 and a g4. But the issue description only has a Anyway I added this to rust-lang/unsafe-code-guidelines#133. |
@RalfJung It refers to the godbolt links, e.g. https://godbolt.org/z/W6ExP7hdc.
Okay, so the issue description is extremely outdated then. That's quite confusing when coming anew to this issue.
I want to construct a big array by concatenating smaller arrays returned by other functions. As a simple example:
Rust nightly generates a call to memcpy.
Is there a way to prevent this memcpy? Am I missing an obvious other way to write this function?
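The snippet from the original post is not reproduced in this text. As a rough illustration of the pattern being described, here is a minimal sketch. The body of `f`, its `seed` parameter, and the exact shape of `g` are assumptions made for illustration and may differ from the original code:

```rust
// Hypothetical stand-in for a function that returns a fixed-size array.
// The real `f` in the issue is not shown; this body is an assumption.
#[inline(never)]
fn f(seed: u64) -> [u64; 40] {
    let mut out = [0u64; 40];
    for (i, x) in out.iter_mut().enumerate() {
        *x = seed.wrapping_add(i as u64);
    }
    out
}

// Concatenate two returned arrays into one larger array.
// Each `copy_from_slice` is a candidate for a memcpy in the generated code,
// because the result of `f` is first materialized in a temporary.
pub fn g() -> [u64; 80] {
    let mut out = [0u64; 80];
    out[..40].copy_from_slice(&f(1));
    out[40..].copy_from_slice(&f(2));
    out
}
```

Whether the copies survive optimization depends on the compiler and LLVM version, as the comments in this thread show.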
Of course one could rewrite the called function `f` to take a `&mut [u64]` instead of returning the array, but that removes compile-time checks on the length and introduces bounds checks. Using `&mut [u64;40]` as an "out" argument solves that problem, but then I don't see a safe way to get two `&mut [u64;40]` into `[u64;80]` without using `transmute`.
(Background: I'm implementing the XMSS^MT hash-based signature in Rust, which involves concatenating lots of hashes. The usual Rust hash library returns an array (actually a `GenericArray`) instead of using a `&mut [u64;...]` parameter, which led me to believe that the copy could be optimised away.)