Implement "consume" loads on PowerPC #901

Closed
wants to merge 1 commit into from


taylordotfish

Like ARM, PowerPC is a weakly ordered architecture where "acquire" loads are more expensive than "consume" loads, which require no special instructions.
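
For readers unfamiliar with the idea, here is a minimal sketch of what a "consume"-style load can look like in Rust, assuming it is built from a relaxed load followed by a compiler-only fence. The names `load_consume_sketch` and `read_published` are hypothetical and are not taken from this PR. The ordering relies on the hardware respecting the address dependency between the loaded index and the later table access; Rust's memory model does not formally guarantee it.

```rust
use std::sync::atomic::{compiler_fence, AtomicU32, AtomicUsize, Ordering};

/// Hypothetical helper (not from the PR): a "consume"-style load.
/// The load itself is relaxed; the compiler fence is only meant to stop
/// the compiler from moving later memory accesses above the load.
pub fn load_consume_sketch(slot: &AtomicUsize) -> usize {
    let value = slot.load(Ordering::Relaxed);
    compiler_fence(Ordering::Acquire);
    value
}

/// Reader side: the table access depends on the loaded index, which is
/// the kind of data dependency that "consume" ordering is about.
pub fn read_published(index: &AtomicUsize, table: &[AtomicU32]) -> u32 {
    let i = load_consume_sketch(index);
    table[i].load(Ordering::Relaxed)
}
```

A writer would fill `table[i]` first and then publish `i` with a `Release` store to `index`.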

@taiki-e
Member

taiki-e commented Aug 29, 2022

Thanks for the PR! Unfortunately, the resulting code is less efficient than an acquire load, which uses isync + branching, because LLVM emits lwsync for compiler_fence on PowerPC: https://godbolt.org/z/YG5M5foE4

EDIT: Depending on the CPU, lwsync and isync+branching may be equally efficient, but that is an LLVM issue regarding how to lower acquire load.
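
To reproduce this kind of comparison, a pair of functions like the following can be compiled for a PowerPC target and their assembly inspected. This is only a sketch: the function names are hypothetical, and the instructions actually emitted depend on the LLVM version and target CPU.

```rust
use std::sync::atomic::{compiler_fence, AtomicUsize, Ordering};

/// Plain acquire load. According to the comment above, LLVM lowered this
/// on PowerPC to a load followed by a compare/branch and `isync`.
#[inline(never)]
pub fn load_acquire(a: &AtomicUsize) -> usize {
    a.load(Ordering::Acquire)
}

/// Relaxed load plus a compiler fence, the "consume"-style pattern.
/// According to the comment above, the fence produced `lwsync` on PowerPC.
#[inline(never)]
pub fn load_relaxed_then_compiler_fence(a: &AtomicUsize) -> usize {
    let v = a.load(Ordering::Relaxed);
    compiler_fence(Ordering::Acquire);
    v
}
```

Both functions return the same value; the difference is only in the ordering instructions the backend chooses to emit.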

@taylordotfish
Author

@taiki-e Huh… but why is compiler_fence issuing any kind of *sync at all? Based on the documentation, I would've expected it not to generate any special instructions:

compiler_fence does not emit any machine code, but restricts the kinds of memory re-ordering the compiler is allowed to do.

But empirically, that's clearly not the case…

@taiki-e
Member

taiki-e commented Aug 30, 2022

compiler_fence's documentation is incorrect; see rust-lang/rust#62256 for more.
