Skip to content

Suggestion: 'force_le' feature flag? #1179

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
dhardy opened this issue Sep 11, 2021 · 5 comments
Closed

Suggestion: 'force_le' feature flag? #1179

dhardy opened this issue Sep 11, 2021 · 5 comments

Comments

@dhardy
Copy link
Member

dhardy commented Sep 11, 2021

For the sake of portable results, we force the use of LE on byte-int and int-byte conversions. This isn't always desired, but to avoid multiple code-paths we do it anyway.

However, by using a feature flag, we don't need multiple compiled code paths (and the multiple code versions exist anyway, where this is a significant optimisation on LE). We could add a force_le feature flag to cause current behaviour on BE platforms, using NE by default.

This would be a breaking change (target 0.9; #1165).

@vks
Copy link
Collaborator

vks commented Sep 11, 2021

What's the motivation for this? Better performance on big-endian platforms?

@newpavlov
Copy link
Member

I don't think it's a good idea. If force_le is enabled by default, it will be virtually impossible to disable it in any sizable project, and if not, it certainly will be a portability headache. Also enabling it for a whole rand looks like overly coarse-grained to me.

@dhardy
Copy link
Member Author

dhardy commented Sep 11, 2021

Motivation? I'm looking at #1170, specifically fill_via_chunks, and notice that using a simple copy_nonoverlapping on little-endian is around 50% faster than the copy-by-word-sized-chunks variant (after a pretty significant optimisation to that), for the gen_bytes_chacha* benchmarks. But BE platforms aren't exactly popular, so maybe we just don't care.

@vks
Copy link
Collaborator

vks commented Sep 12, 2021

Ok, so the goal is to offer big-endian platforms better performance at the cost of losing reproducibility with respect to little-endian platforms. I think this is fair, but I would be hesitant to add this without knowing whether anyone will use it, because it increases complexity by adding another feature, and it's a potential footgun for people upgrading rand_core with no-default-features = true.

@dhardy
Copy link
Member Author

dhardy commented Sep 13, 2021

It's also less of a performance boost now (see #1180). Agreed, we don't appear to need this.

@dhardy dhardy closed this as completed Sep 13, 2021
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants