Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Should desc() use vec_rank() internally on character vectors? #7045

Open
DavisVaughan opened this issue Jun 26, 2024 · 0 comments
Open

Should desc() use vec_rank() internally on character vectors? #7045

DavisVaughan opened this issue Jun 26, 2024 · 0 comments

Comments

@DavisVaughan
Copy link
Member

See #7044

In particular, note that arrange(df, x) will sort x using the C locale if it is a character vector. But arrange(df, -desc(x)) (i.e. invert the desc() call, giving you the original order in theory) will sort x using the user's locale.

Normally a call like desc(x) is recognized and we don't even actually call desc() under the hood, we translate it to a "desc" value for the directions argument of vec_order_radix(), but in this case the - interferes and we actually evaluate the call.

That ends up calling desc() which does -xtfrm(x), and xtfrm() ends up using base::order(x), utilizing the user's locale.

I don't think we should remove usage of xtfrm() in desc(), since that is a generic that people have probably written S3 methods for, but maybe we can have special behavior for unclassed character vectors where it utilized vec_rank() instead (which uses the C locale)? It would not be a perfect fix, but it may be good enough.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant