Skip to content

Remove TODOs about using path-slash to handle escapes #1841

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 1 commit into from
Feb 16, 2025

Conversation

EliahKagan
Copy link
Member

@EliahKagan EliahKagan commented Feb 15, 2025

I considered opening an issue or discussion question about this, but I figured it would be more efficient to use a PR, which can be merged or not according to whether I'm correct to think the change these TODOs recommend should not be attempted.

Why I think those TODOs shouldn't be planned

The to_unix_separators and to_windows_separators functions in gix_path::convert had TODO comments saying they should use the path-slash crate "to handle escapes". These comments were added as part of e4f4c4b (#397) but the context there and in the broader related issue #301 does not seem to clarify the basis for this.

It is not really clear what handling escapes would entail here, and it seems like there is not a way to do it without substantially changing the interface of these conversion functions in gix-path, which currently take a single argument representing a path and return a single string-like value also representing a path. If escape sequences appear in the input to such a path conversion function, it would not have a way to know if they are meant literally or as escape sequences. (An analogous concern applies if a function is to add escape sequences in its return value; it would have no way to know if the caller expects them.)

Furthermore, while path-slash can convert some \ paths to use / instead, it does not appear to do anything related to handling escape sequences or distinguishing which occurrences of \ or any other character may be intended as part of an escape sequence. Its documentation does prominently mention that \ in escape sequences should not be converted to /:

On Unix-like OS, the path separator is /. So any conversion is not necessary. But on Windows, the file path separator is \, and needs to be replaced with / for converting the paths to "slash paths". Of course, \s used for escaping characters should not be replaced.

But it looks like the part about \ characters used for escaping is meant as advice on how and when to use path-slash, rather than meaning that path-slash would itself be able to distinguish between \ characters meant as directory separators and \ characters that perform quoting/escaping.

An alternative to removing the TODOs

One possible alternative to removing these TODOs would be to change them to express a plan to switch to path-slash, but for a different reason. If distinguishing \ is something these conversion functions should not do, and that path-slash does not do, then that would still support using path-slash, just for the reason of convenience rather than extra functionality.

However, I think we should probably also not switch to path-slash for this particular use at this time, for the reasons detailed in the part of #1840 (reply in thread) that describes some limitations of path-slash.

The `to_unix_separators` and `to_windows_separators` functions in
`gix_path::convert` had TODO comments saying they should use the
`path-slash` crate "to handle escapes". These comments were added
as part of e4f4c4b (GitoxideLabs#397) but the context there and in the broader
related issue GitoxideLabs#301 does not seem to clarify the basis for this.

It is not really clear what handling escapes would entail here, and
it seems like there is not a way to do it without substantially
changing the interface of these conversion functions in `gix-path`,
which currently take a single argument representing a path and
return a single string-like value also representing a path. If
escape sequences appear in the input to such a path conversion
function, it would not have a way to know if they are meant
literally or as escape sequences. (An analogous concern applies if
a function is to add escape sequences in its return value; it would
have no way to know if the caller expects them.)

Furthermore, while `path-slash` can convert some `\` paths to use
`/` instead, it does not appear to do anything related to handling
escape sequences or distinguishing which occurrences of `\` or any
other character may be intended as part of an escape sequence. Its
documentation (https://docs.rs/path-slash/latest/path_slash/) does
prominently mention that `\` in escape sequences should not be
converted to `/`:

> On Unix-like OS, the path separator is `/`. So any conversion is
> not necessary. But on Windows, the file path separator is `\`,
> and needs to be replaced with `/` for converting the paths to
> "slash paths". Of course, `\`s used for escaping characters
> should not be replaced.

But it looks like the part about `\` characters used for escaping
is meant as advice on how and when to use `path-slash`, rather than
meaning that `path-slash` would itself be able to distinguish
between `\` characters meant as directory separators and `\`
characters that perform quoting/escaping.
@Byron
Copy link
Member

Byron commented Feb 16, 2025

Thanks a lot!

Yes, let's remove the TODOs and wait for a concrete case that shows that a simple find/replace operation won't actually do it.

I considered opening an issue or discussion question about this, but I figured it would be more efficient to use a PR, which can be merged or not according to whether I'm correct to think the change these TODOs recommend should not be attempted.

I love that! In the worst case, a discussion becomes an issue becomes a PR, each time with context being repeated for the sake of completeness of each of these independent units. Instead, if a change is desired, it's usually easier to have a conversation around the change even if it's not fully fleshed out yet or might not even be merged in the end.

@Byron Byron merged commit ad7a94e into GitoxideLabs:main Feb 16, 2025
21 checks passed
@EliahKagan EliahKagan deleted the no-esc-todo branch February 16, 2025 14:25
@EliahKagan
Copy link
Member Author

Thanks. I sometimes struggle with that because a lot of discussions or issues I open could be PRs that might eventually make it into good shape to merge but probably would not--I could imagine a lot of PRs open at a time. I also worry about long-standing PRs dissuading other potential contributors from working on something. (I fear that may have happened in gitpython-developers/GitPython#1955.) But I also agree, at least for the recent #1840/#1842, which should've started out at least as an issue, and maybe even as a PR, since even if it were decided to keep --, a clarifying comment would be in order.

@Byron
Copy link
Member

Byron commented Feb 17, 2025

That's true, I didn't consider the potential long-livedness of such PRs which I am not a huge fan of at least while working more intensively on Gitoxide than I do now. What remains is the consideration if a change is intended, then it seems like a possibility to start with it.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants