Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

fix: Use different chars for synthetic paths. #208

Conversation

cfraenkel
Copy link
Contributor

this fixes --exec, --tla-str and --ext-str on windows as < and > are illegal file name characters.

fixes #200

this fixes --exec, --tla-str and --ext-str on windows as < and > are
illegal file name characters.

fixes databricks#200
@stephenamar-db stephenamar-db merged commit 9e975e9 into databricks:master Nov 5, 2024
@cfraenkel cfraenkel deleted the bugfix/fix-exec-and-tla-on-windows branch November 6, 2024 09:33
Copy link
Contributor

@JoshRosen JoshRosen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR appears to have broken sjsonnet for me in an Ubuntu environment: when testing out a new sjsonnet build containing this PR I received the following error:

0> java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: ?tla-var hcvaultContext?
0> 	at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:121)
0> 	at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:68)
0> 	at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279)
0> 	at java.base/java.nio.file.Path.resolve(Path.java:515)
0> 	at os.Path.$div(Path.scala:440)
0> 	at sjsonnet.OsPath.$div(OsPath.scala:11)
0> 	at sjsonnet.Interpreter.parseVar(Interpreter.scala:43)
0> 	at sjsonnet.Interpreter.$anonfun$evaluate$5(Interpreter.scala:99)
0> 	at sjsonnet.Interpreter.$anonfun$evaluate$5$adapted(Interpreter.scala:98)
0> 	at scala.Option.foreach(Option.scala:437)
0> 	at sjsonnet.Interpreter.$anonfun$evaluate$4(Interpreter.scala:98)
0> 	at scala.util.Either.map(Either.scala:382)
0> 	at sjsonnet.Interpreter.$anonfun$evaluate$2(Interpreter.scala:90)
0> 	at scala.util.Either.flatMap(Either.scala:352)
0> 	at sjsonnet.Interpreter.evaluate(Interpreter.scala:88)
0> 	at sjsonnet.Interpreter.interpret0(Interpreter.scala:72)
0> 	at sjsonnet.SjsonnetMain$.$anonfun$renderNormal$1(SjsonnetMain.scala:139)
0> 	at sjsonnet.SjsonnetMain$.$anonfun$writeToFile$3(SjsonnetMain.scala:127)
0> 	at scala.util.Either.flatMap(Either.scala:352)
0> 	at sjsonnet.SjsonnetMain$.$anonfun$main0$30(SjsonnetMain.scala:123)
0> 	at scala.util.Either.flatMap(Either.scala:352)
0> 	at sjsonnet.SjsonnetMain$.$anonfun$main0$29(SjsonnetMain.scala:71)
0> 	at scala.util.Either.flatMap(Either.scala:352)
0> 	at sjsonnet.SjsonnetMain$.main0(SjsonnetMain.scala:66)

I think the problem is that the chosen replacement characters, U+FE64 Small Less-Than Sign and U+FE65 Small Greater-Than Sign, are non-ASCIII and this is triggering problems if LANG is set to C or ASCII or similar, as is often the default on certain platforms (see https://stackoverflow.com/a/39843086).

I understand why we don't want to use plain ASCII < or > because they are forbidden in many Windows filesystem path names.

Maybe we should choose a different ASCII character here, such as [ and ]? These are technically not POSIX portable filename characters but neither were < and > and those worked fine everywhere except Windows.

@JoshRosen
Copy link
Contributor

Maybe we should choose a different ASCII character here, such as [ and ]?

I thought of a slightly cleaner fix with no change to the visual presentation to end users and no impact on non-Windows users: just flag the Unicode usage to only be done on Windows. I'll submit a PR for this shortly.

@JoshRosen
Copy link
Contributor

I've submitted a fix here: #215

stephenamar-db pushed a commit that referenced this pull request Nov 26, 2024
… to be a Windows-only change (#215)

This PR fixes a regression introduced by
#208

That PR aimed to fix #200,
an issue where `std.extVar` failed on Windows because it tried to
resolve mock paths which contained `<` and `>` characters which are
forbidden in Windows filenames.

That PR's solution was to use Unicode less-than and greater-than
characters in place of ASCII `<` and `>`, but that broke things for
Unix/Linux platforms with a non-UTF8 `LANG` (see
#208 (review)).

In this PR, I aim to fix this by flagging the other PR's change to only
occur for Windows, while continuing to use ASCII `<` and `>` as before
on other platforms.

I also added a regression test by pinning `LANG=C` in our GitHub Actions
test setup. This successfully reproduced the bug fixed in this PR, and
also uncovered a minor test-only issue related to the use of default
character encodings (which I've fixed by pinning the test code to
UTF-8).

Note that these file paths aren't actually materialized onto disk:
rather, these are "placeholder / mock" paths used to indicate when
jsonnet is reading from synthetic paths. In theory, we might be able to
avoid the need to ever perform resolution in the first place by changing
uses of these paths to be an `Either` of either an actual `os.Path` or a
placeholder, but that's a much larger and more invasive change than I
want to make right now. Here, I've chosen to pursue a narrowly-targeted
tactical fix-forward.
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

std.extVar are resolved as FilePath in windows
3 participants