How to read and write OsPaths without interpreting them? #233

Open
jefdaj opened this issue Jun 14, 2024 · 3 comments

jefdaj commented Jun 14, 2024

I'm trying to read and write lists of OsPaths (actually just PosixPaths, in case that matters) to files. I want to avoid any conversion or interpretation if possible and just treat the paths as opaque bytestrings separated by \NUL.

I see that I could use encodeFS and decodeFS, but 1) that's incompatible with Attoparsec (annoyingly, the Parser monad isn't a transformer), 2) it forces IO into a lot of otherwise pure code, and 3) the extra round-trip seems more likely to introduce encoding bugs than prevent them.

I'm about to try breaking into the hidden modules and using the raw constructors. But is there a more recommended way to read/write PosixPaths?

One idea that comes to mind is adding a Binary/Bytable instance? I haven't looked into that before. But a trivial instance that just wraps/unwraps the constructor seems like it would be equivalent to exposing the constructor itself.

Edit: also, thanks for taking on this OsPath thing! I'm not well versed in low level encodings and am glad someone is working on it. I would offer to help to the extent I can without breaking anything. I'm working on Arbitrary instances to check that my code can round-trip trees of OsPaths to folders on disk. Maybe a version of those could end up in the library and help identify bugs?

jefdaj commented Jun 14, 2024

Of course after posting this, I finally noticed you can access the raw constructors in the OsString package! Is that what I should be doing?

hasufell (Member) commented

From what I understand you want to write filepaths to a file on disk?

Indeed I would avoid decodeFS. How to access the raw bytes in a cross platform manner is described here: https://hasufell.github.io/posts/2022-06-29-fixing-haskell-filepaths.html#accessing-the-raw-bytes-in-a-cross-platform-manner

> I haven't looked into that before. But a trivial instance that just wraps/unwraps the constructor seems like it would be equivalent to exposing the constructor itself.

The problem is that we are dealing with a wide char array on Windows ([Word16]) as opposed to a char array on Unix ([Word8]). So for OsPath you'd still need to encode the platform information somehow (maybe as a magic bit?). Binary instances for PosixPath and WindowsPath are indeed trivial. So if you're just dealing with PosixPath, you can unwrap the underlying ShortByteString and turn it into a ByteString.
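For readers following along, here is a minimal sketch of that unwrap/rewrap step for the POSIX-only case, plus a NUL-separated list encoding. It assumes the `PosixString` constructor and its `getPosixString` field are imported from `System.OsString.Internal.Types` (`PosixPath` is a type synonym for `PosixString`); the helper names `posixPathToBytes`, `posixPathFromBytes`, `encodePaths` and `decodePaths` are made up for illustration and are not part of any library.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Short as SBS
import System.OsString.Internal.Types (PosixString (..))

-- Unwrap the underlying ShortByteString and copy it into a ByteString.
-- No decoding happens: the bytes are passed through untouched.
posixPathToBytes :: PosixString -> BS.ByteString
posixPathToBytes = SBS.fromShort . getPosixString

-- Wrap raw bytes back into a PosixString, again without interpreting them.
posixPathFromBytes :: BS.ByteString -> PosixString
posixPathFromBytes = PosixString . SBS.toShort

-- Serialise a list of paths, terminating each one with a NUL byte.
encodePaths :: [PosixString] -> BS.ByteString
encodePaths = BS.concat . map (\p -> posixPathToBytes p `BS.snoc` 0)

-- Split on NUL and drop empty chunks (including the one after the final
-- terminator). Assumption: empty paths never occur in the input, so
-- dropping them loses nothing.
decodePaths :: BS.ByteString -> [PosixString]
decodePaths = map posixPathFromBytes . filter (not . BS.null) . BS.split 0
```

Both directions are pure, so nothing here forces `IO` into the calling code; only the final `BS.readFile` / `BS.writeFile` needs it. POSIX file names cannot contain a NUL byte, which is why NUL is safe as a separator.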

Regarding attoparsec, see also haskell/attoparsec#225

My idea was to provide a way to convert to Data.Bytes.Bytes (which is a sliceable type) and then use that for efficient parsing. But we still have the problem that on Windows we are dealing with wide char arrays.
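Until something like that exists, the POSIX-only case can be parsed with plain attoparsec over a strict `ByteString`. The following is a rough sketch, not a recommended API: it assumes NUL-terminated entries, rejects empty paths, and uses the `PosixString` constructor from `System.OsString.Internal.Types`. The names `posixPathP`, `pathListP` and `parsePaths` are hypothetical.

```haskell
import qualified Data.Attoparsec.ByteString as A
import qualified Data.ByteString as BS
import qualified Data.ByteString.Short as SBS
import System.OsString.Internal.Types (PosixString (..))

-- One path: at least one non-NUL byte, followed by the NUL terminator.
-- The bytes are wrapped as-is, with no decoding step.
posixPathP :: A.Parser PosixString
posixPathP = do
  bytes <- A.takeWhile1 (/= 0)
  _ <- A.word8 0
  pure (PosixString (SBS.toShort bytes))

-- Zero or more NUL-terminated paths, then end of input.
pathListP :: A.Parser [PosixString]
pathListP = A.many' posixPathP <* A.endOfInput

-- Pure entry point: no IO and no encodeFS/decodeFS round-trip.
parsePaths :: BS.ByteString -> Either String [PosixString]
parsePaths = A.parseOnly pathListP
```

Because the parser stays in attoparsec's `Parser` monad and never touches the filesystem encoding, it composes with other pure parsers. A final entry that lacks its NUL terminator makes `parsePaths` return `Left`.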

> Of course after posting this, I finally noticed you can access the raw constructors in the OsString package! Is that what I should be doing?

Yes

hasufell (Member) commented

Related: #161
