bin:encode-string - should the result have a BOM? #1751
Comments
It's quite hard here to cater for all the possibilities. We should allow the user to specify byte order (
I was surprised to observe that Java’s As the BOM inclusion is not part of the spec, I would drop it, and optionally make it available via explicit options.
I'm inclined to say: (a) Encodings UTF-16BE and UTF-16LE should be recognised, and UTF-16 on its own should be assumed to mean UTF-16BE. (b) On reading, a BOM, if present, is decoded and returned like any other character. (c) On writing, we never write a BOM unless it is included in the data to be written (it's easy enough to write char(0xFEFF)). (d) We provide a function read-BOM() which examines the start of the input and, if a BOM is present, returns (as a map) (i) the inferred encoding of the data, and (ii) the offset at which the real data starts (i.e. the length of the BOM in octets).
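To make points (a) to (d) concrete, here is a rough sketch in Python rather than XQuery. The names `read_bom` and `encode_utf16` are hypothetical stand-ins, the dict stands in for the proposed map, and returning `None` when no BOM is found is my assumption, since the proposal doesn't say what happens in that case. UTF-32 BOMs are left out for brevity.

```python
import codecs

# Hypothetical stand-in for the proposed read-BOM(); the names and the
# shape of the returned map are assumptions, not part of any spec.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "UTF-16BE"),  # FE FF
    (codecs.BOM_UTF16_LE, "UTF-16LE"),  # FF FE
]


def read_bom(data: bytes):
    """Return the inferred encoding and the offset of the real data,
    or None if the input does not start with a recognised BOM."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return {"encoding": encoding, "offset": len(bom)}
    return None  # assumption: no BOM means nothing can be inferred


def encode_utf16(text: str, byte_order: str = "BE") -> bytes:
    """Encode without ever adding a BOM; plain UTF-16 is treated as BE,
    as in point (a). A BOM appears only if the caller put U+FEFF in text."""
    codec = "utf-16-be" if byte_order == "BE" else "utf-16-le"
    return text.encode(codec)


plain = encode_utf16("hi")              # no BOM written, point (c)
marked = encode_utf16("\ufeffhi")       # BOM supplied explicitly by the caller
info = read_bom(marked)                 # encoding and offset, point (d)
body = marked[info["offset"]:].decode("utf-16-be")
raw = marked.decode("utf-16-be")        # BOM returned as a character, point (b)
```

With this shape, a caller who wants the BOM skipped slices at the returned offset, and a caller who decodes the whole input gets U+FEFF back as an ordinary first character.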
That seems reasonable to me. If I pass a string that explicitly begins with the BOM, I guess that's what I want encoded.
Test cases in the EXPath test suite using bin:encode-string with encoding=utf-16 include a BOM at the start of the output, but the spec says nothing about this. It's probably useful for some use cases but a nuisance for others.