Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Don't add Unicode BOM by default #712

Closed
zjturner opened this issue Aug 10, 2015 · 5 comments
Closed

Don't add Unicode BOM by default #712

zjturner opened this issue Aug 10, 2015 · 5 comments
Assignees
Labels
Milestone

Comments

@zjturner
Copy link

PTVS 2.2
Visual Studio 2013

File -> New -> File
Choose Python -> Python Class

type a single word consisting of only ascii characters and hit Save. The UTF-8 BOM is added.

This does not happen for any other types of files. Notice that if, instead of hitting Save in the final step, you go to File -> Advanced Save Options, you can see that the default encoding is Unicode UTF-8 (with signature). But for other types of files including .cpp files, it defaults to Western European.

This also didn't happen prior to PTVS 2.2 either, so it appears to be a regression.

This is a really big pain point as we have to manually adjust this every time we save a file to get the BOM out.

@zooba
Copy link
Member

zooba commented Aug 11, 2015

This also didn't happen prior to PTVS 2.2 either, so it appears to be a regression

You're right, and I know what it is too.

We switched the default encoding from ASCII (which breaks basically anyone who isn't American) to UTF-8 (which adds a BOM by default because that's what .NET does). Since Python 3 uses UTF-8 by default, none of the other encodings make sense, but the BOM is certainly an issue.

I suspect we'll need to provide our own custom encoding that does UTF-8 with no BOM. If it's as simple as I think it might be, we should do this in 2.2.1 because it is a regression.

@zooba zooba added this to the 2.2.1 milestone Aug 11, 2015
@zooba zooba self-assigned this Aug 11, 2015
zooba added a commit that referenced this issue Aug 11, 2015
Fixes #712: Don't add Unicode BOM by default
@zjturner
Copy link
Author

zjturner commented Sep 1, 2015

This still happens to me with the latest dev build from 4 days ago. Is this supposed to be fixed? I can't test the official 2.2.1 release because there is no download for VS 2013.

@zooba
Copy link
Member

zooba commented Sep 1, 2015

There's no official 2.2.1 release - that's 2.1.1 (annoyingly similar, I admit) and it doesn't have this fix.

The change was to the detection, so if your file is starting with a BOM then it will keep it. Are you sure we're adding it to files that previously don't have one? (We may be adding it to files that need it too - not sure we can override that behavior as easily...)

@zjturner
Copy link
Author

zjturner commented Sep 1, 2015

I tried it on a few more files and those automatically defaulted to the correct encoding (UTF-8 without signature). Perhaps there was a file that accidentally got a BOM written into it and it was just keeping that, as you suggested. I'll keep an eye out for if the problem persists.

@M0sc
Copy link

M0sc commented Mar 14, 2016

Zooba wrote:

We switched the default encoding from ASCII (which breaks basically anyone who isn't American) to UTF-8 (which adds a BOM by default because that's what .NET does).

Actually your old implementation adds BOM, not .NET core:

c0d2d63
https://github.com/dotnet/coreclr/blob/master/src/mscorlib/src/System/Text/Encoding.cs#L1542
https://github.com/dotnet/coreclr/blob/master/src/mscorlib/src/System/Text/UTF8Encoding.cs#L69

Seems that the new implementation

new UTF8Encoding(false)

have been chosen for clarity as

new UTF8Encoding()

produced the same result.
😉

And this annoying issue exists in version 2.1.1 (build 2.1.30810.00):
https://github.com/Microsoft/PTVS/blob/v2.1/Python/Product/Analysis/Parsing/Parser.cs#L4689
😠

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants