Support for scripts with unicode content #1389
changelog.d/1366.change.rst
Outdated
@@ -1 +1,2 @@
 In package_index, fixed handling of encoded entities in URLs.
+Scripts which have unicode content are now sopported
This should be in its own changelog file, changelog.d/1389.change.rst
Also s/sopported/supported
fixed
setuptools/command/easy_install.py
Outdated
@@ -108,7 +108,7 @@ def isascii(s):
 else:

     def _to_ascii(s):
-        return s.encode('ascii')
+        return s.encode('utf8')
Hm... Looking at what this does I think I agree with this change (though I don't know nearly enough about unicode issues to fully judge it), but maybe we should also change _to_ascii to _to_bytes?
Also, this change definitely needs tests.
You mean having only the _to_bytes() function?
BTW, tests added
Well just _to_ascii sounds like it turns something to an encoded ASCII string, but this is actually returning a utf-8-encoded byte string, so it should probably be called _to_bytes() instead of _to_ascii().
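To illustrate the naming point, here is a minimal sketch of the behaviour before and after the change. The names to_ascii_old and to_bytes_new are hypothetical stand-ins for this example only, not the actual helpers in setuptools/command/easy_install.py, and the sample script text is made up.

```python
# Hypothetical stand-ins for the helper before and after this change.
def to_ascii_old(s):
    # Old behaviour: fails on any non-ASCII character.
    return s.encode('ascii')


def to_bytes_new(s):
    # New behaviour: returns UTF-8 encoded bytes, so the result is
    # not necessarily ASCII, hence the suggested rename.
    return s.encode('utf8')


# Made-up script content containing one non-ASCII character.
script = u'print("caf\u00e9")'

try:
    to_ascii_old(script)
    old_failed = False
except UnicodeEncodeError:
    old_failed = True

encoded = to_bytes_new(script)
```

For pure-ASCII input both versions return the same bytes, so the change should only affect scripts that previously raised UnicodeEncodeError.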
done
@pganssle fixed
e239095 to d94437c
I have rebased this and cleaned up the history a bit. Will merge when CI passes.
d94437c to c43d0f6
This also renames the _to_ascii function to better reflect its purpose.
Summary of changes
Makes the _to_ascii() function able to handle a script's contents in unicode format.
Closes #761
Pull Request Checklist