Skip to content

Support for scripts with unicode content #1389

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 3 commits into from
Aug 17, 2018
Merged

Conversation

expobrain
Copy link
Contributor

@expobrain expobrain commented Jun 16, 2018

Summary of changes

Makes the _to_ascii() function able to handle script's contents in unicode format.

Closes #761

Pull Request Checklist

  • Changes have tests
  • News fragment added in changelog.d

@@ -1 +1,2 @@
In package_index, fixed handling of encoded entities in URLs.
Scripts which have unicode content are now sopported
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in its own changelog file, changelog.d/1389.change.rst

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also s/sopported/supported

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -108,7 +108,7 @@ def isascii(s):
else:

def _to_ascii(s):
return s.encode('ascii')
return s.encode('utf8')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... Looking at what this does I think I agree with this change (though I don't know nearly enough about unicode issues to fully judge it), but maybe we should also change _to_ascii to _to_bytes?

Also, this change definitely needs tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean having only the _to_bytes() function?

BTW, tests added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well just _to_ascii sounds like it turns something to an encoded ASCII string, but this is actually returning a utf-8-encoded byte string, so it should probably be called _to_bytes() instead of _to_ascii().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@expobrain
Copy link
Contributor Author

@pganssle fixed

Copy link
Member

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have rebased this and cleaned up the history a bit. Will merge when CI passes.

@pganssle pganssle merged commit afba2d8 into pypa:master Aug 17, 2018
@expobrain expobrain deleted the scripts_ascii branch August 17, 2018 15:03
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants