util.url doesn't work for Beyoncé etc. (utf encoding) #147

Closed
johnelliott opened this issue May 1, 2017 · 3 comments

johnelliott commented May 1, 2017

The URL function is giving us issues with image file names that contain non-ASCII (Unicode) characters, such as letters with a cedilla or an acute accent.

Problem:
The current function results in Beyonc%327e%301_kofryc, where the c in Beyonc is untouched. This URIDecodes to Beyonc27e01_kofryc, which isn't the correct public_id.

The desired final URL should contain a public_id like Beyonc%CC%A7e%CC%81_kofryc, which URI-decodes back to the original public_id.
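
For illustration, the reported output can be reproduced with a naive escaper that writes each unsafe character's UTF-16 code unit in hex instead of percent-encoding its UTF-8 bytes. This is only a sketch under that assumption, not the library's actual code: `naiveEscape` and its regex are hypothetical names, and only `encodeURIComponent` and `decodeURIComponent` are standard JavaScript.

```javascript
// Hypothetical escaper (assumption, for illustration only): it emits
// "%" + the hex value of the UTF-16 code unit, which is not valid
// percent-encoding for anything above U+007F.
function naiveEscape(str) {
  return str.replace(/[^a-zA-Z0-9_.\-\/:]+/g, (match) =>
    match
      .split("")
      .map((ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase())
      .join("")
  );
}

// public_id in decomposed form: "c" + U+0327 (combining cedilla),
// "e" + U+0301 (combining acute accent).
const publicId = "Beyonc\u0327e\u0301_kofryc";

const broken = naiveEscape(publicId);        // "Beyonc%327e%301_kofryc"
const correct = encodeURIComponent(publicId); // "Beyonc%CC%A7e%CC%81_kofryc"

console.log(broken, "->", decodeURIComponent(broken));
console.log(correct, "->", decodeURIComponent(correct) === publicId);
```

In the broken form, `%32` and `%30` decode to the digits 2 and 0, which is why the decoded string becomes Beyonc27e01_kofryc. The standard encoder instead emits the UTF-8 bytes of each combining mark (`%CC%A7` and `%CC%81`), so the result round-trips.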

Steps to reproduce:
Try with a public_id like Beyonçé, and stop the debugger as it's looping over the characters in the string: https://github.com/cloudinary/cloudinary_npm/blob/master/src/utils.coffee#L598

It seems that because the regex doesn't match the c of the ç we're trying to represent, by the time you process the cedilla, the chance for encoding the c and the cedilla together into %CC%A7 has passed.

Normalizing the string before sending it to the function doesn't appear to fix the issue.

Helpful links about unicode normalization:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
http://www.unicode.org/reports/tr15/tr15-29.html
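
As a rough sketch of what normalization does here, the snippet below compares the composed (NFC) and decomposed (NFD) forms of the same visible text, using only the standard `String.prototype.normalize` and `encodeURIComponent`. The variable names are illustrative. Note that the two forms are different strings, so a normalized public_id may not match one that was stored in the other form.

```javascript
// Decomposed form (NFD): base letters followed by combining marks.
const nfd = "Beyonc\u0327e\u0301";
// Composed form (NFC): single precomposed code points U+00E7 and U+00E9.
const nfc = nfd.normalize("NFC");

console.log(nfd.length, nfc.length);          // 9 and 7 UTF-16 code units
console.log(nfd === nfc);                     // false: different code points
console.log(nfc.normalize("NFD") === nfd);    // true: same canonical text

// Each form has its own valid UTF-8 percent-encoding.
console.log(encodeURIComponent(nfc));         // Beyon%C3%A7%C3%A9
console.log(encodeURIComponent(nfd));         // Beyonc%CC%A7e%CC%81
```

Either encoding is valid on its own; the escaping step only needs to emit UTF-8 bytes for whichever form it receives.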

johnelliott changed the title from "util.url doesn't work for Beyoncé etc." to "util.url doesn't work for Beyoncé etc. (utf encoding)" on May 1, 2017

taragano (Collaborator) commented May 1, 2017

Hi @johnelliott,

Thank you for reporting this.

It seems that two different representations of these characters are in play here (although they look identical): one is handled correctly and the other isn't:

> cloudinary.utils.url("Beyonçé")     // (Beyon%C3%A7%C3%A9)
'http://res.cloudinary.com/demo/image/upload/Beyon%C3%A7%C3%A9'
> cloudinary.utils.url("Beyon%C3%A7%C3%A9")
'http://res.cloudinary.com/demo/image/upload/Beyon%C3%A7%C3%A9'

> cloudinary.utils.url("Beyonçé")      // (Beyonc%CC%A7e%CC%81)
'http://res.cloudinary.com/demo/image/upload/Beyonc%CC%A7e%CC%81'
> cloudinary.utils.url("Beyonc%CC%A7e%CC%81")
'http://res.cloudinary.com/demo/image/upload/Beyonc%CC%A7e%CC%81'

I've passed it on to our team to have a look at it.
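
The calls above return the same URL whether the public_id is passed raw or already percent-encoded. One way to get that behavior is to decode before encoding, as in the sketch below. This is an assumption about a possible approach, not the library's confirmed implementation; `escapeOnce` is a hypothetical helper, and keeping `/` and `:` unescaped is likewise an assumption.

```javascript
// Hypothetical helper: percent-encode a public_id as UTF-8 exactly once,
// even if the caller already encoded it.
function escapeOnce(publicId) {
  let decoded;
  try {
    decoded = decodeURIComponent(publicId);
  } catch (err) {
    // Malformed escape sequence (e.g. a literal "%"): treat input as raw text.
    decoded = publicId;
  }
  // Assumption: leave "/" and ":" readable so folder-style ids stay intact.
  return encodeURIComponent(decoded)
    .replace(/%2F/g, "/")
    .replace(/%3A/g, ":");
}

console.log(escapeOnce("Beyon\u00E7\u00E9"));        // Beyon%C3%A7%C3%A9
console.log(escapeOnce("Beyon%C3%A7%C3%A9"));        // Beyon%C3%A7%C3%A9
console.log(escapeOnce("Beyonc\u0327e\u0301"));      // Beyonc%CC%A7e%CC%81
console.log(escapeOnce("Beyonc%CC%A7e%CC%81"));      // Beyonc%CC%A7e%CC%81
```

The trade-off of this design is that a public_id containing a literal, valid-looking escape such as `%41` would be decoded to `A` first, so it only suits callers that never need a raw percent sequence in the id.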

nadavofi (Contributor) commented May 3, 2017

Hi @johnelliott,

Could you please make sure you're using the latest release of this library?
A similar issue was fixed a while ago (see #138) in the v1.9.0 release.

I look forward to your updates.

johnelliott (Author) commented

@nadavofi Thank you :) We'll compare this code and see if that helps us out.
