cheerio convert dom to html is not expect #1006

Closed
janryWang opened this issue Apr 19, 2017 · 7 comments

Comments

@janryWang

Check this:

var $ = cheerio.load(`<div class='{"name":"value"}'>hello</div>`)
$.html()
>>> <div class="{"name":"value"}">hello</div>

@mehdi-cit

Hello @janryWang
I'm having a similar problem. Were you able to find a solution?

@mehdi-cit

mehdi-cit commented May 15, 2017

You may want to look at the function formatAttrs(attributes, opts) in the npm module 'dom-serializer', which is one of cheerio's dependencies. More specifically, the line below:
output += key + '="' + (opts.decodeEntities ? entities.encodeXML(value) : value) + '"';
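
To make the effect of that line concrete, here is a minimal sketch of what it does in each branch. Everything in it is an assumption made for illustration: formatAttrsSketch and encodeXmlLike are hypothetical stand-ins, not dom-serializer's or entities' actual code. The stand-in encoder is assumed to escape the five XML special characters and turn non-ASCII characters into numeric references.

```javascript
// Hypothetical stand-in for entities.encodeXML (assumption, not the real implementation).
// Escapes XML special characters and non-ASCII characters (BMP only, for brevity).
const NAMED_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&apos;' };

function encodeXmlLike(value) {
  return value.replace(/[&<>"'\u0080-\uffff]/g, (ch) =>
    NAMED_ENTITIES[ch] || '&#x' + ch.charCodeAt(0).toString(16).toUpperCase() + ';');
}

// Hypothetical sketch of an attribute writer built around the line quoted above.
function formatAttrsSketch(attributes, opts) {
  let output = '';
  for (const key of Object.keys(attributes)) {
    const value = attributes[key];
    if (output) output += ' ';
    // The value is always wrapped in double quotes; it is only encoded
    // when opts.decodeEntities is truthy.
    output += key + '="' + (opts.decodeEntities ? encodeXmlLike(value) : value) + '"';
  }
  return output;
}

const sampleAttrs = { 'data-options': '{"itemselector": ".entry"}', title: 'Äpfel' };

// Without encoding, the inner double quotes end the attribute value early.
const rawAttrs = formatAttrsSketch(sampleAttrs, { decodeEntities: false });
// With encoding, the attribute is valid, but non-ASCII text is rewritten too.
const encodedAttrs = formatAttrsSketch(sampleAttrs, { decodeEntities: true });

console.log(rawAttrs);
console.log(encodedAttrs);
```

In this sketch, neither branch gives a valid attribute together with untouched Unicode, which is the trade-off reported in this thread.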

@janryWang
Author

@mehdi-cit So, how can this problem be solved?

@mehdi-cit

mehdi-cit commented Jun 2, 2017

@janryWang
I am no longer using cheerio (for other reasons), so I have not solved the problem per se.
However, I looked into the code, and the line I refer to above is where the attribute value gets written within double quotes. So that could be your starting point.
I assume cheerio is one of your dependencies, so you should find it in your own node_modules folder.
cheerio itself has dependencies (found under its own node_modules folder), and one of them is 'dom-serializer'. Go inside it and look through the js files (from memory, check index.js first) and locate the line output += key + '="' + (opts.decodeEntities ? entities.encodeXML(value) : value) + '"';
You may also use your IDE tools to search for that line among cheerio's dependencies :)
Sorry if I could not be of more help.

@ArmorDarks

Same issue here. And quite an unfortunate issue...

@Prinzhorn

I've been searching through all relevant issues and this one is still open. I'm having the same problem as #85, #319, #338, #720 and #866. There doesn't seem to be a solution that satisfies all of them. That is: create a valid attribute but don't touch Unicode characters.

const cheerio = require('cheerio');

let html = `<html><head></head><body><div data-options='{"itemselector": ".entry"}'>Äpfel</div></body></html>`

console.log(cheerio.load(html).html());
console.log(cheerio.load(html, { decodeEntities: false }).html());

Output

<!--The attribute is valid but Äpfel became &#xC4;pfel-->
<html><head></head><body><div data-options="{&quot;itemselector&quot;: &quot;.entry&quot;}">&#xC4;pfel</div></body></html>

<!--Äpfel are great but the attribute is broken-->
<html><head></head><body><div data-options="{"itemselector": ".entry"}">Äpfel</div></body></html>

I think these two attempted to fix the issue, cheeriojs/dom-serializer#33 and fb55/entities#28, but neither made it in.

It'd be sick if we can find a solution. And if it's a breaking change, I'd happily install 2.x instead of 1.x.
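
To illustrate the behavior being asked for (a valid attribute, with Unicode left alone), here is a minimal sketch that escapes only the characters that can break a double-quoted attribute value. The helper names escapeAttributeValue and unescapeAttributeValue are hypothetical and are not part of cheerio, dom-serializer, or entities; this is one possible approach, not what any release actually implements.

```javascript
// Hypothetical helper: escape only '&' and '"', the two characters that
// matter inside a double-quoted attribute value. Non-ASCII text is untouched.
function escapeAttributeValue(value) {
  return value.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

// Hypothetical inverse, used here only to check that the JSON survives a round trip.
function unescapeAttributeValue(value) {
  return value.replace(/&quot;/g, '"').replace(/&amp;/g, '&');
}

const optionsJson = '{"itemselector": ".entry"}';
const escapedOptions = escapeAttributeValue(optionsJson);
const markup = '<div data-options="' + escapedOptions + '">Äpfel</div>';

console.log(markup);

// Decoding the attribute value gives back parseable JSON.
const parsedOptions = JSON.parse(unescapeAttributeValue(escapedOptions));
console.log(parsedOptions.itemselector);
```

The ampersand is escaped first so that the '&' introduced by '&quot;' is not escaped a second time.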

@fb55
Member

fb55 commented Dec 22, 2020

This should be resolved with the latest release!

@cheeriojs cheeriojs locked and limited conversation to collaborators Dec 22, 2020
@cheeriojs cheeriojs unlocked this conversation Dec 22, 2020
@fb55 fb55 closed this as completed Dec 22, 2020