Replace cheerio with one dependency (parse5) #5400

Open
Tracked by #3216
fregante opened this issue Aug 5, 2024 · 0 comments
Contributor

fregante commented Aug 5, 2024

addons-linter uses cheerio to validate HTML. There are currently only two rules [1], and cheerio is unreasonably large for just that (2.80 MB!) [2].

Cheerio wraps parse5 to parse HTML into an AST, and addons-linter could use parse5 directly, saving almost 2 megabytes [3]. The catch is that parse5 only returns a plain object tree [4], with no querying ability.

💡 I'm not sure whether this is actually possible, but would you be open to a PR that passes current tests?

Given the number of rules (2!), would it make sense to walk the AST directly? A simple walker could look like this:

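// Depth-first search: returns the first node for which filter(node) is truthy.
function walk(dom, filter) {
	if (filter(dom)) {
		return dom;
	}

	// Leaf nodes (text, comments) have no childNodes.
	if (!dom.childNodes) {
		return;
	}

	for (const node of dom.childNodes) {
		// Pass the filter along, otherwise the recursive call has nothing to test with.
		const result = walk(node, filter);
		if (result) {
			return result;
		}
	}
}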
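
If a rule needs to report every offending node rather than only the first one, the same traversal could be written as a generator. This is only a sketch: `walkAll` is a hypothetical name, and the tree below is hand-built to mimic the shape parse5's default tree adapter produces (`nodeName`, `tagName`, `attrs`, `childNodes`), so it runs without parse5 installed.

```javascript
// Hypothetical variant of the walker: yields every matching node, depth-first.
function* walkAll(node, filter) {
	if (filter(node)) {
		yield node;
	}

	// Text and comment nodes carry no childNodes, so fall back to an empty list.
	for (const child of node.childNodes ?? []) {
		yield* walkAll(child, filter);
	}
}

// Hand-built stand-in for a parsed document (assumed shape, not real parser output).
const tree = {
	nodeName: '#document',
	childNodes: [
		{
			nodeName: 'html',
			tagName: 'html',
			attrs: [],
			childNodes: [
				{nodeName: 'script', tagName: 'script', attrs: [{name: 'src', value: 'a.js'}], childNodes: []},
				{nodeName: '#text', value: '\n'},
				{nodeName: 'script', tagName: 'script', attrs: [{name: 'src', value: 'b.js'}], childNodes: []},
			],
		},
	],
};

const scripts = [...walkAll(tree, node => node.tagName === 'script')];
```

One caveat worth checking before relying on this: parse5 stores the children of a `<template>` element under `content` rather than `childNodes`, so a walker like this would skip them unless it handles that case.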

And then the rules would look something like:

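import {parse} from 'parse5';

// parse5 attributes are {name, value} pairs; 'etc etc' is a placeholder value.
const invalid = walk(parse(source), node => node.tagName === 'script' && node.attrs.some(attr => attr.name === 'type' && attr.value === 'etc etc'));
if (invalid) {
	rules.push(invalid, ...details)
}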
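
To make the rule shape concrete, here is a minimal sketch that runs without parse5 by using a hand-built tree in the shape its default tree adapter produces. The `getAttr` helper and the `text/example` attribute value are assumptions for illustration only, not part of addons-linter or parse5.

```javascript
// Depth-first search: returns the first node accepted by filter, or undefined.
function walk(node, filter) {
	if (filter(node)) {
		return node;
	}

	for (const child of node.childNodes ?? []) {
		const result = walk(child, filter);
		if (result) {
			return result;
		}
	}
}

// Hypothetical helper: reads an attribute value from a {name, value} list.
function getAttr(node, name) {
	return node.attrs?.find(attr => attr.name === name)?.value;
}

// Hand-built stand-in for a parsed document (assumed shape, not real parser output).
const tree = {
	nodeName: '#document',
	childNodes: [
		{
			nodeName: 'body',
			tagName: 'body',
			attrs: [],
			childNodes: [
				{nodeName: '#text', value: 'hello'},
				{
					nodeName: 'script',
					tagName: 'script',
					attrs: [{name: 'type', value: 'text/example'}],
					childNodes: [],
				},
			],
		},
	],
};

// 'text/example' is a placeholder for whatever value a real rule would reject.
const invalid = walk(
	tree,
	node => node.tagName === 'script' && getAttr(node, 'type') === 'text/example',
);
const missing = walk(tree, node => node.tagName === 'iframe');
```

With real input, `tree` would presumably come from parse5's `parse(source)` instead, and `invalid` would feed whatever reporting structure the linter rules already use.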

Footnotes

  1. https://github.com/mozilla/addons-linter/tree/5c608c102ed926099f6ed0a6b5d53c93ff9c8ae3/src/rules/html

  2. https://packagephobia.com/result?p=cheerio

  3. https://packagephobia.com/result?p=parse5@7.1.2

  4. https://astexplorer.net/#/1CHlCXc4n4
