Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Use Safe Parsers in lxml Parsing Functions #27

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pixeebot[bot]
Copy link

@pixeebot pixeebot bot commented May 31, 2024

This codemod sets the parser parameter in calls to lxml.etree.parse and lxml.etree.fromstring if omitted or set to None (the default value). Unfortunately, the default parser=None means lxml will rely on an unsafe parser, making your code potentially vulnerable to entity expansion attacks and external entity (XXE) attacks.

The changes look as follows:

  import lxml.etree
- lxml.etree.parse("path_to_file")
- lxml.etree.fromstring("xml_str")
+ lxml.etree.parse("path_to_file", parser=lxml.etree.XMLParser(resolve_entities=False))
+ lxml.etree.fromstring("xml_str", parser=lxml.etree.XMLParser(resolve_entities=False))
More reading

I have additional improvements ready for this repo! If you want to see them, leave the comment:

@pixeebot next

... and I will open a new PR right away!

🧚🤖 Powered by Pixeebot

💬Feedback | 👥Community | 📚Docs | Codemod ID: pixee:python/safe-lxml-parsing

Copy link
Author

pixeebot bot commented Jun 8, 2024

I'm confident in this change, and the CI checks pass, too!

If you see any reason not to merge this, or you have suggestions for improvements, please let me know!

Copy link
Author

pixeebot bot commented Jun 9, 2024

Just a friendly ping to remind you about this change. If there are concerns about it, we'd love to hear about them!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0 participants