Fuzzing reveals a number of parse errors #568
Comments
Another such error: clusterfuzz-testcase-minimized-bs4_fuzzer-6401239223762944 Markup: Same
This one is different from the rest: Markup: Raises an
I'm the lead developer of Beautiful Soup, which has html5lib as an optional dependency. Over the past couple of years I've gotten a number of notifications from Google's oss-fuzz project about unhandled exceptions that actually turned out to be problems in html5lib. There wasn't much I could do with these errors, but now that it looks like html5lib maintenance is picking up, I can pass them on to you. (Sorry. 😿)
I've incorporated the fuzz reports into the Beautiful Soup test suite, and the test cases themselves are here, but here's a general picture of the problems I see. In each case, I believe just parsing the bad markup with html5lib is enough to trigger the error.
clusterfuzz-testcase-minimized-bs4_fuzzer-4999465949331456
Markup:
b')<a><math><TR><a><mI><a><p><a>'
Error:
clusterfuzz-testcase-minimized-bs4_fuzzer-5843991618256896
Markup:
b'-<math><sElect><mi><sElect><sElect>'
Error:
clusterfuzz-testcase-minimized-bs4_fuzzer-6241471367348224
Markup:
b'ñ<table><svg><html>'
Error:
clusterfuzz-testcase-minimized-bs4_fuzzer-6600557255327744
Markup:
b'\t<TABLE><<!>;<!><<!>.<lec><th>i><a><mat\x00\x01<mi\x00a><math>><th><mI>chardeta\xff\xff\xff\xff<><th><mI><||||||||A<select><>qu?\xbemath><th><mie>qu'
Error:
I also recently received a report of the problem that was already filed with you as issue #557.
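For anyone who wants to check these locally, here is a minimal sketch of a reproduction loop. It assumes html5lib is installed and that calling `html5lib.parse` directly on the bytes is enough to hit the error, as I believe above. The names `CASES` and `collect_failures` are just illustrative helpers, not part of html5lib or Beautiful Soup, and only two of the markups listed above are included.

```python
# Two of the markups from the fuzz reports above, keyed by testcase ID.
CASES = {
    "4999465949331456": b")<a><math><TR><a><mI><a><p><a>",
    "5843991618256896": b"-<math><sElect><mi><sElect><sElect>",
}


def collect_failures(parse, cases):
    """Run parse() on each markup and return {name: exception} for those that raise.

    `parse` is any callable taking the markup bytes; it is passed in as an
    argument so the loop can be pointed at different parsers.
    """
    failures = {}
    for name, markup in cases.items():
        try:
            parse(markup)
        except Exception as exc:  # keep whatever the parser raised
            failures[name] = exc
    return failures


if __name__ == "__main__":
    try:
        import html5lib  # third-party; assumed to be installed
    except ImportError:
        print("html5lib is not installed")
    else:
        for name, exc in collect_failures(html5lib.parse, CASES).items():
            print(name, type(exc).__name__, exc)
```

Whether a given markup still raises will depend on the html5lib version in use, so an empty result may simply mean that case has since been fixed.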