Current ~str UTF-8 behavior allows for denial-of-service attack with args, environ #7188
Comments
This is a plausible default handler. We don't presently support default handlers, but I might be ok with it if/when we do.
AFAICT, using a condition handler lets you replace the string wholesale, not replace the offending character. What sort of default handler would we use to let clients figure out the difference between a replaced string, and one that was, say, empty to begin with?
You can easily change the signature of the condition to use an enum that expresses these differences, if they matter.
That doesn't help if you rely on the default condition handler though, because in the end it just replaces the bad string with another one.
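For readers following the enum suggestion above, here is a rough sketch of what "an enum that expresses these differences" could look like. It is written in present-day Rust rather than the condition system discussed in this thread, and the names `Decoded` and `decode` are hypothetical, not part of any library.

```rust
/// Hypothetical result type: records whether the bytes were valid UTF-8
/// or whether replacement characters had to be substituted.
#[derive(Debug, PartialEq)]
enum Decoded {
    /// The input was valid UTF-8 (possibly empty).
    Valid(String),
    /// The input contained invalid sequences, replaced with U+FFFD.
    Replaced(String),
}

/// Hypothetical helper: decode bytes, keeping track of whether anything
/// was replaced along the way.
fn decode(bytes: &[u8]) -> Decoded {
    match String::from_utf8(bytes.to_vec()) {
        Ok(s) => Decoded::Valid(s),
        Err(_) => Decoded::Replaced(String::from_utf8_lossy(bytes).into_owned()),
    }
}

fn main() {
    // An empty input and a replaced input stay distinguishable.
    println!("{:?}", decode(b""));
    println!("{:?}", decode(b"\xFF"));
}
```

With a type like this, a caller can tell an originally empty string apart from one that was rewritten, which is the distinction raised in the question above.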
Isn't it entirely incorrect to assume that the arguments and the environment are UTF-8 encoded? I.e. these two functions should return
I don't think this is really an issue in the
In general we presently have no interfaces, or only poorly sketched ones, for accepting other encodings. It would be good to grow some.
nominating well-covered milestone
Just a bug, declining
Fixing this is presumably going to require changing API (to provide byte vectors as the primary value type and perhaps a secondary method that gives Option<~str>).
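As an illustration of the API shape described in the last comment (raw values first, with an optional UTF-8 view on top), here is a minimal sketch in present-day Rust. It relies on the standard library's `std::env::args_os` and `OsString::into_string`; the helper name `to_utf8` is hypothetical, and none of this is the API that existed when the thread was written.

```rust
use std::ffi::OsString;

/// Hypothetical helper: the "secondary method" from the comment above.
/// Returns None instead of failing when the value is not valid Unicode.
fn to_utf8(arg: OsString) -> Option<String> {
    arg.into_string().ok()
}

fn main() {
    // args_os() yields the raw, platform-encoded arguments and never
    // fails on invalid UTF-8.
    for arg in std::env::args_os() {
        match to_utf8(arg.clone()) {
            Some(s) => println!("valid UTF-8 argument: {}", s),
            None => println!("argument is not valid UTF-8: {:?}", arg),
        }
    }
}
```

The point of the design is that the decision about how to handle a malformed argument moves to the caller, so hostile input cannot abort the program before it gets a chance to inspect its arguments.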
Apparently even an empty program (e.g., At least
IMO
The current behavior of `~str` is that it unilaterally rejects any invalid UTF-8 sequence (modulo #3787). Unfortunately, this opens up Rust programs to denial-of-service attacks where maliciously crafted user input can cause unexpected task failure. Two cases that exist right now are invalid UTF-8 in the args list and in the environment. The mere presence of the invalid UTF-8 will cause `os::args()` and `os::env()` to immediately raise the `str::not_utf8` condition, which is unlikely to be handled by callers of these functions.

I've suggested this before on the IRC channel, but I think it's worth suggesting again, that when parsing UTF-8 we should consider simply translating the first byte of any invalid sequence into the Replacement Character (U+FFFD) instead of failing outright.
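To make the replacement proposal concrete, here is a small sketch using present-day Rust's `String::from_utf8_lossy`, which substitutes U+FFFD for invalid sequences instead of failing. It only approximates the proposal: the standard library replaces each maximal invalid subsequence rather than strictly the first byte, and the wrapper name `decode_lossy` is hypothetical.

```rust
/// Hypothetical wrapper: decode bytes as UTF-8, replacing invalid
/// sequences with U+FFFD instead of reporting an error.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // 0xFF can never appear in well-formed UTF-8.
    let raw: &[u8] = b"foo\xFFbar";

    // Strict decoding rejects the whole input...
    println!("strict decode failed: {}", std::str::from_utf8(raw).is_err());

    // ...while lossy decoding keeps the valid parts.
    println!("lossy decode: {}", decode_lossy(raw));
}
```

The trade-off is the one debated in the comments: the program keeps running, but the caller can no longer tell from the string alone whether a replacement took place.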