Disable the acceptance of C1 control codes by default #11690
Thanks for doing this! So far, I've found one app in the wild that uses C1 controls on Windows -- @jdebp's terminal-tests. I've realized two things:
Other things seem to be working: Without but with it, I get: Compared to 7-bit mode:
That's weird. I can't reproduce that. Do those sequences also fail if you test them individually from the command line with I thought it might be a timing issue related to the change in code page, like maybe the first part of the output is processed before the code page changes to utf-8. That doesn't seem likely, but if you insert some kind of delay in the script after the code page change that might tell us something.
If nothing else, I just realised while testing this that I got the
OK I can reproduce this now. I was testing in conhost, but this only happens in Windows Terminal. I'm assuming it's got something to do with conpty passing through the sequences with the C1 controls, but the WT parser on the other end is no longer capable of handling that. I wonder if we could fix this just by enabling C1 support in WT.
Huh. I wouldn't've expected that, for sure. Good catch! Cheap fix: If we pass through EDIT: Ah, that's probably what you were suggesting. I got it in my head that we'd only either [always enable it] or [request it be enabled via ConPTY on startup.]
I'm not really sure what the right approach is. I'm going to have to do some more experimenting. I'm a little worried now that this is going to break something. It's more complicated than I first thought.
OK, so I think this is the right approach. If C1 controls are being filtered out, that's going to be handled by the conhost parser and shouldn't ever be passed through to conpty. But if C1 controls are being accepted, then there's the potential for some sequences to be passed through exactly as they were received ( Also note that I've fixed the
I'm totally comfortable with that outcome, honestly. The biggest issue with C1 controls originates from Windows' unique disposition with codepages, and so it feels right for the default in Windows' console host to be "ignore". Terminal wanted to be slightly less tied to Windows' console's needs in a lot of regards, and... accepting C1 by default surely doesn't seem like the worst of those regards.
Works for me. Sorry that this ended up being a kerfuffle.
Thanks again, James! Well-thought-out and well-implemented as always.
Conhost/Terminal supported C1 control sequences for a while... but then that apparently caused some problems. So they disabled them by default, which causes all our VT SGR sequences to be broken (so instead of pretty color output, you just see gray output, with strange numbers sprinkled all over). Fortunately they provided a way to turn them back on. Related: microsoft/terminal#11690 Related: microsoft/terminal#10310 Note that I believe you need a relatively recent build for this change to have effect.
When we added support for the `DECAC1` control sequence, which determines whether `C1` controls are accepted or not, the intention was that conhost would be making that determination, and Windows Terminal would always be expected to accept any passed-through `C1` controls. However, this didn't take into account that a passed-through `RIS` sequence could end up disabling `DECAC1`, and that would leave Windows Terminal incapable of processing any `C1` controls. This PR attempts to fix that oversight.

The `DECAC1` sequence was added in PR #11690, when we disabled `C1` acceptance by default.

This is a bit of a hack, but I've added a new `AlwaysAcceptC1` mode to the state machine, which is enabled at startup in the Terminal, and is never disabled. The parser then just needs to check whether either `AcceptC1` or `AlwaysAcceptC1` is set.

## Validation Steps Performed

I've manually confirmed the test case in #13968 now works as expected.

Closes #13968
The same change, with an otherwise identical description, was cherry-picked for two servicing releases:
(cherry picked from commit f2b361c) Service-Card-Id: 87207769 Service-Version: 1.15
(cherry picked from commit f2b361c) Service-Card-Id: 87207767 Service-Version: 1.16
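
The `AlwaysAcceptC1` check described in that commit message can be modelled in a few lines. This is only a sketch of the idea, not the Terminal's code: the `C1Mode` flag type and the `c1_accepted` and `apply_ris` helpers are hypothetical names, and `apply_ris` models nothing about `RIS` except that it clears the resettable `AcceptC1` mode.

```python
from enum import Flag, auto


class C1Mode(Flag):
    """Hypothetical stand-in for the two parser modes named in the commit."""
    ACCEPT_C1 = auto()         # toggled by DECAC1, DOCS, resets, etc.
    ALWAYS_ACCEPT_C1 = auto()  # set once at startup, never cleared


def c1_accepted(modes: C1Mode) -> bool:
    """C1 controls are processed if either mode is set."""
    return bool(modes & (C1Mode.ACCEPT_C1 | C1Mode.ALWAYS_ACCEPT_C1))


def apply_ris(modes: C1Mode) -> C1Mode:
    """Model of a hard reset: clears AcceptC1 but leaves AlwaysAcceptC1 alone."""
    return modes & ~C1Mode.ACCEPT_C1


# A parser that starts with AlwaysAcceptC1 keeps accepting C1 after a reset.
terminal_modes = apply_ris(C1Mode.ACCEPT_C1 | C1Mode.ALWAYS_ACCEPT_C1)
# A parser that only has AcceptC1 stops accepting C1 after a reset.
conhost_modes = apply_ris(C1Mode.ACCEPT_C1)
```

Keeping the second flag separate, rather than re-enabling `AcceptC1` after every reset, means no reset path has to know about the Terminal's special case.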
There are some code pages with "unmapped" code points in the C1 range,
which results in them being translated into Unicode C1 control codes,
even though that is not their intended use. To avoid having these
characters triggering unintentional escape sequences, this PR now
disables C1 controls by default.
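
To make the problem concrete, here is a small sketch of how a stray byte can turn into a control code. It uses Python's `latin-1` codec, which maps every byte to the code point with the same value, as a stand-in for the pass-through translation described above. The sample bytes and the `contains_c1` helper are made up for illustration.

```python
# Bytes as an application might emit them: 0x9B happens to sit in the C1 range.
raw = b"caf\x9b31m"

# latin-1 maps byte N to code point U+00NN, so 0x9B becomes U+009B.
text = raw.decode("latin-1")

# U+009B is the 8-bit form of CSI (equivalent to ESC [ in 7-bit form).
CSI_C1 = "\u009b"


def contains_c1(s: str) -> bool:
    """Return True if any code point falls in the C1 range U+0080..U+009F."""
    return any(0x80 <= ord(ch) <= 0x9F for ch in s)
```

A parser that accepts C1 controls would read the tail of `text` as CSI `31m`, an SGR sequence selecting a red foreground, even though the application only meant to print a character.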
Switching to ISO-2022 encoding will re-enable them, though, since that
is the most likely scenario in which they would be required. They can
also be explicitly enabled, even in UTF-8 mode, with the `DECAC1` escape
sequence.
What I've done is add a new mode to the `StateMachine` class that
controls whether C1 code points are interpreted as control characters or
not. When disabled, these code points are simply dropped from the
output, similar to the way a `NUL` is interpreted.

This isn't exactly the way they were handled in the v1 console (which I
think replaces them with the font notdef glyph), but it matches the
XTerm behavior, which seems more appropriate considering this is in VT
mode. And it's worth noting that Windows Explorer seems to work the same
way.
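
The observable effect of the disabled mode can be sketched as a simple filter. This is a model of the behavior only, not the `StateMachine` implementation; the `filter_c1` function and its `accept_c1` parameter are hypothetical names.

```python
def filter_c1(text: str, accept_c1: bool) -> str:
    """Model of the C1 parser mode's effect on output.

    When accept_c1 is False, code points in U+0080..U+009F are dropped,
    much as a NUL would be. When it is True, the text is returned
    unchanged, and a real parser would go on to interpret those code
    points as controls.
    """
    if accept_c1:
        return text
    return "".join(ch for ch in text if not (0x80 <= ord(ch) <= 0x9F))
```

For example, `filter_c1("a\u009bb", accept_c1=False)` leaves only the two printable letters, with no placeholder glyph in between.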
As mentioned above, the mode can be enabled by designating the ISO-2022
coding system with a `DOCS` sequence, and it will be disabled again when
UTF-8 is designated. You can also enable it explicitly with a `DECAC1`
sequence (originally this was actually a DEC printer sequence, but it
doesn't seem unreasonable to use it in a terminal).
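
As a rough illustration of what an application would send, the sketch below builds these sequences as strings. The byte values are not given in this description; they are taken from the usual DEC and ISO 2022 definitions (`ESC % @` and `ESC % G` for `DOCS`, `ESC SP 7` for `DECAC1`), so treat them as an assumption to check against the terminal's documentation. The `c1_demo` helper is hypothetical.

```python
ESC = "\x1b"

# DOCS (Designate Other Coding System), per ISO 2022.
DOCS_ISO2022 = ESC + "%@"  # select ISO-2022 encoding
DOCS_UTF8 = ESC + "%G"     # select UTF-8 encoding

# DECAC1 (Accept C1 Controls): ESC, space, "7".
DECAC1 = ESC + " 7"

# 8-bit CSI, which only works once C1 controls are accepted.
CSI_C1 = "\u009b"


def c1_demo() -> str:
    """Build a string that enables C1 acceptance and then uses an 8-bit CSI."""
    return DECAC1 + CSI_C1 + "31m" + "red" + CSI_C1 + "0m"
```

Writing `c1_demo()` to a UTF-8 console stream should, under these assumptions, print "red" with a red foreground; without the leading `DECAC1`, the C1 code points would be dropped and the parameters would appear as literal text.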
I've also extended the operations that save and restore "cursor state"
(e.g. `DECSC` and `DECRC`) to include the state of the C1 parser mode,
since it's closely tied to the code page and character sets which are
also saved there. Similarly, when a `DECSTR` sequence resets the code
page and character sets, I've now made it reset the C1 mode as well.
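
The relationship between these operations can be sketched with a toy model. Everything here is hypothetical (the `CursorState` fields, the `ToyTerminal` class, and the choice of UTF-8 with C1 disabled as the reset state); it only illustrates that the C1 mode travels with the saved state and is cleared by a soft reset.

```python
from dataclasses import dataclass, replace


@dataclass
class CursorState:
    """Hypothetical subset of the state saved by DECSC."""
    row: int = 0
    col: int = 0
    utf8: bool = True        # stands in for the code page and character sets
    accept_c1: bool = False  # the C1 parser mode, saved alongside them


class ToyTerminal:
    def __init__(self) -> None:
        self.state = CursorState()
        self._saved = CursorState()

    def decsc(self) -> None:
        """Save cursor state, including the C1 mode."""
        self._saved = replace(self.state)

    def decrc(self) -> None:
        """Restore cursor state, including the C1 mode."""
        self.state = replace(self._saved)

    def decstr(self) -> None:
        """Soft reset: code page back to the assumed default, C1 mode off."""
        self.state.utf8 = True
        self.state.accept_c1 = False
```

Copying the dataclass with `replace` keeps the saved snapshot independent of later changes to the live state.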
I should note that the new `StateMachine` mode is controlled via a
generic `SetParserMode` method (with a matching API in the `ConGetSet`
interface) to allow for easier addition of other modes in the future.
And I've reimplemented the existing ANSI/VT52 mode in terms of these
generic methods instead of it having to have its own separate APIs.
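
A generic mode setter of this kind might look roughly like the following. This is a Python sketch of the design, not the actual C++ class: the `ParserMode` flag names, the `ToyStateMachine` class, and the `get_parser_mode` query are assumptions made for illustration.

```python
from enum import Flag, auto


class ParserMode(Flag):
    """Hypothetical set of parser modes; more can be added as new members."""
    ANSI = auto()       # cleared means VT52 mode
    ACCEPT_C1 = auto()


class ToyStateMachine:
    def __init__(self) -> None:
        # Assumed defaults: ANSI mode on, C1 controls not accepted.
        self._modes = ParserMode.ANSI

    def set_parser_mode(self, mode: ParserMode, enabled: bool) -> None:
        """Single entry point for toggling any parser mode."""
        if enabled:
            self._modes |= mode
        else:
            self._modes &= ~mode

    def get_parser_mode(self, mode: ParserMode) -> bool:
        """Return True if the given mode is currently set."""
        return bool(self._modes & mode)
```

With this shape, switching to VT52 becomes `set_parser_mode(ParserMode.ANSI, False)` rather than a dedicated method, and a future mode only needs a new enum member.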
## Validation Steps Performed

Some of the unit tests for OSC sequences were using a C1 `0x9C` for the
string terminator, which doesn't work by default anymore. Since that's
not a good practice anyway, I thought it best to change those to a
standard 7-bit terminator. However, in tests that were explicitly
validating the C1 controls, I've just enabled the C1 parser mode at the
start of the tests in order to get them working again.
There were also some ANSI mode adapter tests that had to be updated to
account for the fact that it has now been reimplemented in terms of the
`SetParserMode` API.

I've added a new state machine test to validate the changes in behavior
when the C1 parser mode is enabled or disabled. And I've added an
adapter test to verify that the `DesignateCodingSystems` and
`AcceptC1Controls` methods toggle the C1 parser mode as expected.

I've manually verified the test cases in #10069 and #10310 to confirm
that they're no longer triggering control sequences by default.
Although, as I explained above, the C1 code points are completely
dropped from the output rather than displayed as notdef glyphs. I
think this is a reasonable compromise though.
Closes #10069
Closes #10310