Skip to content

Incorrect underline when format strings contain escaped characters #55155

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
martin-t opened this issue Oct 17, 2018 · 5 comments · Fixed by #57140
Closed

Incorrect underline when format strings contain escaped characters #55155

martin-t opened this issue Oct 17, 2018 · 5 comments · Fixed by #57140
Labels
A-diagnostics Area: Messages for errors, warnings, and lints C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@martin-t
Copy link

Playground

Output:

error: 1 positional argument in format string, but no arguments were given
 --> src/main.rs:2:16
  |
2 |     println!("\t{}");
  |                ^^

It seems the underline offset doesn't take into account that some characters like "\n\t\r" are wider in the source code.

@XAMPPRocky XAMPPRocky added A-diagnostics Area: Messages for errors, warnings, and lints T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels Oct 17, 2018
@fhartwig
Copy link
Contributor

I'm working on a fix for this, but it might take some time because compiling rustc on my laptop takes ages.

@fhartwig
Copy link
Contributor

Sorry, it looks like I jumped the gun, I actually got stuck on this. The problem is that the code computing the spans only has access to the string after processing the escape sequences (which happens in the lexer) and we cannot get the correct offsets from that (we can't just map e.g. tabs to "\t" because the original string might have used a "\x"-style escape sequence instead. Or, for that matter, any regular character could have been expressed as an escape sequence, which would mess up the offsets).
For the offsets to be correct, the format string parser would have to operate on the unprocessed string, but I don't know how to access that and I'm not sure that would really work anyway.

@estebank It looks to me like you introduced this functionality, do you have any ideas how this could be fixed?

@estebank
Copy link
Contributor

@fhartwig I believe we could pass around both the original and cooked strings around, and perform the span offset logic on the original, but I haven't looked at the code to see how hard that would be to do. I think that's the least hacky option.

@fhartwig
Copy link
Contributor

Yes, that seems to be the most plausible approach. One issue that I see with that is that the format characters themselves can be represented as escape sequences (i.e. println!("\x7B\x7D", 1); is perfectly valid rust). I get the impression that the only "proper" fix for this would be to have the lexer return the necessary information to map all characters in the processed string to their input equivalents, but that seems pretty excessive.

@estebank
Copy link
Contributor

@fhartwig indeed. I guess an option would be to have a fast-path parse where all of the current string parsing is done, and re-parse it with extra annotations if we encounter a problem. Having the diagnostics path be slower than the happy path is perfectly ok from my point of view.

estebank added a commit to estebank/rust that referenced this issue Dec 27, 2018
When a format string has escaped whitespace characters format
arguments were shifted by one per each escaped character. Account
for these escaped characters when synthesizing the spans.

Fix rust-lang#55155.
bors added a commit that referenced this issue Dec 29, 2018
Tweaks to format string diagnostics

Add label spans and fix incorrect spans.

Fix #55155, fix #55350.
JohnHeitmann pushed a commit to JohnHeitmann/rust that referenced this issue Jan 5, 2019
When a format string has escaped whitespace characters format
arguments were shifted by one per each escaped character. Account
for these escaped characters when synthesizing the spans.

Fix rust-lang#55155.
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants