[flake8-implicit-str-concat] Normalize octals before merging concatenated strings in single-line-implicit-string-concatenation (ISC001) #13118
This is very clever. Good job! Two small points:
- The .to_string() call on line 175 will clone the underlying data, which could be quite costly -- and is unnecessary in the happy path, since most strings don't have octal escapes in them. We can avoid this by using a Cow -- something like this? (The diff is relative to your branch)
--- a/crates/ruff_linter/src/rules/flake8_implicit_str_concat/rules/implicit.rs
+++ b/crates/ruff_linter/src/rules/flake8_implicit_str_concat/rules/implicit.rs
@@ -1,3 +1,5 @@
+use std::borrow::Cow;
+
use itertools::Itertools;
use ruff_diagnostics::{Diagnostic, Edit, Fix, FixAvailability, Violation};
@@ -173,11 +175,11 @@ fn concatenate_strings(a_range: TextRange, b_range: TextRange, locator: &Locator
}
let mut a_body =
- a_text[a_leading_quote.len()..a_text.len() - a_trailing_quote.len()].to_string();
+ Cow::Borrowed(&a_text[a_leading_quote.len()..a_text.len() - a_trailing_quote.len()]);
let b_body = &b_text[b_leading_quote.len()..b_text.len() - b_trailing_quote.len()];
if a_leading_quote.find(['r', 'R']).is_none() {
- a_body = normalize_ending_octal(&a_body);
+ normalize_ending_octal(&mut a_body);
}
let concatenation = format!("{a_leading_quote}{a_body}{b_body}{a_trailing_quote}");
@@ -191,10 +193,10 @@ fn concatenate_strings(a_range: TextRange, b_range: TextRange, locator: &Locator
/// Pads an octal at the end of the string
/// to three digits, if necessary.
-fn normalize_ending_octal(text: &str) -> String {
+fn normalize_ending_octal(text: &mut Cow<'_, str>) {
// Early return for short strings
if text.len() < 2 {
- return text.to_string();
+ return;
}
let mut rev_bytes = text.bytes().rev();
@@ -202,20 +204,19 @@ fn normalize_ending_octal(text: &str) -> String {
// "\y" -> "\00y"
if has_odd_consecutive_backslashes(&rev_bytes) {
let prefix = &text[..text.len() - 2];
- return format!("{prefix}\\00{}", last_byte as char);
+ *text = Cow::Owned(format!("{prefix}\\00{}", last_byte as char));
}
// "\xy" -> "\0xy"
- if let Some(penultimate_byte @ b'0'..=b'7') = rev_bytes.next() {
+ else if let Some(penultimate_byte @ b'0'..=b'7') = rev_bytes.next() {
if has_odd_consecutive_backslashes(&rev_bytes) {
let prefix = &text[..text.len() - 3];
- return format!(
+ *text = Cow::Owned(format!(
"{prefix}\\0{}{}",
penultimate_byte as char, last_byte as char
- );
+ ));
}
}
}
- text.to_string()
}
- I wonder if it's necessary to normalize the ending octal in the first string if the second string doesn't start with a digit. E.g. if I understand correctly, something like "\12" "foo" will be fixed as "\012foo" according to the logic in your PR -- but I think it's safe to not apply the normalization logic in this case, and instead fix it as \12foo? What do you think?
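The Cow idea from the first point can be sketched in isolation. This is a simplified restatement, not the code from the PR: pad_trailing_octal is a hypothetical name, it returns a Cow instead of mutating one in place, and it assumes the body comes from a non-raw string literal.

```rust
use std::borrow::Cow;

/// Pads a trailing one- or two-digit octal escape to three digits.
/// Hypothetical helper for illustration; it borrows the input unless
/// padding is actually required.
fn pad_trailing_octal(body: &str) -> Cow<'_, str> {
    let bytes = body.as_bytes();

    // Count the octal digits at the very end of the body.
    let digits = bytes
        .iter()
        .rev()
        .take_while(|b| (b'0'..=b'7').contains(*b))
        .count();

    // No digits, or three or more: nothing to pad.
    if digits == 0 || digits > 2 {
        return Cow::Borrowed(body);
    }

    // The digits only form an escape if an odd number of backslashes
    // precedes them (an even number means the backslashes are escaped).
    let split = bytes.len() - digits;
    let backslashes = bytes[..split]
        .iter()
        .rev()
        .take_while(|b| **b == b'\\')
        .count();
    if backslashes % 2 == 0 {
        return Cow::Borrowed(body);
    }

    // Only this path allocates a new String.
    let (head, octal) = body.split_at(split);
    let padding = if digits == 1 { "00" } else { "0" };
    Cow::Owned(format!("{head}{padding}{octal}"))
}
```

With this shape, a body such as foo comes back as Cow::Borrowed with no copy, while a body ending in \12 comes back as Cow::Owned holding \012.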
Re putting the The borrow checker complained when I tried to implement the fix you suggested (assuming I did it correctly), because I borrow
let mut rev_bytes = text.bytes().rev();
and then try to modify it with

More fundamentally, I'm hoping you could help with a Rust confusion here. I would've thought that the
let concatenation = format!("{a_leading_quote}{a_body}{b_body}{a_trailing_quote}");

Maybe a more minimal version of my question is: Do you gain much in terms of memory/performance by doing
let a = &"xyz";
let s = format!("{a}");
// then a drops out of scope
vs
let a = "xyz".to_string();
let s = format!("{a}");
// then a drops out of scope
? Or does the compiler end up making those roughly the same? (Obviously the first code is better in this contrived example, but the question remains.)

Sorry for the long-ish question, and thanks for the very helpful review!
Ummmm... I'm not entirely sure exactly what optimisations are permitted here! You may well be right that the compiler is smart enough to "see through" the allocation and optimise it away -- but I'm not sure if that's a permitted optimisation, or if it's one the compiler's smart enough to make. I could go and try to investigate exactly whether it does make this optimisation or not -- but I'm not sure the exact optimisations the compiler is likely to make here are things we should be relying on anyway ;) So I think it's better to use a |
I pushed the change I was suggesting to your PR branch in 25805a2 :-) the key to avoiding the borrow-checker complaints is to have |
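For reference, the two variants discussed above can be put side by side in a small sketch. The function names are hypothetical and this says nothing about what the optimiser does; it only shows that both forms produce the same result through format!, and that the Cow form does not request an allocation for the first body in the first place.

```rust
use std::borrow::Cow;

/// Copies the first body into a fresh String before formatting.
fn concat_with_owned(a_text: &str, b_body: &str) -> String {
    let a_body: String = a_text.to_string(); // explicit heap allocation and copy
    format!("{a_body}{b_body}")
}

/// Borrows the first body; Cow<str> implements Display, so format! works unchanged.
fn concat_with_cow(a_text: &str, b_body: &str) -> String {
    let a_body: Cow<'_, str> = Cow::Borrowed(a_text); // no allocation here
    format!("{a_body}{b_body}")
}
```

Both functions return the same String; the difference is only whether the intermediate copy is requested at all.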
Thanks again!
Makes sense, and thank you that was very helpful! |
This PR pads the last octal (if there is one) to three digits before concatenating strings in the fix for ISC001.
For example: "\12""0" is fixed to "\0120".
Closes #12936.
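To see why the padding matters, here is a rough sketch of how a Python-style octal escape is read: a backslash followed by up to three octal digits. The decoder below is a hypothetical, deliberately simplified illustration (it only understands octal escapes and escaped backslashes) and is not part of Ruff.

```rust
/// Decodes octal escapes (`\` plus one to three octal digits) and `\\`.
/// Simplified illustration only; every other escape is left untouched.
fn decode_octal_escapes(s: &str) -> String {
    let mut out = String::new();
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // An escaped backslash decodes to a single backslash.
        if chars.peek() == Some(&'\\') {
            chars.next();
            out.push('\\');
            continue;
        }
        // Greedily read up to three octal digits.
        let mut value = 0u32;
        let mut digits = 0;
        while digits < 3 {
            match chars.peek().and_then(|d| d.to_digit(8)) {
                Some(d) => {
                    value = value * 8 + d;
                    digits += 1;
                    chars.next();
                }
                None => break,
            }
        }
        if digits == 0 {
            out.push('\\'); // not an octal escape; keep the backslash
        } else {
            out.push(char::from_u32(value).unwrap_or('\u{FFFD}'));
        }
    }
    out
}
```

Decoded separately, \12 is a newline and the second string contributes a literal 0. Merged naively as \120, the three digits are read as one escape (the letter P), whereas the padded form \0120 still decodes to a newline followed by 0.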