-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Avoid frequent string allocations in Decoder
#37022
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Conversation
r? @sfackler (rust_highfive has picked a reviewer for you, use r? to override) |
I think it would be easier and more practical to have |
@@ -2170,7 +2168,7 @@ impl ::Decoder for Decoder { | |||
} | |||
|
|||
fn read_char(&mut self) -> DecodeResult<char> { | |||
let s = self.read_str()?; | |||
let s = self.read_str()?; // njn: use read_str_slice() instead? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
json::Decoder::read_str()
just unwraps an already-allocated Json::String(String)
.
Storing the pub struct Decoder {
stack: Vec<Json>,
str: Option<String>,
}
impl Decoder {
fn read_str_slice(&mut self) -> DecodeResult<&str> {
match self.pop() {
Json::String(v) => {
self.str = Some(v);
Ok(self.str.as_ref().unwrap())
}
// ...
}
}
} |
@sinkuu: thank you for the suggestion! I had already tried something similar, except that I used a I confess I don't understand how your |
I believe that while you have the (returned) reference to the Option's contents, you cannot call read_str_slice again, since that would invalidate the reference. I might be wrong though... |
Huh, ok. I had been thinking I would have to add a reference to the original string being read to |
3be154d
to
a9494f5
Compare
I updated the code to use @sinkuu's suggestion, and I also removed Here are the rustc-benchmarks. It's a 0.5%--1.5% speed-up on many of them. stage1 compiler doing debug builds:
stage2 compiler doing debug builds
|
cc @eddyb |
iMO, this should use |
`opaque::Decoder::read_str` is very hot within `rustc` due to its use in the reading of crate metadata, and it currently returns a `String`. This commit changes it to instead return a `Cow<str>`, which avoids a heap allocation. This change reduces the number of calls to `malloc` by almost 10% in some benchmarks. This is a [breaking-change] to libserialize.
a9494f5
to
8988a11
Compare
r? @eddyb -- I updated to return |
expect!(self.pop(), String) | ||
fn read_str(&mut self) -> DecodeResult<Cow<str>> { | ||
match self.pop() { | ||
Json::String(v) => Ok(Cow::Owned(v)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you do expect!(self.pop(), String).map(Cow::Owned)
?
@@ -566,7 +566,7 @@ impl PartialEq<InternedString> for str { | |||
|
|||
impl Decodable for InternedString { | |||
fn decode<D: Decoder>(d: &mut D) -> Result<InternedString, D::Error> { | |||
Ok(intern(d.read_str()?.as_ref()).as_str()) | |||
Ok(intern(d.read_str()?.deref()).as_str()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A &
before d
is better IMO.
The single biggest cause of allocations in the compiler is this function:
This is called a lot while reading crate metadata, and for some benchmarks it's
the cause of ~10% of all calls to malloc(). And it's not really necessary --
d.read_str()
creates a string which we immediately convert to a&str
withas_ref
.So I've tried to add a
read_str_slice
method toDecoder
, and I have it halfworking. There are two implementations of
Decoder
.opaque::Decoder::read_str_slice
is working. It's just a simplemodification of
read_str
. I think it works becauseopaque::Decoder
has adata: &'a [u8]
field, and the string slice returned byread_str_slice
refers to that.
json::Decoder::read_str_slice
isn't working (and it's commented out in thecommit).
json::Decoder
doesn't hold onto thestr
from which it reads. Itried doing that, and a couple of other things, but I'm having trouble getting
past the borrow checker.
It seems like this should be doable, and in fact we should be able to replace
read_str
withread_str_slice
everywhere. (Moreover, the code as writtenactually works, because the
panic!
injson::Decoder::read_str_slice
isnever hit, at least not when just running the compiler, and it does give a
slight speed-up.)
Any suggestions on how to get
json::Decoder::read_str_slice
working?(And apologies for my lack of lifetime understanding.)