Skip to content

Avoid frequent string allocations in Decoder #37022

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
wants to merge 1 commit into from

Conversation

nnethercote
Copy link
Contributor

@nnethercote nnethercote commented Oct 7, 2016

The single biggest cause of allocations in the compiler is this function:

impl Decodable for InternedString {
    fn decode<D: Decoder>(d: &mut D) -> Result<InternedString, D::Error> {
        Ok(intern(d.read_str()?.as_ref()).as_str())
    }
}

This is called a lot while reading crate metadata, and for some benchmarks it's
the cause of ~10% of all calls to malloc(). And it's not really necessary --
d.read_str() creates a string which we immediately convert to a &str with
as_ref.

So I've tried to add a read_str_slice method to Decoder, and I have it half
working. There are two implementations of Decoder.

  • opaque::Decoder::read_str_slice is working. It's just a simple
    modification of read_str. I think it works because opaque::Decoder has a
    data: &'a [u8] field, and the string slice returned by read_str_slice
    refers to that.
  • json::Decoder::read_str_slice isn't working (and it's commented out in the
    commit). json::Decoder doesn't hold onto the str from which it reads. I
    tried doing that, and a couple of other things, but I'm having trouble getting
    past the borrow checker.

It seems like this should be doable, and in fact we should be able to replace
read_str with read_str_slice everywhere. (Moreover, the code as written
actually works, because the panic! in json::Decoder::read_str_slice is
never hit, at least not when just running the compiler, and it does give a
slight speed-up.)

Any suggestions on how to get json::Decoder::read_str_slice working?
(And apologies for my lack of lifetime understanding.)

@rust-highfive
Copy link
Contributor

r? @sfackler

(rust_highfive has picked a reviewer for you, use r? to override)

@Mark-Simulacrum
Copy link
Member

I think it would be easier and more practical to have unimplemented!() as the definition for json::Decoder::read_str_slice especially since rustc itself doesn't use it.

@@ -2170,7 +2168,7 @@ impl ::Decoder for Decoder {
}

fn read_char(&mut self) -> DecodeResult<char> {
let s = self.read_str()?;
let s = self.read_str()?; // njn: use read_str_slice() instead?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

json::Decoder::read_str() just unwraps an already-allocated Json::String(String).

@sinkuu
Copy link
Contributor

sinkuu commented Oct 8, 2016

Any suggestions on how to get json::Decoder::read_str_slice working?

Storing the String somewhere so that it won't be dropped?

pub struct Decoder {
    stack: Vec<Json>,
    str: Option<String>,
}

impl Decoder {
    fn read_str_slice(&mut self) -> DecodeResult<&str> {
        match self.pop() {
            Json::String(v) => {
                self.str = Some(v);
                Ok(self.str.as_ref().unwrap())
            }
            // ...
        }
    }
}

@nnethercote
Copy link
Contributor Author

@sinkuu: thank you for the suggestion! I had already tried something similar, except that I used a Vec<String> instead of an Option<String> and I pushed every String used in read_str_slice onto the Vec.

I confess I don't understand how your Option<String> version works. The first time read_str_slice is called, None is replaced with Some(s1), which is fine, and we return a reference to s1. But the second time read_str_slice is called, what happens? Some(s1) is replaced with Some(s2)... what happens to s1? I would expect it to be dropped, which would mean the reference returned by the previous read_str_slice call would become invalid, but that reference is still usable because its lifetime is the same as that of self. I must be misunderstanding something...

@Mark-Simulacrum
Copy link
Member

I believe that while you have the (returned) reference to the Option's contents, you cannot call read_str_slice again, since that would invalidate the reference. I might be wrong though...

@nnethercote
Copy link
Contributor Author

Huh, ok. I had been thinking I would have to add a reference to the original string being read to Json, and then change it to Json<'a>, and change String(String) within it to String<&'a str>, and something similar for Array and Object. This is a lot easier! :)

@nnethercote
Copy link
Contributor Author

I updated the code to use @sinkuu's suggestion, and I also removed read_str_slice and changed read_str to return a slice.

Here are the rustc-benchmarks. It's a 0.5%--1.5% speed-up on many of them.

stage1 compiler doing debug builds:

futures-rs-test  4.461s vs  4.405s --> 1.013x faster (variance: 1.002x, 1.011x)
helloworld       0.235s vs  0.234s --> 1.006x faster (variance: 1.009x, 1.026x)
html5ever-2016-  7.663s vs  7.609s --> 1.007x faster (variance: 1.002x, 1.005x)
hyper.0.5.0      5.203s vs  5.137s --> 1.013x faster (variance: 1.005x, 1.006x)
inflate-0.1.0    4.887s vs  4.893s --> 0.999x faster (variance: 1.011x, 1.005x)
issue-32062-equ  0.350s vs  0.345s --> 1.016x faster (variance: 1.019x, 1.015x)
issue-32278-big  1.700s vs  1.706s --> 0.996x faster (variance: 1.010x, 1.008x)
jld-day15-parse  1.558s vs  1.551s --> 1.004x faster (variance: 1.051x, 1.006x)
piston-image-0. 12.087s vs 12.027s --> 1.005x faster (variance: 1.039x, 1.003x)
regex.0.1.30     2.579s vs  2.544s --> 1.014x faster (variance: 1.016x, 1.026x)
rust-encoding-0  2.124s vs  2.106s --> 1.009x faster (variance: 1.004x, 1.015x)
syntex-0.42.2   33.014s vs 32.728s --> 1.009x faster (variance: 1.005x, 1.011x)
syntex-0.42.2-i 18.089s vs 17.937s --> 1.008x faster (variance: 1.012x, 1.012x)

stage2 compiler doing debug builds

futures-rs-test  4.018s vs  3.992s --> 1.006x faster (variance: 1.007x, 1.001x)
helloworld       0.236s vs  0.233s --> 1.015x faster (variance: 1.023x, 1.018x)
html5ever-2016-  6.487s vs  6.373s --> 1.018x faster (variance: 1.009x, 1.019x)
hyper.0.5.0      4.626s vs  4.619s --> 1.002x faster (variance: 1.007x, 1.001x)
inflate-0.1.0    4.545s vs  4.545s --> 1.000x faster (variance: 1.010x, 1.005x)
issue-32062-equ  0.335s vs  0.333s --> 1.006x faster (variance: 1.011x, 1.015x)
issue-32278-big  1.603s vs  1.599s --> 1.003x faster (variance: 1.004x, 1.008x)
jld-day15-parse  1.419s vs  1.403s --> 1.011x faster (variance: 1.004x, 1.015x)
piston-image-0. 10.845s vs 10.736s --> 1.010x faster (variance: 1.009x, 1.004x)
regex.0.1.30     2.340s vs  2.308s --> 1.014x faster (variance: 1.010x, 1.011x)
rust-encoding-0  1.961s vs  1.950s --> 1.006x faster (variance: 1.021x, 1.004x)
syntex-0.42.2   29.134s vs 28.730s --> 1.014x faster (variance: 1.006x, 1.018x)
syntex-0.42.2-i 16.587s vs 16.503s --> 1.005x faster (variance: 1.026x, 1.023x)

@arielb1
Copy link
Contributor

arielb1 commented Oct 8, 2016

cc @eddyb

@eddyb
Copy link
Member

eddyb commented Oct 8, 2016

iMO, this should use Cow<str> and the description needs to be updated, but otherwise it's great!

`opaque::Decoder::read_str` is very hot within `rustc` due to its use in
the reading of crate metadata, and it currently returns a `String`. This
commit changes it to instead return a `Cow<str>`, which avoids a heap
allocation.

This change reduces the number of calls to `malloc` by almost 10% in
some benchmarks.

This is a [breaking-change] to libserialize.
@nnethercote
Copy link
Contributor Author

r? @eddyb -- I updated to return Cow<string>. Thank you for the suggestion -- it allowed me to remove json::Decoder::str, which is nice.

@rust-highfive rust-highfive assigned eddyb and unassigned sfackler Oct 9, 2016
expect!(self.pop(), String)
fn read_str(&mut self) -> DecodeResult<Cow<str>> {
match self.pop() {
Json::String(v) => Ok(Cow::Owned(v)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you do expect!(self.pop(), String).map(Cow::Owned)?

@@ -566,7 +566,7 @@ impl PartialEq<InternedString> for str {

impl Decodable for InternedString {
fn decode<D: Decoder>(d: &mut D) -> Result<InternedString, D::Error> {
Ok(intern(d.read_str()?.as_ref()).as_str())
Ok(intern(d.read_str()?.deref()).as_str())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A & before d is better IMO.

@nnethercote
Copy link
Contributor Author

@eddyb: I totally borked my git repo so I had to do the follow-ups in a new PR: #37064. Sorry!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants