
Perf regression in Vec<u8> Write impl? #24095


Closed
frankmcsherry opened this issue Apr 5, 2015 · 3 comments · Fixed by #24120

Comments

@frankmcsherry
Contributor

Hi folks. A bunch of my serialization code recently got a lot slower, and I think I've tracked it down to writing binary data into a Vec<u8>. This used to be quite fast, well above the 8-10GB/s range. It is now about 1GB/s for me. The following main.rs demos the performance I'm seeing.

#![feature(test)]
extern crate test;

use test::Bencher;
use std::io::Write;

#[bench]
fn bench(bencher: &mut Bencher) {
    let data = &[0u8; 4096];
    let mut buffer = Vec::with_capacity(data.len());
    bencher.bytes = data.len() as u64;
    bencher.iter(|| {
        buffer.clear();
        buffer.write_all(data).unwrap();
    });
}

fn main() {
    let data = &[0u8; 4096];
    let mut buffer = Vec::with_capacity(data.len());

    // writes 4GB, takes .. 4s+
    for _ in 0..(1 << 20) {
        buffer.clear();
        buffer.write_all(data).unwrap();
    }
}

The perf numbers look like this (where main writes 4GB in 4KB chunks):

Echidnatron% cargo bench; time cargo run --release 
     Running target/release/bench-5e9b1b37cda85a22

running 1 test
test bench ... bench:      4086 ns/iter (+/- 596) = 1002 MB/s

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured

     Running `target/release/bench`
cargo run --release  4.06s user 0.03s system 99% cpu 4.094 total
Echidnatron% 

Sorry if this is old news, or a misdiagnosis. Something is a bit slower now, though. The comments in the source (Vec<T>::push_all) do suggest it isn't stable yet because the impl might get faster; I didn't expect it to get 10x slower on me though :).

Echidnatron% cargo --version
cargo 0.0.1-pre-nightly (d71f748 2015-04-03) (built 2015-04-04)
@alexcrichton
Member

I think this has the same cause as #24014, at least per the comment there about the performance of the Iterator impl for ops::Range.

Nominating for 1.0 as this is quite a critical iterator to have perform well.

triage: I-nominated

@aturon
Member

aturon commented Apr 6, 2015

@alexcrichton I will try to fix this today.

aturon added a commit to aturon/rust that referenced this issue Apr 6, 2015
A recent change to the implementation of range iterators meant that,
even when stepping by 1, the iterators *always* involved checked
arithmetic.

This commit reverts to the earlier behavior (while retaining the
refactoring into traits).

Fixes rust-lang#24095
cc rust-lang#24014
aturon added a commit to aturon/rust that referenced this issue Apr 7, 2015, with the same commit message as above.
bors added a commit that referenced this issue Apr 8, 2015
A recent change to the implementation of range iterators meant that,
even when stepping by 1, the iterators *always* involved checked
arithmetic.

This commit reverts to the earlier behavior (while retaining the
refactoring into traits).

Fixes #24095
Closes #24119
cc #24014 

r? @alexcrichton
@frankmcsherry
Contributor Author

Can confirm with the new nightly that the #[bench] is reporting 50GB/s. <3
