Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Change single char str patterns to chars #52646

Merged
merged 1 commit into from
Jul 24, 2018

Conversation

ljedrz
Copy link
Contributor

@ljedrz ljedrz commented Jul 23, 2018

A char is faster.

@rust-highfive
Copy link
Contributor

r? @michaelwoerister

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 23, 2018
@petrochenkov
Copy link
Contributor

A char is faster

I wouldn't be so sure, see #41993
str is "search bytes in bytes", while char is "convert char to bytes, then search bytes in bytes".

@ljedrz
Copy link
Contributor Author

ljedrz commented Jul 23, 2018

Oh, ok; in that case I guess this might call for an update in clippy, as its single_char_pattern lint suggests that it's the other way round.

cc @oli-obk

@ljedrz
Copy link
Contributor Author

ljedrz commented Jul 23, 2018

@petrochenkov I just did a comparison using godbolt and the char variant produces half the str assembly; perhaps that issue is limited to starts_with? I'm not an expert, but this is confusing.

@killercup
Copy link
Member

A char is faster.

I don't see any benchmarks? :)

@ljedrz
Copy link
Contributor Author

ljedrz commented Jul 23, 2018

@killercup see the related clippy lint.

@kennytm
Copy link
Member

kennytm commented Jul 23, 2018

@petrochenkov #41993 only concerns about .starts_with(). The split() algorithm for char uses memchr() for a char and is way more efficient than that for a &str (which degenerates to naive search).

@killercup
Copy link
Member

killercup commented Jul 23, 2018

@ljedrz a piece of documentation is still no benchmark :)

Seriously, if we don't have one, we should totally write one and add it. This is easily decidable once we have the facts.

Edit: Maybe we can use the ones @kennytm wrote last year:
#41993 (comment). The most significant change to land since then was probably #46735. And if nothing else, we can link the results in the clippy docs.

Edit 2: We are currently linking to libc::memchr on most platforms. Adding a SIMD-aware Rust version might allow for better inlining. (But again, without benchmarks this is just speculation.)

@kennytm
Copy link
Member

kennytm commented Jul 23, 2018

Since I just happened to have benchmarks from rust-lang/rfcs#2500, here's the relevant parts comparing .starts_with() and .split():

.starts_with()
test pat3_starts_with_ascii_char::long_lorem_ipsum      ... bench:         174 ns/iter (+/- 12)
test pat3_starts_with_ascii_char::short_ascii           ... bench:         175 ns/iter (+/- 21)
test pat3_starts_with_ascii_char::short_mixed           ... bench:         174 ns/iter (+/- 9)
test pat3_starts_with_ascii_char::short_pile_of_poo     ... bench:         175 ns/iter (+/- 26)
test pat3_starts_with_ascii_string::long_lorem_ipsum    ... bench:         136 ns/iter (+/- 20)
test pat3_starts_with_ascii_string::short_ascii         ... bench:         138 ns/iter (+/- 26)
test pat3_starts_with_ascii_string::short_mixed         ... bench:         137 ns/iter (+/- 13)
test pat3_starts_with_ascii_string::short_pile_of_poo   ... bench:         136 ns/iter (+/- 15)
test std_starts_with_ascii_char::long_lorem_ipsum       ... bench:         339 ns/iter (+/- 29)
test std_starts_with_ascii_char::short_ascii            ... bench:         338 ns/iter (+/- 21)
test std_starts_with_ascii_char::short_mixed            ... bench:         844 ns/iter (+/- 42)
test std_starts_with_ascii_char::short_pile_of_poo      ... bench:         930 ns/iter (+/- 91)
test std_starts_with_ascii_string::long_lorem_ipsum     ... bench:         306 ns/iter (+/- 41)
test std_starts_with_ascii_string::short_ascii          ... bench:         307 ns/iter (+/- 63)
test std_starts_with_ascii_string::short_mixed          ... bench:         252 ns/iter (+/- 122)
test std_starts_with_ascii_string::short_pile_of_poo    ... bench:         222 ns/iter (+/- 193)

Speed ups:

Method Long ASCII Short ASCII Short Mixed Short Unicode
.starts_with('/') from libstd (baseline) 0% 0% 0% 0%
.starts_with("/") from libstd -10% -9% -70% -76%
.starts_with('/') from RFC 2500 -49% -48% -79% -81%
.starts_with("/") from RFC 2500 -60% -59% -84% -85%
.split()
test pat3_split_a_char::long_lorem_ipsum                ... bench:       3,084 ns/iter (+/- 517)
test pat3_split_a_char::short_ascii                     ... bench:         158 ns/iter (+/- 15)
test pat3_split_a_char::short_mixed                     ... bench:         102 ns/iter (+/- 27)
test pat3_split_a_char::short_pile_of_poo               ... bench:          13 ns/iter (+/- 0)
test pat3_split_a_str::long_lorem_ipsum                 ... bench:       5,592 ns/iter (+/- 692)
test pat3_split_a_str::short_ascii                      ... bench:         180 ns/iter (+/- 23)
test pat3_split_a_str::short_mixed                      ... bench:         160 ns/iter (+/- 17)
test pat3_split_a_str::short_pile_of_poo                ... bench:         109 ns/iter (+/- 9)
test std_split_a_char::long_lorem_ipsum                 ... bench:       3,003 ns/iter (+/- 231)
test std_split_a_char::short_ascii                      ... bench:         159 ns/iter (+/- 11)
test std_split_a_char::short_mixed                      ... bench:         108 ns/iter (+/- 2)
test std_split_a_char::short_pile_of_poo                ... bench:          23 ns/iter (+/- 0)
test std_split_a_str::long_lorem_ipsum                  ... bench:       6,176 ns/iter (+/- 693)
test std_split_a_str::short_ascii                       ... bench:         194 ns/iter (+/- 7)
test std_split_a_str::short_mixed                       ... bench:         170 ns/iter (+/- 17)
test std_split_a_str::short_pile_of_poo                 ... bench:         109 ns/iter (+/- 16)

Speed ups:

Method Long ASCII Short ASCII Short Mixed Short Unicode
.split('a').count() from libstd (baseline) 0% 0% 0% 0%
.split("a").count() from libstd +106% +22% +57% +374%
.split('a').count() from RFC 2500 +3% -1% -6% -43%
.split("a").count() from RFC 2500 +86% +13% +48% +374%

While .starts_with('/') is much slower than .starts_with("/"), we see that the situation is reversed that .split('a').count() is much faster than .split("a").count(). This is because we did not optimize for single-byte string searching at all, but we do optimize for ASCII char searching.

@michaelwoerister
Copy link
Member

Thanks for the PR, @ljedrz!

@kennytm's numbers suggest that this should be a win (although none of the modified code is really on the critical path).

@bors r+

@bors
Copy link
Collaborator

bors commented Jul 24, 2018

📌 Commit 49c8ba9 has been approved by michaelwoerister

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 24, 2018
@bors
Copy link
Collaborator

bors commented Jul 24, 2018

⌛ Testing commit 49c8ba9 with merge a2af866...

bors added a commit that referenced this pull request Jul 24, 2018
Change single char str patterns to chars

A `char` is faster.
@bors
Copy link
Collaborator

bors commented Jul 24, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: michaelwoerister
Pushing a2af866 to master...

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants