Ipv4Addr cmp() is slow #33885

Closed
polachok opened this issue May 26, 2016 · 1 comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@polachok
Contributor

I'm using Ipv4Addrs as keys in a BTreeMap. I have to query the map ~15 million times per second (10G Ethernet line rate).

Ipv4Addr cmp() compiles to this on my system (Linux x86-64, rustc 1.10.0-nightly (476fe6e 2016-05-21)):

0000000000051f10 <_ZN52_$LT$net..ip..Ipv4Addr$u20$as$u20$core..cmp..Ord$GT$3cmp17h39591ec7a18c4b02E>:
   51f10:       50                      push   %rax
   51f11:       c7 44 24 04 1d 1d 1d    movl   $0x1d1d1d1d,0x4(%rsp)
   51f18:       1d 
   51f19:       c7 04 24 1d 1d 1d 1d    movl   $0x1d1d1d1d,(%rsp)
   51f20:       8b 07                   mov    (%rdi),%eax
   51f22:       89 c1                   mov    %eax,%ecx
   51f24:       88 44 24 04             mov    %al,0x4(%rsp)
   51f28:       88 64 24 05             mov    %ah,0x5(%rsp)
   51f2c:       c1 e8 10                shr    $0x10,%eax
   51f2f:       c1 e9 18                shr    $0x18,%ecx
   51f32:       88 44 24 06             mov    %al,0x6(%rsp)
   51f36:       88 4c 24 07             mov    %cl,0x7(%rsp)
   51f3a:       8b 06                   mov    (%rsi),%eax
   51f3c:       89 c1                   mov    %eax,%ecx
   51f3e:       88 04 24                mov    %al,(%rsp)
   51f41:       88 64 24 01             mov    %ah,0x1(%rsp)
   51f45:       c1 e8 10                shr    $0x10,%eax
   51f48:       c1 e9 18                shr    $0x18,%ecx
   51f4b:       88 44 24 02             mov    %al,0x2(%rsp)
   51f4f:       88 4c 24 03             mov    %cl,0x3(%rsp)
   51f53:       48 8d 7c 24 04          lea    0x4(%rsp),%rdi
   51f58:       48 8d 34 24             lea    (%rsp),%rsi
   51f5c:       ba 04 00 00 00          mov    $0x4,%edx
   51f61:       e8 0a 17 fc ff          callq  13670 <memcmp@plt>
   51f66:       89 c1                   mov    %eax,%ecx
   51f68:       31 c0                   xor    %eax,%eax
   51f6a:       85 c9                   test   %ecx,%ecx
   51f6c:       b1 ff                   mov    $0xff,%cl
   51f6e:       78 02                   js     51f72 <_ZN52_$LT$net..ip..Ipv4Addr$u20$as$u20$core..cmp..Ord$GT$3cmp17h39591ec7a18c4b02E+0x62>
   51f70:       b1 01                   mov    $0x1,%cl
   51f72:       74 02                   je     51f76 <_ZN52_$LT$net..ip..Ipv4Addr$u20$as$u20$core..cmp..Ord$GT$3cmp17h39591ec7a18c4b02E+0x66>
   51f74:       88 c8                   mov    %cl,%al
   51f76:       59                      pop    %rcx

This seems kind of inefficient for something that is basically a u32.
I guess part of the reason is the implementation, which converts the address to an array(!) first.
I copied the definition and implemented Ord like this:

impl Ord for Ipv4Addr2 {
    fn cmp(&self, other: &Self) -> cmp::Ordering {
        // Compare the addresses as host-order integers instead of byte arrays.
        ntoh(self.inner.s_addr).cmp(&ntoh(other.inner.s_addr))
    }
}

It's about 10 times faster on my benchmark.
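
For anyone who needs the same effect without copying the std definition, here is a minimal sketch using only public APIs. It assumes the stable `From<Ipv4Addr> for u32` conversion, which yields a host-order integer with the first octet most significant, so integer order should match octet-wise order. The names `cmp_as_u32`, `Ipv4Key` and `build_table` are hypothetical and are not part of std or of the snippet above.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;
use std::net::Ipv4Addr;

/// Hypothetical helper: compare two addresses as host-order u32 values
/// rather than as 4-byte arrays.
fn cmp_as_u32(a: &Ipv4Addr, b: &Ipv4Addr) -> Ordering {
    u32::from(*a).cmp(&u32::from(*b))
}

/// Hypothetical map key: stores the address as a u32 so the derived Ord
/// is a plain integer comparison.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Ipv4Key(u32);

impl From<Ipv4Addr> for Ipv4Key {
    fn from(addr: Ipv4Addr) -> Self {
        Ipv4Key(u32::from(addr))
    }
}

/// Hypothetical helper: build a lookup table keyed by Ipv4Key.
fn build_table(entries: &[(Ipv4Addr, u32)]) -> BTreeMap<Ipv4Key, u32> {
    entries
        .iter()
        .map(|&(addr, value)| (Ipv4Key::from(addr), value))
        .collect()
}
```

A lookup then becomes `table.get(&Ipv4Key::from(addr))`. Whether this is actually faster depends on the compiler version and the workload, so it is worth benchmarking before relying on it.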

rustc 1.10.0-nightly (476fe6eef 2016-05-21)
binary: rustc
commit-hash: 476fe6eefe17db91ff7a60aab34aa67a0a750a18
commit-date: 2016-05-21
host: x86_64-unknown-linux-gnu
release: 1.10.0-nightly
@alexcrichton
Member

Nice find! Want to send a PR for this? Looks like something that'd be more than welcome :)

@apasel422 added the A-libs and I-slow labels on May 26, 2016
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue May 27, 2016