Skip to content

Add inline attr to CString::into_inner so it can optimize out NonNull checks #85719

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 1 commit into from
May 27, 2021

Conversation

elichai
Copy link
Contributor

@elichai elichai commented May 26, 2021

It seems that currently if you convert any of the standard library's container to a pointer and then to a NonNull pointer, all will optimize out the NULL check except CString(https://godbolt.org/z/YPKW9G5xn),
because for some reason CString::into_inner isn't inlined even though it's a private function that should compile into a simple mov instruction.

Adding a simple #[inline] attribute solves this, code example:

use std::ffi::CString;
use std::ptr::NonNull;

pub fn cstring_nonull(mut n: CString) -> NonNull<i8> {
    NonNull::new(CString::into_raw(n)).unwrap()
}

assembly before:

__ZN3wat14cstring_nonull17h371c755bcad76294E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	callq	__ZN3std3ffi5c_str7CString10into_inner17h28ece07b276e2878E
	testq	%rax, %rax
	je	LBB0_2
	popq	%rbp
	retq
LBB0_2:
	leaq	l___unnamed_1(%rip), %rdi
	leaq	l___unnamed_2(%rip), %rdx
	movl	$43, %esi
	callq	__ZN4core9panicking5panic17h92a83fa9085a8f73E
	.cfi_endproc

	.section	__TEXT,__const
l___unnamed_1:
	.ascii	"called `Option::unwrap()` on a `None` value"

l___unnamed_3:
	.ascii	"wat.rs"

	.section	__DATA,__const
	.p2align	3
l___unnamed_2:
	.quad	l___unnamed_3
	.asciz	"\006\000\000\000\000\000\000\000\006\000\000\000(\000\000"

Assembly after:

__ZN3wat14cstring_nonull17h9645eb9341fb25d7E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	movq	%rdi, %rax
	popq	%rbp
	retq
	.cfi_endproc

(Related discussion on zulip: https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/NonNull.20From.3CBox.3CT.3E.3E)

@rust-highfive
Copy link
Contributor

r? @kennytm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 26, 2021
@m-ou-se
Copy link
Member

m-ou-se commented May 26, 2021

Thanks!

@bors r+

@bors
Copy link
Collaborator

bors commented May 26, 2021

📌 Commit 45099e6 has been approved by m-ou-se

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 26, 2021
@m-ou-se m-ou-se assigned m-ou-se and unassigned kennytm May 26, 2021
@m-ou-se m-ou-se added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 26, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request May 26, 2021
…r=m-ou-se

Add inline attr to CString::into_inner so it can optimize out NonNull checks

It seems that currently if you convert any of the standard library's container to a pointer and then to a NonNull pointer, all will optimize out the NULL check except `CString`(https://godbolt.org/z/YPKW9G5xn),
because for some reason `CString::into_inner` isn't inlined even though it's a private function that should compile into a simple `mov` instruction.

Adding a simple `#[inline]` attribute solves this, code example:
```rust
use std::ffi::CString;
use std::ptr::NonNull;

pub fn cstring_nonull(mut n: CString) -> NonNull<i8> {
    NonNull::new(CString::into_raw(n)).unwrap()
}
```

assembly before:
```asm
__ZN3wat14cstring_nonull17h371c755bcad76294E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	callq	__ZN3std3ffi5c_str7CString10into_inner17h28ece07b276e2878E
	testq	%rax, %rax
	je	LBB0_2
	popq	%rbp
	retq
LBB0_2:
	leaq	l___unnamed_1(%rip), %rdi
	leaq	l___unnamed_2(%rip), %rdx
	movl	$43, %esi
	callq	__ZN4core9panicking5panic17h92a83fa9085a8f73E
	.cfi_endproc

	.section	__TEXT,__const
l___unnamed_1:
	.ascii	"called `Option::unwrap()` on a `None` value"

l___unnamed_3:
	.ascii	"wat.rs"

	.section	__DATA,__const
	.p2align	3
l___unnamed_2:
	.quad	l___unnamed_3
	.asciz	"\006\000\000\000\000\000\000\000\006\000\000\000(\000\000"
```

Assembly after:
```asm
__ZN3wat14cstring_nonull17h9645eb9341fb25d7E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	movq	%rdi, %rax
	popq	%rbp
	retq
	.cfi_endproc
```

(Related discussion on zulip: https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/NonNull.20From.3CBox.3CT.3E.3E)
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request May 27, 2021
…r=m-ou-se

Add inline attr to CString::into_inner so it can optimize out NonNull checks

It seems that currently if you convert any of the standard library's container to a pointer and then to a NonNull pointer, all will optimize out the NULL check except `CString`(https://godbolt.org/z/YPKW9G5xn),
because for some reason `CString::into_inner` isn't inlined even though it's a private function that should compile into a simple `mov` instruction.

Adding a simple `#[inline]` attribute solves this, code example:
```rust
use std::ffi::CString;
use std::ptr::NonNull;

pub fn cstring_nonull(mut n: CString) -> NonNull<i8> {
    NonNull::new(CString::into_raw(n)).unwrap()
}
```

assembly before:
```asm
__ZN3wat14cstring_nonull17h371c755bcad76294E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	callq	__ZN3std3ffi5c_str7CString10into_inner17h28ece07b276e2878E
	testq	%rax, %rax
	je	LBB0_2
	popq	%rbp
	retq
LBB0_2:
	leaq	l___unnamed_1(%rip), %rdi
	leaq	l___unnamed_2(%rip), %rdx
	movl	$43, %esi
	callq	__ZN4core9panicking5panic17h92a83fa9085a8f73E
	.cfi_endproc

	.section	__TEXT,__const
l___unnamed_1:
	.ascii	"called `Option::unwrap()` on a `None` value"

l___unnamed_3:
	.ascii	"wat.rs"

	.section	__DATA,__const
	.p2align	3
l___unnamed_2:
	.quad	l___unnamed_3
	.asciz	"\006\000\000\000\000\000\000\000\006\000\000\000(\000\000"
```

Assembly after:
```asm
__ZN3wat14cstring_nonull17h9645eb9341fb25d7E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	movq	%rdi, %rax
	popq	%rbp
	retq
	.cfi_endproc
```

(Related discussion on zulip: https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/NonNull.20From.3CBox.3CT.3E.3E)
This was referenced May 27, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request May 27, 2021
Rollup of 8 pull requests

Successful merges:

 - rust-lang#84221 (E0599 suggestions and elision of generic argument if no canditate is found)
 - rust-lang#84701 (stabilize member constraints)
 - rust-lang#85564 ( readd capture disjoint fields gate)
 - rust-lang#85583 (Get rid of PreviousDepGraph.)
 - rust-lang#85649 (Update cc)
 - rust-lang#85689 (Remove Iterator #[rustc_on_unimplemented]s that no longer apply.)
 - rust-lang#85719 (Add inline attr to CString::into_inner so it can optimize out NonNull checks)
 - rust-lang#85725 (Remove unneeded workaround)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 955e0f4 into rust-lang:master May 27, 2021
@rustbot rustbot added this to the 1.54.0 milestone May 27, 2021
@elichai elichai deleted the cstring-into_inner-inline branch May 27, 2021 08:42
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants