Skip to content

Be more explicit about the layout guarantees of integer and floating-point types #98

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 11 commits into from
Apr 11, 2019
4 changes: 3 additions & 1 deletion reference/src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,11 @@

- [Data layout](./layout.md)
- [Structs and tuples](./layout/structs-and-tuples.md)
- [Integers and Floating Points](./layout/integers-floatingpoint.md)
- [Scalars](./layout/scalars.md)
- [Enums](./layout/enums.md)
- [Unions](./layout/unions.md)
- [Pointers](./layout/pointers.md)
- [Function pointers](./layout/function-pointers.md)
- [Arrays and Slices](./layout/arrays-and-slices.md)
- [Packed SIMD vectors](./layout/packed-simd-vectors.md)
- [Optimizations](./optimizations.md)
Expand Down
1 change: 0 additions & 1 deletion reference/src/layout.md

This file was deleted.

10 changes: 6 additions & 4 deletions reference/src/layout/function-pointers.md
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,11 @@ bool for_all(struct Cons const *self, bool (*func)(int, void *), void *thunk);
```

```rust
# use std::{
# ffi::c_void,
# os::raw::c_int,
# };
#
pub struct Cons {
data: c_int,
next: Option<Box<Cons>>,
Expand Down Expand Up @@ -117,9 +122,6 @@ pub extern "C" fn for_all(
}
it = node.next.as_ref().map(|x| &**x);
}
true
}
```

### Unresolved Questions

- dunno
61 changes: 0 additions & 61 deletions reference/src/layout/integers-floatingpoint.md

This file was deleted.

4 changes: 2 additions & 2 deletions reference/src/layout/pointers.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ multi-trait objects `&(dyn T + U)` or references to other dynamically sized type
other than that they are at least word-aligned, and have size at least one word.

The layout of `&dyn T` when `T` is a trait is the same as that of:
```rust
```rust,ignore
#[repr(C)]
struct DynObject {
data: *u8,
Expand All @@ -45,7 +45,7 @@ struct DynObject {
```

The layout of `&[T]` is the same as that of:
```rust
```rust,ignore
#[repr(C)]
struct Slice<T> {
ptr: *T,
Expand Down
114 changes: 114 additions & 0 deletions reference/src/layout/scalars.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
# Layout of scalar types

This chapter represents the consensus from issue [#9]. It documents the memory
layout and considerations for `bool`, `char`, floating point types (`f{32, 64}`), and integral types (`{i,u}{8,16,32,64,128,size}`).

These types are all scalar types, representing a single value, and have no
layout `#[repr()]` flags.

[#9]: https://github.com/rust-rfcs/unsafe-code-guidelines/issues/9

## `bool`

Rust's `bool` has the same layout as C17's` _Bool`, that is, its size and
alignment are implementation-defined. Any `bool` can be cast into an integer,
taking on the values 1 (`true`) or 0 (`false`).

> **Note**: on all platforms that Rust's currently supports, its size and
> alignment are 1, and its ABI class is `INTEGER` - see [Rust Layout and ABIs].

[Rust Layout and ABIs]: https://gankro.github.io/blah/rust-layouts-and-abis/#the-layoutsabis-of-builtins

## `char`

Rust char is 32-bit wide and represents an [unicode scalar value]. The alignment
of `char` is _implementation-defined_.

[unicode scalar value]: http://www.unicode.org/glossary/#unicode_scalar_value

> **Note**: Rust `char` type is not layout compatible with C / C++ `char` types.
> The C / C++ `char` types correspond to either Rust's `i8` or `u8` types on all
> currently supported platforms, depending on their signedness. Rust does not
> support C platforms in which C `char` is not 8-bit wide.

## `isize` and `usize`

The `isize` and `usize` types are pointer-sized signed and unsigned integers.
They have the same layout as the [pointer types] for which the pointee is
`Sized`, and are layout compatible with C's `uintptr_t` and `intptr_t` types.

> **Note**: Rust's `usize` and C's `unsigned` types are **not** equivalent. C's
> `unsigned` is at least as large as a short, allowed to have padding bits, etc.
> but it is not necessarily pointer-sized.

> **Note**: in the current Rust implementation, the layouts of `isize` and
> `usize` determine the following:
>
> * the maximum size of Rust _allocations_ is limited to `isize::max_value()`.
> The LLVM `getelementptr` instruction uses signed-integer field offsets. Rust
> calls `getelementptr` with the `inbounds` flag which assumes that field
> offsets do not overflow,
>
> * the maximum number of elements in an array is `usize::max_value()` (`[T; N:
> usize]`. Only ZST arrays can probably be this large in practice, non-ZST
> arrays are bound by the maximum size of Rust values,
>
> * the maximum value by which a pointer can be offseted using `ptr.add(count:
> usize)` is `usize::max_value()`.
>
> These limits have not gone through the RFC process and are not guaranteed to
> hold.

[pointer types]: ./pointers.md

## Fixed-width integer types

Rust's signed and unsigned fixed-width integer types `{i,u}{8,16,32,64}` have
the same layout as the C fixed-width integer types from the `<stdint.h>` header
`{u,}int{8,16,32,64}_t`. That is:

* these types have no padding bits,
* their size exactly matches their bit-width,
* negative values of signed integer types are represented using 2's complement.

This properties also hold for Rust's 128-bit wide `{i,u}128` integer types, but
C does not expose equivalent types in `<stdint.h>`.

Rust fixed-width integer types are therefore safe to use directly in C FFI where
the corresponding C fixed-width integer types are expected.

### Layout compatibility with C native integer types

The specification of native C integer types, `char`, `short`, `int`, `long`,
... as well as their `unsigned` variants, guarantees a lower bound on their size,
e.g., `short` is _at least_ 16-bit wide and _at least_ as wide as `char`.

Their exact sizes are _implementation-defined_.

Libraries like `libc` use knowledge of this _implementation-defined_ behavior on
each platform to select a layout-compatible Rust fixed-width integer type when
interfacing with native C integer types (e.g. `libc::c_int`).

> **Note**: Rust does not support C platforms on which the C native integer type
> are not compatible with any of Rust's fixed-width integer type (e.g. because
> of padding-bits, lack of 2's complement, etc.).

## Fixed-width floating point types

Rust's `f32` and `f64` single (32-bit) and double (64-bit) precision
floating-point types have [IEEE-754] `binary32` and `binary64` floating-point
layouts, respectively.

When the platforms' `"math.h"` header defines the `__STDC_IEC_559__` macro,
Rust's floating-point types are safe to use directly in C FFI where the
appropriate C types are expected (`f32` for `float`, `f64` for `double`).

If the C platform's `"math.h"` header does not define the `__STDC_IEC_559__`
macro, whether using `f32` and `f64` in C FFI is safe or not for which C type is
_implementation-defined_.

> **Note**: the `libc` crate uses knowledge of each platform's
> _implementation-defined_ behavior to provide portable `libc::c_float` and
> `libc::c_double` types that can be used to safely interface with C via FFI.

[IEEE-754]: https://en.wikipedia.org/wiki/IEEE_754
1 change: 0 additions & 1 deletion reference/src/optimizations.md

This file was deleted.