bits vs bytes: a complete specification of the memory model #10547
Comments
So even if the in-memory layout is not defined, the fundamental int "layout" for that type is, and that layout seems to depend on the field order, which is why S -> Other works. |
Also, builtin types (such as slices and optionals) would then need a defined fundamental int layout even if their memory layout is not defined. |
Just writing down what
// Assuming 64bit platform
var s: S = ...;
const Int = meta.Int(@bitSizeOf(S), .unsigned);
var int: Int = 0;
int |= @as(Int, @bitCast(u128, s.name)) << @bitOffsetOf(S, "name");
int |= @as(Int, @bitCast(u2, s.ok)) << @bitOffsetOf(S, "ok");
var o: Other = undefined;
o.name_ptr = @bitCast([*]const u8, @truncate(u64, int >> @bitOffsetOf(Other, "name_ptr")));
o.name_len = @bitCast(usize, @truncate(u64, int >> @bitOffsetOf(Other, "name_len")));
o.ok_present = @bitCast(u1, @truncate(u1, int >> @bitOffsetOf(Other, "ok_present")));
o.ok_flag = @bitCast(u1, @truncate(u1, int >> @bitOffsetOf(Other, "ok_flag"))); |
Also, seems
Edit: And to implement this in the standard library, all you really need is
Edit edit: Actually, no. This is not so easy to do in the standard library when untagged unions with safety checks are a thing. |
One thing. How would this work with untagged unions?
const U = union {
    a: u8,
    b: u16,
};
var u = U{ .a = 0 };
// Some memory in this union was never set and is probably `undefined`.
// Doing the bitcast here, we now have multiple integer representations of the same
// U{ .a = 0 } information.
var i = @bitCast(u16, u); // i is 0b00000000???????? |
Related: Lines 2600 to 2606 in d66c97d
I feel like a solution to one of these questions may solve the other as well. At the very least, having both in mind while thinking about solutions would likely be useful. In particular, what are the semantics of this:
const U = union {
    a: u3,
    b: u5,
};
// What is the debug safety tag of u set to?
// Should we have an "unknown" safety tag which disables safety checks until the union
// is assigned in a way that the compiler knows the intended tag?
var u = @bitCast(U, @as(u5, 0b11111));
// e.g. we could now set the safety tag to `a` here:
u = .{ .a = 0b111 }; |
I really don't like this. It seems like a fairly complex transformation (no new programmer would ever guess that this is what bitcast does), and I don't see motivating use cases for that complexity. Let's examine the motivations in the issue:
That's good motivation, but it doesn't explain the complexity.
The proposal doesn't really talk about composing at all, I'm finding it very difficult to see how this is relevant. Since we're not defining anything related to memory layout (this proposal only specifies a transformation done by bitcast, despite claiming to have something to do with a memory model), it's not really related to align(0) or the representation of nested packed structs. I'd like to see some real use cases for the proposed transformation, because I'm having difficulty coming up with any that would be a good idea in real code.
This proposal does nothing to make this possible. You can still take the address of the inner
The proposed transform is quite inefficient on most hardware. It can't undergo a SIMD transformation. It's true that computers are faster at doing math / unpacking operations than loading memory, but the more operations they spend unpacking the fewer they can spend doing math. I really need to see more use cases but I'm currently unconvinced that the benefits of this are worth the cost in complexity. |
Closed in order to simplify the language.
fn bitCast(Dest: type, x: anytype) Dest {
    // `x` is a parameter and therefore const, so the reinterpreting pointer must be const too.
    return @ptrCast(*const Dest, &x).*;
}
fn bitSizeOf(T: type) comptime_int {
    const P = packed struct { t: T };
    return @TypeInfo(@structToInt(P)).Int.bits;
}
Most types are not allowed in packed structs. The only ones allowed are integer-like types:
Anything other than these gives a compile error if used with
Answers to some specific questions:
|
I found this issue after already giving this a lot of thought, so I decided to share my view on it. The way I see it, the alignment of a type is not defined by that type but by its container. If we take
Zig is currently not consistent on this front:
const T1 = packed struct { f1: u24 }; // @sizeOf(T1) == 4
const T2 = packed struct { f1: u24, f2: u8 }; // @sizeOf(T2) == 4
This means that
The next problem is arrays. To keep this logic, they as containers also now need to define alignment so we can have "normal"
I would really like to understand this problem fully now that I am so deep in it, so I am eager to hear what I am missing. Why can't we have this? |
The critical problem is that sizeof is, by definition, array stride. If we didn't include padding in sizeof, we would need a separate built-in for array stride. Additionally, since alignment is an attribute of the container, not the type, alignment cannot affect layout (and therefore cannot affect size). |
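The point about size and array stride can be checked with a small test. This is only a sketch: it assumes a typical target on which u24 is padded to 4 bytes, and the test name is made up for illustration.

```zig
const std = @import("std");

test "sizeOf doubles as array stride (assumes u24 is padded to 4 bytes)" {
    // Information content: 24 bits.
    try std.testing.expect(@bitSizeOf(u24) == 24);
    // In-memory size includes the padding byte.
    try std.testing.expect(@sizeOf(u24) == 4);
    // An array of N elements occupies N * @sizeOf(element) bytes,
    // which is why @sizeOf has to include padding.
    try std.testing.expect(@sizeOf([4]u24) == 4 * @sizeOf(u24));
}
```

If @sizeOf(u24) were 3 instead, the last check could only hold if array elements were packed at a 3-byte stride, which is the separate "array stride" builtin the comment above says would then be needed.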
I just don't see how we can support u24 and packed structs that make sense without also having a built-in for array stride and alignment for arrays. For example, currently the following is true: If I understand your logic, the above is not considered a bug and will remain so in the future (the test in test/behavior/struct.zig on line 431 also confirms it is a valid expectation). |
The current (stage 1) implementation of packed on which you are basing your arguments is not the correct behavior. |
So if someone (like me :) ) managed to fix packed structs in stage1, they would be right to change test/behavior/struct.zig as is done here: https://github.com/ziglang/zig/pull/3745/files#diff-2543d9651c81667654ce214fd04ab490787f15ed52623cfe29339c98cc482426L256 ? |
Yep, that would be needed. Though those tests are run by both stage 1 and stage 2, so you would need to either split the tests or fix both. The current implementation in stage 2 rounds up to a multiple of a machine word size, instead of taking the minimum number of bytes needed. |
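The two rounding rules mentioned above differ only in arithmetic. The sketch below is not taken from the compiler; the helper names minBytes and wordRoundedBytes are hypothetical, and a 64-bit machine word is assumed in the test.

```zig
const std = @import("std");

// Smallest number of whole bytes that can hold `bits` bits.
fn minBytes(bits: usize) usize {
    return (bits + 7) / 8;
}

// Size in bytes after rounding `bits` up to a whole number of machine words.
fn wordRoundedBytes(bits: usize, word_bits: usize) usize {
    const words = (bits + word_bits - 1) / word_bits;
    return words * (word_bits / 8);
}

test "24 bits under both rounding rules (64-bit word assumed)" {
    try std.testing.expect(minBytes(24) == 3);
    try std.testing.expect(wordRoundedBytes(24, 64) == 8);
}
```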
That test starts with these lines so I guess it should be fine to also just add
|
Probably best to keep the current tests for stage 2 targets that work, since there are no other packed struct tests for stage 2. |
This proposal makes a distinction between the in-memory storage of a value and the storage of a value according to information theory.
In some cases, these are the same thing, such as with a u8 type. In other cases they can be different for various reasons, such as endianness or aggregate padding.
The idea here is that these two operations would both be well-defined, but would produce different results; they would not be defined in terms of each other:
byte casting:
bit casting:
Byte casting is done via pointer reinterpretation, or via extern union field aliasing. Bit casting is done via the @bitCast language primitive. They are both well defined but distinct:
Byte casting is easy to understand; you can almost implement it by accident. Here we focus on bit casting and what it means. First, some prerequisites:
@bitSizeOf vs @sizeOf - @sizeOf corresponds to bytes and takes padding into account. As an example, @sizeOf(u24) == 4. Meanwhile, @bitSizeOf corresponds to information theory and ignores padding. In this example, @bitSizeOf(u24) == 24. The bit size of a struct, regardless of whether it is packed, extern, or neither, is the sum of @bitSizeOf for each field.
@bitOffsetOf vs @byteOffsetOf - for bytes, the offset is the difference in memory address between a field and the base pointer. For bits, it is the number of lower bits that precede the field in a hypothetical integer whose bit count equals the @bitSizeOf of the aggregate.
With this proposal, each type, regardless of whether it has a well-defined memory layout (a property that applies to bytes), has a hypothetical integer with a number of bits equal to the @bitSizeOf of that type. We call this integer the type's fundamental int.
@bitCast is defined as follows:
Attempting to @bitCast between two types that have differing @bitSizeOf values is a compile error. Note that one can obtain the fundamental int for a type by bit casting the value to an unsigned integer.
The motivation for this proposal is:
align(0) useful in general.
??u8, whose fundamental integer would be a u10.
With this proposal, one would be able to convert between structs, even though they have no well-defined byte representation, like this:
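As a rough illustration of the fundamental int idea (not the proposal's own example), the integer can be assembled by hand from @bitOffsetOf. The type Pair and the helper fundamentalInt are hypothetical names, and a packed struct is used because its bit offsets are already reported this way by existing compilers; the proposal would extend the same model to types without a defined memory layout.

```zig
const std = @import("std");

// Hypothetical example type; the name and fields are illustrative only.
const Pair = packed struct {
    x: u3,
    y: u5,
};

// Assemble the fundamental int by hand: each field is widened to the full
// integer width and shifted to the bit offset reported by @bitOffsetOf.
fn fundamentalInt(p: Pair) u8 {
    return (@as(u8, p.x) << @bitOffsetOf(Pair, "x")) |
        (@as(u8, p.y) << @bitOffsetOf(Pair, "y"));
}

test "fundamental int assembled from bit offsets" {
    // The bit size is the sum of the field bit sizes: 3 + 5.
    try std.testing.expect(@bitSizeOf(Pair) == 8);
    // x has no lower bits before it; y is preceded by the 3 bits of x.
    try std.testing.expect(@bitOffsetOf(Pair, "x") == 0);
    try std.testing.expect(@bitOffsetOf(Pair, "y") == 3);

    const p = Pair{ .x = 0b101, .y = 0b10011 };
    try std.testing.expect(fundamentalInt(p) == 0b10011_101);
}
```

Converting to another type with the same @bitSizeOf would then amount to the reverse step: shifting this integer right by each destination field's bit offset and keeping only that field's bits.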