How and When to Update `zig1.wasm`
Note that the guidance in this page largely depends on #22505, which is not yet merged at the time of writing.
The file `stage1/zig1.wasm` is the start of the compiler's bootstrap chain. It is a minimal compiler, with only the C backend enabled, built to a wasm binary. The bootstrap process uses this compiler to build `zig2.c` using the C backend, which in turn builds a final `zig` using the LLVM backend.
Occasionally, zig1.wasm needs to be updated to include new compiler logic. There are a few situations where this might be necessary:
- The compiler implements a new feature, which we want to use in the compiler codebase.
- The compiler has had a bugfix, and the bug in question is affecting the bootstrap process.
- A breaking change to `std.builtin` has been implemented, so the existing `zig1.wasm` is no longer compatible with the new standard library.
Updating `zig1.wasm` is done with the `update-zig1` build step, that is, by running `zig build update-zig1`. Note that, to help mitigate concerns about supply chain attacks, only Zig core team members are allowed to commit updates to `zig1.wasm` to `master`.
Of the situations listed above, the first two only need a trivial update: simply run `zig build update-zig1` with some existing `zig`, and everything is fine. However, the third case, handling a breaking change to `std.builtin`, can be considerably more complicated.
This is because the compiler being used to update zig1 will depend on the "old" version of `std.builtin`, but the compiler being built will depend on the "new" version of `std.builtin`.
- A value is "comptime-known" if the compiler has a corresponding `Value` (i.e. `InternPool.Index`). The `Type` of this `Value` will be based on the code the compiler is building.
- A value is "compiler-known" if the compiler actually has a value of that type. The type of this value will be based on the code this compiler was built against.
For instance, a running compiler might have a compiler-known `std.builtin.CallingConvention`, and a corresponding comptime-known value represented by a `Value`. The complexities we're trying to solve here arise from differences between the types of corresponding comptime-known and compiler-known values.
The compiler can translate between comptime-known and compiler-known values:
- A comptime-known value can be "interpreted" (`Value.interpret`) to turn it into a compiler-known value.
- A compiler-known value can be "uninterpreted" (`Value.uninterpret`) to turn it into a comptime-known value.
When interpreting/uninterpreting, there are two "modes" the compiler can use:
- `InterpretMode.direct` assumes that the type is very similar to what the compiler was built with; the upshot is that it is simpler and faster.
- `InterpretMode.by_name` matches enum/struct/union fields by name, so it can handle more changes between the types the compiler was built with and the types it's running with. However, it is slower and more complex.
Most compiler builds use `InterpretMode.direct`. To minimize the need for `zig1.wasm` updates, `zig1.wasm` uses `InterpretMode.by_name`. The mode of a normal compiler build can be changed with `-Dvalue-interpret-mode`.
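The difference between the two modes can be sketched with plain enums. This is a minimal illustration, not the compiler's actual `Value.interpret` code: `BuiltWith`, `Compiling`, `directTranslate`, and `byNameTranslate` are hypothetical names, and "direct" is approximated here as simply reusing the tag's integer value.

```zig
const std = @import("std");

// Hypothetical stand-in for an enum as the running compiler was built against it.
const BuiltWith = enum { c, naked, stdcall };

// The same enum in the code being compiled, with a tag added at the front,
// which shifts every integer tag value by one.
const Compiling = enum { auto, c, naked, stdcall };

/// "direct"-style translation (assumption): reuse the integer tag value as-is.
fn directTranslate(comptime To: type, from: anytype) To {
    return @enumFromInt(@intFromEnum(from));
}

/// "by_name"-style translation (assumption): look the tag up by its name.
/// Returns null when the destination enum has no tag with that name.
fn byNameTranslate(comptime To: type, from: anytype) ?To {
    return std.meta.stringToEnum(To, @tagName(from));
}

test "by-name lookup tolerates an added tag" {
    // Matching by name finds the right tag even though the integer values shifted.
    try std.testing.expectEqual(BuiltWith.naked, byNameTranslate(BuiltWith, Compiling.naked).?);
    // A tag the running compiler does not know about cannot be matched at all.
    try std.testing.expect(byNameTranslate(BuiltWith, Compiling.auto) == null);
    // Reusing the integer value silently lands on a different tag.
    try std.testing.expectEqual(BuiltWith.stdcall, directTranslate(BuiltWith, Compiling.naked));
}
```

Running this with `zig test` shows the trade-off described above: the by-name path does a string lookup per value, but it stays correct when the two definitions have drifted apart.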
Adding and removing fields from enums or tagged unions in `std.builtin` will always work without a `zig1.wasm` update. This is by design: `zig1.wasm` uses the `.by_name` value interpret mode to make sure this works. The motivating use case is routine updates to things like `CallingConvention` and `AddressSpace` in response to target evolution.
After changing `std.builtin`, simply bootstrap again to get a working compiler.
Adding fields to structs in `std.builtin` will work without a `zig1.wasm` update, as long as:
- The compiler never uninterprets this type, or
- The added field has a default value for the compiler to use when uninterpreting.
If neither of the above applies:
1. Build a compiler with `-Dvalue-interpret-mode=by_name`.
2. Give the field a default value, which it would be safe for the current compiler implementation to return when uninterpreting the type.
3. Use the compiler built in step 1 to run `update-zig1`.
4. Remove the default value.
5. Bootstrap to get a final working compiler.
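The role of the default value can be seen in ordinary Zig struct initialization. The sketch below is only an analogy for what happens when uninterpreting: `OldInfo`, `NewInfo`, and `upgrade` are hypothetical names, and the real compiler works on `Value`s rather than on hand-written conversion functions.

```zig
const std = @import("std");

// Hypothetical stand-in for a `std.builtin` struct as the running compiler knows it.
const OldInfo = struct { name: []const u8, alignment: u16 };

// The updated definition. `is_packed` is new; its default is what lets a value
// be produced from data that predates the field.
const NewInfo = struct { name: []const u8, alignment: u16, is_packed: bool = false };

/// Builds the new shape from the old one, matching fields by name.
/// Leaving out `is_packed` compiles only because that field has a default.
fn upgrade(old: OldInfo) NewInfo {
    return .{ .name = old.name, .alignment = old.alignment };
}

test "added field falls back to its default" {
    const new = upgrade(.{ .name = "example", .alignment = 8 });
    try std.testing.expect(new.is_packed == false);
    try std.testing.expectEqualStrings("example", new.name);
}
```

If the `= false` default is deleted, `upgrade` no longer compiles, because nothing supplies a value for the new field.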
Removing fields from structs in `std.builtin` will work without a `zig1.wasm` update only if the compiler never interprets this type. Otherwise, the field can be removed by this process:
1. Give the field a default value, which it would be safe for the current compiler implementation to assume once the field is removed.
2. Build a compiler with `-Dvalue-interpret-mode=by_name`.
3. Remove the field in question.
4. Use the compiler built in step 2 to run `update-zig1`.
5. Bootstrap to get a final working compiler.
To rename fields on a `std.builtin` type (enum, union, or struct), while keeping field order, field types, backing types, and tag values intact:
1. Build a compiler with `-Dvalue-interpret-mode=direct` (this is the default).
2. Rename the field[s] in question.
3. Use the compiler built in step 1 to run `update-zig1`.
4. Bootstrap to get a final working compiler.
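The reason this procedure uses the direct mode is that a pure rename leaves the layout untouched while breaking every name lookup. The sketch below illustrates that with a positional field copy. It is an assumption-laden analogy rather than the compiler's implementation: `Before`, `After`, and `copyByPosition` are hypothetical names.

```zig
const std = @import("std");

// Hypothetical struct before and after a rename. Field order and field types
// are unchanged; only the names differ.
const Before = struct { ptr_bits: u16, is_signed: bool };
const After = struct { pointer_bits: u16, signed: bool };

/// Copies fields by position, ignoring their names entirely.
fn copyByPosition(comptime To: type, from: anytype) To {
    const from_fields = std.meta.fields(@TypeOf(from));
    const to_fields = std.meta.fields(To);
    comptime std.debug.assert(from_fields.len == to_fields.len);

    var result: To = undefined;
    inline for (from_fields, to_fields) |f, t| {
        @field(result, t.name) = @field(from, f.name);
    }
    return result;
}

test "positional copy survives a rename" {
    const after = copyByPosition(After, Before{ .ptr_bits = 64, .is_signed = true });
    try std.testing.expect(after.pointer_bits == 64);
    try std.testing.expect(after.signed);
    // A name-based match would find nothing to pair the old field with.
    try std.testing.expect(!@hasField(After, "ptr_bits"));
}
```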
Any other change is a more "fundamental" change to a `std.builtin` type, such as when `CallingConvention` was turned from an enum into a tagged union. Such a change requires a more involved update process. In short, you will need to keep both types intact to apply the initial `zig1.wasm` update: the old one so that the compiler updating zig1 can use it, and the new one for the new compiler code to reference.
Say you wish to change type `std.builtin.Foo`. The process is as follows.
1. Rather than modifying `Foo`, define the new type under `std.builtin.NewFoo`. Leave the existing type definition untouched for now.
2. Modify the compiler (and, where required to build the compiler, the standard library) to use `NewFoo` everywhere, and completely implement the new compiler behavior. However, when the compiler actually gets the corresponding `Type` from `std.builtin` at runtime, it should use the name "Foo", not "NewFoo". For instance, a `Zcu.BuiltinDecl` field should be called `.Foo`, not `.NewFoo`.
3. Using an existing compiler, run `update-zig1`. Because the old `Foo` is intact, the compiler being used can reference it correctly; but the compiler being built uses the new type, currently named `NewFoo`.
4. Delete the old `Foo` declaration, and rename `NewFoo` to `Foo`. Update all references to `NewFoo` to instead reference `Foo`.
5. Bootstrap to get a final working compiler.
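As a rough picture of the intermediate state after steps 1 and 2, the two declarations sit side by side. Everything in this sketch is hypothetical: the shape of `Foo` and `NewFoo`, the `SmallOptions` payload, and the `maxBytes` helper standing in for compiler code that has been moved to the new type.

```zig
const std = @import("std");

// Old definition, left untouched so that the compiler used to run
// update-zig1 still finds the type it was built against.
pub const Foo = enum { fast, small };

// Hypothetical payload introduced by the new design.
pub const SmallOptions = struct { max_bytes: u32 };

// New definition, living under a temporary name until step 4 renames it to `Foo`.
pub const NewFoo = union(enum) {
    fast,
    small: SmallOptions,
};

/// Stand-in for compiler code that already targets the new type only.
fn maxBytes(foo: NewFoo) ?u32 {
    return switch (foo) {
        .fast => null,
        .small => |opts| opts.max_bytes,
    };
}

test "old and new definitions coexist during the transition" {
    try std.testing.expect(maxBytes(.{ .small = .{ .max_bytes = 4096 } }).? == 4096);
    try std.testing.expect(maxBytes(.fast) == null);
    // The old enum is still declared and unchanged.
    try std.testing.expect(@intFromEnum(Foo.small) == 1);
}
```

Once the `zig1.wasm` update has been applied, the old `Foo` is deleted and `NewFoo` takes over its name, so this doubled-up state is only ever temporary.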