How and When to Update `zig1.wasm`

Matthew Lugg edited this page Jan 16, 2025 · 1 revision

Note that the guidance in this page largely depends on #22505, which is not yet merged at the time of writing.

Background

The file stage1/zig1.wasm is the start of the compiler's bootstrap chain. It is a minimal compiler, with only the C backend enabled, built to a wasm binary. The bootstrap process uses this compiler to build zig2.c using the C backend, which in turn builds a final zig using the LLVM backend.

Occasionally, zig1.wasm needs to be updated to include new compiler logic. There are a few situations where this might be necessary:

  • The compiler implements a new feature, which we want to use in the compiler codebase.
  • The compiler has had a bugfix, and the bug in question is affecting the bootstrap process.
  • A breaking change to std.builtin has been implemented, so the existing zig1.wasm is no longer compatible with the new standard library.

To update zig1.wasm, run the "update-zig1" build step: zig build update-zig1. Note that, to help mitigate concerns about supply chain attacks, only Zig core team members are allowed to commit updates to zig1.wasm to master.

Of the situations listed above, the first two only need a trivial update: run zig build update-zig1 with an existing zig, and nothing more is required. However, the third case (handling a breaking change to std.builtin) can be considerably more complicated. This is because the compiler being used to update zig1 depends on the "old" version of std.builtin, but the compiler being built depends on the "new" version of std.builtin.

Terminology

  • A value is "comptime-known" if the compiler has a corresponding Value (i.e. InternPool.Index). The Type of this Value will be based on the code the compiler is building.
  • A value is "compiler-known" if the compiler actually has a value of that type. The type of this value will be based on the code this compiler was built against.

For instance, a running compiler might have a compiler-known std.builtin.CallingConvention, and a corresponding comptime-known value represented by a Value. The complexities we're trying to solve here arise from differences between the types of corresponding comptime-known and compiler-known values.

The compiler can translate between comptime-known and compiler-known values:

  • A comptime-known value can be "interpreted" (Value.interpret) to turn it into a compiler-known value.
  • A compiler-known value can be "uninterpreted" (Value.uninterpret) to turn it into a comptime-known value.

When interpreting/uninterpreting, there are two "modes" the compiler can use:

  • InterpretMode.direct assumes that the type is very similar to what the compiler was built with; the upshot is that it is simpler and faster.
  • InterpretMode.by_name matches enum/struct/union fields by name, so it can handle more changes between the types the compiler was built with and the types it's running with. However, it is slower and more complex.

Most compiler builds use InterpretMode.direct. To minimize the need for zig1.wasm updates, zig1.wasm uses InterpretMode.by_name. The mode of a normal compiler build can be changed with -Dvalue-interpret-mode.
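
As a rough illustration of the difference between the two modes, the following toy sketch converts between two hypothetical enums. It is not the compiler's implementation: the enum definitions and the helpers interpretDirect and interpretByName are invented for this example, and the real Value.interpret logic is more involved.

```zig
const std = @import("std");

// Hypothetical enum the running compiler was built against ("compiler-known").
const BuiltAgainst = enum(u8) { c, naked, interrupt };

// Hypothetical enum in the code being compiled ("comptime-known"). A field
// was inserted in the middle, which shifts the tag values after it.
const BeingCompiled = enum(u8) { c, fastcall, naked, interrupt };

// Toy stand-in for the direct mode: trusts that both sides share tag values.
// Only safe for tag values that exist in BuiltAgainst.
fn interpretDirect(value: BeingCompiled) BuiltAgainst {
    return @enumFromInt(@intFromEnum(value));
}

// Toy stand-in for the by_name mode: matches fields by name, and reports
// null for a field the running compiler has never heard of.
fn interpretByName(value: BeingCompiled) ?BuiltAgainst {
    return std.meta.stringToEnum(BuiltAgainst, @tagName(value));
}

test "by_name survives an inserted field, direct does not" {
    // Matching on raw tag values silently picks the wrong field.
    try std.testing.expectEqual(BuiltAgainst.interrupt, interpretDirect(.naked));
    // Matching on names still finds the right one.
    try std.testing.expectEqual(BuiltAgainst.naked, interpretByName(.naked).?);
    try std.testing.expect(interpretByName(.fastcall) == null);
}
```

The name-based lookup costs a string comparison per field, which loosely mirrors why by_name is described as the slower, more complex mode.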

Adding/Removing enum/union Fields

Adding and removing fields from enums or tagged unions in std.builtin will always work without a zig1.wasm update. This is by design: zig1.wasm uses the .by_name value interpret mode to make sure this works. The motivating use case is routine updates to things like CallingConvention and AddressSpace in response to target evolution.

After changing std.builtin, simply bootstrap again to get a working compiler.

Adding struct Fields

Adding fields to structs in std.builtin will work without a zig1.wasm update, as long as:

  • The compiler never uninterprets this type, or
  • The added field has a default value for the compiler to use when uninterpreting.

If neither of the above applies:

  1. Build a compiler with -Dvalue-interpret-mode=by_name.
  2. Give the field a default value that is safe for the current compiler implementation to return when uninterpreting the type.
  3. Use the compiler built in step 1 to run update-zig1.
  4. Remove the default value.
  5. Bootstrap to get a final working compiler.
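
The role of the default value in the steps above can be sketched with a toy example. All names here (CompilerKnown, InSource, the section field, uninterpretByName) are hypothetical and chosen only for illustration; the real logic lives in Value.uninterpret and is not reproduced here.

```zig
const std = @import("std");

// Hypothetical struct the step-1 compiler was built against: no `section`.
const CompilerKnown = struct {
    name: []const u8,
};

// Hypothetical struct in the updated std.builtin. `section` is the new
// field, and its default is what the older compiler falls back to (step 2).
const InSource = struct {
    name: []const u8,
    section: ?[]const u8 = null,
};

// Toy stand-in for by-name uninterpreting: fields present on both sides are
// copied by name, and anything missing relies on the declared default.
fn uninterpretByName(value: CompilerKnown) InSource {
    return .{ .name = value.name };
}

test "the default fills the field the older compiler knows nothing about" {
    const result = uninterpretByName(.{ .name = "main" });
    try std.testing.expectEqualStrings("main", result.name);
    try std.testing.expect(result.section == null);
}
```

Without the default on section, the struct literal in uninterpretByName would fail to compile, which is loosely analogous to the older compiler having no value to supply for the new field.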

Removing struct Fields

Removing fields from structs in std.builtin will work without a zig1.wasm update only if the compiler never interprets this type. Otherwise, the field can be removed by this process:

  1. Give the field a default value that is safe for the current compiler implementation to assume once the field is removed.
  2. Build a compiler with -Dvalue-interpret-mode=by_name.
  3. Remove the field in question.
  4. Use the compiler built in step 2 to run update-zig1.
  5. Bootstrap to get a final working compiler.
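
A toy sketch of why the default from step 1 matters: the compiler built in step 2 still expects the field, so when the source no longer declares it, the default is all it has. The struct names, the is_weak field, and interpretByName are hypothetical and stand in for the real Value.interpret logic.

```zig
const std = @import("std");

// Hypothetical struct the step-2 compiler was built against. `is_weak` is
// about to be removed and carries the default added in step 1.
const CompilerKnown = struct {
    name: []const u8,
    is_weak: bool = false,
};

// Hypothetical source-side declarations before and after step 3.
const SourceBefore = struct { name: []const u8, is_weak: bool = false };
const SourceAfter = struct { name: []const u8 };

// Toy stand-in for by-name interpreting: read `is_weak` only when the
// source type still declares it, otherwise keep the default.
fn interpretByName(value: anytype) CompilerKnown {
    var result: CompilerKnown = .{ .name = value.name };
    if (@hasField(@TypeOf(value), "is_weak")) result.is_weak = value.is_weak;
    return result;
}

test "the same compiler copes with both declarations" {
    const before = interpretByName(SourceBefore{ .name = "f", .is_weak = true });
    try std.testing.expect(before.is_weak);

    const after = interpretByName(SourceAfter{ .name = "f" });
    try std.testing.expect(!after.is_weak);
}
```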

Renaming Fields

To rename fields on a std.builtin type (enum, union, or struct), while keeping field order, field types, backing types, and tag values intact:

  1. Build a compiler with -Dvalue-interpret-mode=direct (this is the default).
  2. Rename the field(s) in question.
  3. Use the compiler built in step 1 to run update-zig1.
  4. Bootstrap to get a final working compiler.
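
The reason step 1 uses the direct mode can be seen in a toy sketch: after a pure rename, the tag values still line up, but the names do not. The enums and helper functions below are hypothetical and only approximate what the compiler does.

```zig
const std = @import("std");

// Hypothetical enum the step-1 compiler was built against.
const BuiltAgainst = enum(u8) { Unspecified, C, Naked };

// The same enum after the rename in step 2: order and tag values unchanged.
const Renamed = enum(u8) { auto, c, naked };

// Toy stand-in for the direct mode: relies only on matching tag values.
fn interpretDirect(value: Renamed) BuiltAgainst {
    return @enumFromInt(@intFromEnum(value));
}

// Toy stand-in for the by_name mode: relies only on matching field names.
fn interpretByName(value: Renamed) ?BuiltAgainst {
    return std.meta.stringToEnum(BuiltAgainst, @tagName(value));
}

test "a pure rename is invisible to direct but breaks by_name" {
    try std.testing.expectEqual(BuiltAgainst.Naked, interpretDirect(.naked));
    try std.testing.expect(interpretByName(.naked) == null);
}
```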

Other Changes

Any other change is a more "fundamental" change to a std.builtin type, such as when CallingConvention was turned from an enum into a tagged union, and requires a more involved update process. In short, you will need to keep both types intact to apply the initial zig1.wasm update: the old one so that the compiler updating zig1 can use it, and the new one for the new compiler code to reference.

Say you wish to change the type std.builtin.Foo. The process is as follows.

  1. Rather than modifying Foo, define the new type under std.builtin.NewFoo. Leave the existing type definition untouched for now.
  2. Modify the compiler (and, where required to build the compiler, the standard library) to use NewFoo everywhere, and completely implement the new compiler behavior. However, when the compiler actually gets the corresponding Type from std.builtin at runtime, it should use the name "Foo", not "NewFoo". For instance, a Zcu.BuiltinDecl field should be called .Foo, not .NewFoo.
  3. Using an existing compiler, run update-zig1. Because the old Foo is intact, the compiler being used can reference it correctly; but the compiler being built uses the new type, currently named NewFoo.
  4. Delete the old Foo declaration, and rename NewFoo to Foo. Update all references to NewFoo to instead reference Foo.
  5. Bootstrap to get a final working compiler.
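
The two states of std.builtin in this process can be sketched as follows. This is a toy model: the namespaces builtin_during_update and builtin_after_update, the field layouts, and lookupFoo are all hypothetical, with lookupFoo standing in for the compiler fetching the type from std.builtin under the name "Foo".

```zig
const std = @import("std");

// Steps 1 to 3: the old `Foo` is untouched and the new definition sits
// next to it under a temporary name.
const builtin_during_update = struct {
    pub const Foo = enum { a, b };
    pub const NewFoo = union(enum) { a: u8, b };
};

// Step 4: the old declaration is deleted and `NewFoo` takes over the name.
const builtin_after_update = struct {
    pub const Foo = union(enum) { a: u8, b };
};

// Toy stand-in for the by-name lookup, which always asks for "Foo" and
// never for "NewFoo".
fn lookupFoo(comptime namespace: type) type {
    return @field(namespace, "Foo");
}

test "the name Foo resolves to the old type first, then to the new one" {
    // During the update, an existing compiler still finds the old enum.
    try std.testing.expect(lookupFoo(builtin_during_update) == builtin_during_update.Foo);
    try std.testing.expect(lookupFoo(builtin_during_update) != builtin_during_update.NewFoo);

    // After the rename, the same lookup yields the new tagged union.
    const value: lookupFoo(builtin_after_update) = .{ .a = 1 };
    try std.testing.expect(value.a == 1);
}
```

Keeping the lookup name fixed at "Foo" is what lets the freshly built zig1.wasm keep working once step 4 renames the declaration.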