Bootstrapping
What does Multiplier's bootstrap process do, and why do we do it? Before reading this, you should read about PASTA's bootstrap process.
- Creates the serializable form that covers the various API surface areas of PASTA (`pasta::Decl`, `pasta::Stmt`, etc.). This is saved in `lib/AST.capnp`. Right now we use Cap'n Proto. The way we persistently represent objects is somewhat opaque. We have a concept of "typed slots," with roughly one slot per method. Slots don't have meaningful names, because the same slot can have different meanings in different classes. The key challenge is that PASTA has a class hierarchy (because of Clang), and a given object needs to be castable up and down that hierarchy without changing. Cap'n Proto doesn't support subtyping, and it exposes data through a reader interface; it wouldn't be safe, or really even possible, to do arbitrary pointer arithmetic on its underlying storage to emulate changing the type through which we view the backing storage. Instead, all entities in a given class hierarchy (e.g. `Decl`s) uniformly see the same serialized representation.
- Creates code to persist the surface area of PASTA's AST API into a serializable form (i.e. Cap'n Proto). A serializer for a class needs to call the serializers for that class's base classes, if any. Then it needs to call each method and perform type-specific serialization of the method's return value into a designated "slot" in the persistent representation. For example, if a method in PASTA returns a `pasta::Decl`, then the serializer uses an `indexer::EntityMapper` to look up the `mx::RawEntityId` (a 64-bit entity id) for that declaration, and stores that integer into the persistent form. It knows how to store a variety of types (enums, integers, optionals, vectors, etc.).
- Creates a "clone" of PASTA's API to access serialized data as though we were dealing with a normal object graph. Whereas PASTA's methods call their Clang counterparts, the implementations of Multiplier's API methods read their data out of the serialized storage form (i.e. Cap'n Proto). For example, the value behind `mx::Decl::canonical_declaration` is a `pasta::Decl` in PASTA and an `mx::RawEntityId` in persistent form. The bootstrapped `mx::Decl::canonical_declaration` method therefore needs to read this entity id out of the slot corresponding to `canonical_declaration`, and then ask `mx::EntityProvider::DeclFor` to look up the entity id, so that it can return an `mx::Decl` object.
- Removes some PASTA methods from Multiplier's API surface. Our approach to serialization is based on saturation: we call all the methods (for which we support serialization), and if a return value is an entity that we haven't seen yet, then we queue it up for serialization too. Some methods, like `pasta::Type::WithConst`, could lead us to entities that aren't needed (i.e. aren't referenced by anything in the actual code), so we don't want to include them.
- Removes some unsafe PASTA methods from Multiplier's API surface. Some methods are impossible to always use correctly and lead to assertion failures. There is a blacklisting mechanism to remove these methods.
- Canonicalizes enumerators so that they are all trivially enumerable, i.e. their entries all have default values. This means generating "migrators" from PASTA enums to Multiplier ones.
- Renames things in the API to be consistent.
- Adds convenience methods, e.g. `mx::Type::tokens`. In Clang and PASTA, types don't have tokens, whereas in Multiplier they do. Multiplier types can all be rendered in a printable way, because we invent their tokens (when indexing) using the `pasta::PrintedTokenRange::Create` API.
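
To make the "typed slots" and serializer items above more concrete, here is a minimal sketch of the idea. It is not the generated code: `Decl`, `NamedDecl`, `Record`, `EntityMapper::IdOf`, and the slot constants are all hypothetical stand-ins, and a plain `std::vector` replaces Cap'n Proto storage. It only illustrates that every class in a hierarchy writes into the same flat record, and that a derived serializer calls its base serializer first.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using RawEntityId = std::uint64_t;  // stand-in for mx::RawEntityId

// Hypothetical two-level hierarchy, standing in for pasta::Decl and a subclass.
struct Decl {
  const Decl *canonical = nullptr;
  const Decl *CanonicalDecl() const { return canonical ? canonical : this; }
};
struct NamedDecl : Decl {
  unsigned name_length = 0;
  unsigned NameLength() const { return name_length; }
};

// One flat record shape shared by every class in the hierarchy.
struct Record {
  std::vector<std::uint64_t> slots;  // unnamed slots; here they are all 64-bit
};

// Hypothetical slot assignment: roughly one slot per method.
enum : unsigned { kSlotCanonical = 0, kSlotNameLength = 1, kNumSlots = 2 };

// Stand-in for indexer::EntityMapper: maps AST objects to entity ids.
struct EntityMapper {
  std::unordered_map<const Decl *, RawEntityId> ids;
  RawEntityId IdOf(const Decl *d) const { return ids.at(d); }
};

// Base serializer: an entity-valued method is stored as an entity id.
inline void SerializeDecl(const EntityMapper &em, const Decl &d, Record &out) {
  out.slots.resize(kNumSlots);
  out.slots[kSlotCanonical] = em.IdOf(d.CanonicalDecl());
}

// Derived serializer: run the base serializer, then fill this class's slots.
inline void SerializeNamedDecl(const EntityMapper &em, const NamedDecl &d,
                               Record &out) {
  SerializeDecl(em, d, out);
  assert(out.slots.size() == kNumSlots);
  out.slots[kSlotNameLength] = d.NameLength();
}
```

Because the record layout is the same whether it is viewed as a `Decl` or a `NamedDecl`, casting up or down the hierarchy never requires reinterpreting the stored data.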
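
The read side described in the API "clone" item can be sketched in the same spirit. The classes below are simplified, hypothetical stand-ins for `mx::Decl` and `mx::EntityProvider` (the real signatures are not shown in this page and may differ): the accessor reads an entity id out of a slot and asks the provider to turn it back into an object.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

using RawEntityId = std::uint64_t;  // stand-in for mx::RawEntityId

struct Record {
  std::vector<std::uint64_t> slots;
};
enum : unsigned { kSlotCanonical = 0 };  // hypothetical slot index

class EntityProvider;

// Simplified stand-in for mx::Decl: a view over a serialized record.
class Decl {
 public:
  Decl(const EntityProvider *ep, RawEntityId id, const Record *rec)
      : ep_(ep), id_(id), rec_(rec) {}
  RawEntityId id() const { return id_; }
  Decl canonical_declaration() const;

 private:
  const EntityProvider *ep_;
  RawEntityId id_;
  const Record *rec_;
};

// Simplified stand-in for mx::EntityProvider: resolves ids to entities.
class EntityProvider {
 public:
  void Add(RawEntityId id, Record rec) { records_[id] = std::move(rec); }

  std::optional<Decl> DeclFor(RawEntityId id) const {
    auto it = records_.find(id);
    if (it == records_.end()) return std::nullopt;
    return Decl(this, id, &it->second);
  }

 private:
  std::unordered_map<RawEntityId, Record> records_;
};

// Read the id from the slot, then resolve it through the provider.
inline Decl Decl::canonical_declaration() const {
  RawEntityId id = rec_->slots.at(kSlotCanonical);
  std::optional<Decl> d = ep_->DeclFor(id);
  assert(d.has_value() && "canonical declaration must have been serialized");
  return *d;
}
```

A caller sees an ordinary object graph: `decl.canonical_declaration()` returns another `Decl`, even though only an integer was stored.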
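
The saturation approach, and why some methods are excluded from it, can be illustrated with a small worklist loop. This is a sketch under assumptions, not the indexer's actual algorithm: `Method`, `Saturate`, and the string-based exclusion set are hypothetical, and entities are reduced to integer ids.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <vector>

using EntityId = std::uint64_t;

// Hypothetical model of an API method: a name plus a function that returns
// the entities reachable from a given entity through that method.
struct Method {
  std::string name;
  std::function<std::vector<EntityId>(EntityId)> call;
};

// Call every non-excluded method on every known entity, and queue any
// returned entity that has not been seen yet, until nothing new appears.
inline std::set<EntityId> Saturate(const std::vector<EntityId> &roots,
                                   const std::vector<Method> &methods,
                                   const std::set<std::string> &excluded) {
  std::set<EntityId> seen(roots.begin(), roots.end());
  std::deque<EntityId> work(roots.begin(), roots.end());
  while (!work.empty()) {
    EntityId e = work.front();
    work.pop_front();
    for (const Method &m : methods) {
      if (excluded.count(m.name)) continue;  // method removed from the surface
      assert(m.call && "method must be callable");
      for (EntityId r : m.call(e)) {
        if (seen.insert(r).second) work.push_back(r);
      }
    }
  }
  return seen;
}
```

A method that manufactures new entities (as `pasta::Type::WithConst` can) enlarges the saturated set with entities that nothing in the code refers to; leaving it out of `methods`, or listing it in `excluded`, keeps the set to what is actually reachable.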
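
For the enum canonicalization item, a "migrator" might look roughly like the following. The enums and the names `FromPasta` and `NumEnumerators` are assumptions made for illustration; the sketch only shows the shape of mapping a sparsely valued enum onto one whose enumerators all take their default values.

```cpp
#include <cassert>

// Hypothetical PASTA-side enum with sparse, explicitly assigned values.
enum class PastaAccess : int { kNone = 0, kPublic = 4, kPrivate = 16 };

// Hypothetical Multiplier-side enum: every enumerator uses its default
// value, so the values are 0, 1, 2 and can be iterated trivially.
enum class MxAccess : unsigned char { NONE, PUBLIC, PRIVATE };

// Number of enumerators, usable as a loop bound.
constexpr unsigned NumEnumerators(MxAccess) { return 3u; }

// The migrator: an exhaustive mapping from the PASTA enum to the dense one.
inline MxAccess FromPasta(PastaAccess a) {
  switch (a) {
    case PastaAccess::kNone: return MxAccess::NONE;
    case PastaAccess::kPublic: return MxAccess::PUBLIC;
    case PastaAccess::kPrivate: return MxAccess::PRIVATE;
  }
  assert(false && "unhandled PastaAccess enumerator");
  return MxAccess::NONE;
}
```

With dense values, client code can loop from `0` to `NumEnumerators(...) - 1` and cast each index to the enum without hitting gaps.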