From 207da72c6beae87012a46efd1e4c7e63e8af2dca Mon Sep 17 00:00:00 2001 From: Aleksey Kliger Date: Wed, 27 Mar 2024 15:32:27 -0400 Subject: [PATCH 01/11] [cdac] Physical contract descriptor spec --- .../datacontracts/contract-descriptor.md | 63 +++++++++++++++++++ docs/design/datacontracts/data_descriptor.md | 5 +- .../datacontracts/datacontracts_design.md | 2 +- 3 files changed, 66 insertions(+), 4 deletions(-) create mode 100644 docs/design/datacontracts/contract-descriptor.md diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md new file mode 100644 index 00000000000000..c141d388e4d386 --- /dev/null +++ b/docs/design/datacontracts/contract-descriptor.md @@ -0,0 +1,63 @@ +# Contract Descritptor + +## Summary + +The [data contracts design](./datacontracts_design.md) is a mechanism that allows diagnostic tooling +to understand the behavior of certain .NET runtime subsystems and data structures. In a typical +scenario, a diagnostic tool such as a debugger may have access to a target .NET process (or a memory +dump of such a process) from which it may request to read and write certain regions of memory. + +This document describes a mechanism by which a diagnostic tool may acquire the following information: +* some details about the target process' architecture +* a collection of types and their sizes and/or the offsets of certain fields within each type +* a collection of global values +* a collection of /algorithmic contracts/ that are satisfied by the target process + +## Contract descriptor + +The contract descriptor consists of the follow structure. All multi-byte values are in target architecture endianness. 
+ +```c +struct DotNetRuntimeContractDescriptor +{ + uint64_t magic; + uint32_t flags; + uint32_t aux_data_count; + char *data_descriptor; + uint64_t *aux_data; + char *compatible_contracts; +}; +``` + +The `magic` is `0x44_4e_43_43_44_41_43_00` ("DNCCDAC\0") stored using the target architecture +endianness. (N.B. this is sufficient to discover the target arhcitecture endianness by comparing the +value in memory to `0x44_4e_43_43_44_41_43_00` and to `0x00_43_41_44_43_43_4e_44`) + +Flags. The following bits are defined: + +| Bits 31-2 | Bit 1 | Bit 0 | +| --------- | ------- | ----- | +| Reserved | ptrSize | 1 | + +If `ptrSize` is 0, the architecture is 64-bit. If it is 1, the architecture is 32-bit. The +reserved bits should be written as zero. Diagnostic tooling may ignore non-zero reserved bits. + +The `data_descriptor` is a pointer to a json string described in [data descriptor physical layout](./data_descriptor.md#Physical_JSON_descriptor). + +The auxiliary data for the JSON descriptor is stored at the location `aux_data` in `aux_data_count` 64-bit slots. + +The `compatible_contracts` are a json string giving the [compatible contracts](./datacontracts_design.md#Compatible_Contract). The compatible contracts are given as a json array where each element is a dictionary. The dictionary will have a `c` key giving the name of the compatible contract as a string, and a `v` key giving the contract version as an integer. For example: + +``` jsonc +[{"c":"Thread","v":1},{"c":"GCHandle","v":1},...] +``` + +## Contract symbol + +To aid in the discovery of the contract descriptor, the contract should be exported by the target +process with the name `DotNetRuntimeContractDescriptor`. (Using the C symbol conventions of the +target platform. That is, on platforms where such symbols typicall have an `_` prepended, this +symbol should be exported as `_DotNetRuntimeContractDescriptor`) + +**FIXME** What about scenarios such as a NativeAOT library hosted inside a native process? 
What if +there are two such libraries? diff --git a/docs/design/datacontracts/data_descriptor.md b/docs/design/datacontracts/data_descriptor.md index cd0d5ce92e82c5..fcadbd403ac1ea 100644 --- a/docs/design/datacontracts/data_descriptor.md +++ b/docs/design/datacontracts/data_descriptor.md @@ -243,9 +243,8 @@ Rationale: This allows tooling to generate the in-memory data descriptor as a si string. For pointers, the address can be stored at a known offset in an in-proc array of pointers and the offset written into the constant JSON string. -The indirection array is not part of the data descriptor spec. It is expected that the data -contract descriptor will include it. (The data contract descriptor must contain: the data -descriptor, the set of compatible algorithmic contracts, the aux array of globals). +The indirection array is not part of the data descriptor spec. It is part of the [contract +descriptor](./contract_descriptor.md#Contract_descriptor). diff --git a/docs/design/datacontracts/datacontracts_design.md b/docs/design/datacontracts/datacontracts_design.md index f88e0abfd06e5a..1c131844c6e194 100644 --- a/docs/design/datacontracts/datacontracts_design.md +++ b/docs/design/datacontracts/datacontracts_design.md @@ -12,7 +12,7 @@ Diagnostic data contract addressed these challenges by eliminating the need for Data contracts represent the manner in which a tool which is not the runtime can reliably understand and observe the behavior of the runtime. Contracts are defined by their documentation, and the runtime describes what contracts are applicable to understanding that runtime. ## Data Contract Descriptor -The physical layout of this data is not defined in this document, but its practical effects are. +The physical layout of this data is defined in [the contract descriptor](./contract_descriptor.md) doc, its practical effects are discussed here. The Data Contract Descriptor has a set of records of the following forms. 
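The `magic` and `flags` fields introduced in the patch above are what let a reader bootstrap without prior knowledge of the target. The following is a minimal sketch of that check, not part of the patch series. It assumes the first 12 bytes of the descriptor have already been copied out of the target into a local buffer, and it relies only on `magic` being at offset 0 and `flags` at offset 8, as in the struct above. The names `probe_contract_descriptor`, `target_info`, and `target_endian` are hypothetical and chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* "DNCCDAC\0" as specified for the `magic` field. */
#define CONTRACT_MAGIC UINT64_C(0x444e434344414300)

/* Hypothetical helper types; not defined by the spec. */
typedef enum { ENDIAN_UNKNOWN = 0, ENDIAN_LITTLE, ENDIAN_BIG } target_endian;

typedef struct {
    target_endian endian;
    int pointer_size; /* in bytes: 4 or 8 */
} target_info;

/* Decode 8 bytes as a big-endian integer, independent of host endianness. */
static uint64_t read_u64_be(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Decode 8 bytes as a little-endian integer, independent of host endianness. */
static uint64_t read_u64_le(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Decode 4 bytes using the endianness discovered from the magic. */
static uint32_t read_u32(const uint8_t *p, target_endian e)
{
    if (e == ENDIAN_BIG)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/* Returns 0 and fills *out on success, -1 if the bytes do not look like
   a contract descriptor header. */
static int probe_contract_descriptor(const uint8_t *buf, size_t len, target_info *out)
{
    if (len < 12)
        return -1; /* need magic (8 bytes) plus flags (4 bytes) */

    if (read_u64_le(buf) == CONTRACT_MAGIC)
        out->endian = ENDIAN_LITTLE;
    else if (read_u64_be(buf) == CONTRACT_MAGIC)
        out->endian = ENDIAN_BIG;
    else
        return -1;

    uint32_t flags = read_u32(buf + 8, out->endian);
    if ((flags & 0x1u) == 0)
        return -1; /* bit 0 is specified to be 1 */

    /* Bit 1 is ptrSize: 0 means 64-bit, 1 means 32-bit. Reserved bits are ignored. */
    out->pointer_size = (flags & 0x2u) ? 4 : 8;
    return 0;
}
```

A tool would call `probe_contract_descriptor` on the bytes found at the exported symbol's address before dereferencing any pointer field, since the pointer width and byte order are only known once this check succeeds.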
From 68a601ef3006d22ac113fc5074570c8da77582b0 Mon Sep 17 00:00:00 2001 From: Aleksey Kliger Date: Thu, 28 Mar 2024 13:10:30 -0400 Subject: [PATCH 02/11] Add "contracts" to the data descriptor; spec Unix-y weak symbol shennanigans --- .../datacontracts/contract-descriptor.md | 116 +++++++++++++++--- docs/design/datacontracts/data_descriptor.md | 4 + 2 files changed, 103 insertions(+), 17 deletions(-) diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md index c141d388e4d386..5b6eb0bd5b6991 100644 --- a/docs/design/datacontracts/contract-descriptor.md +++ b/docs/design/datacontracts/contract-descriptor.md @@ -1,4 +1,4 @@ -# Contract Descritptor +# Contract Descriptor ## Summary @@ -21,43 +21,125 @@ The contract descriptor consists of the follow structure. All multi-byte values struct DotNetRuntimeContractDescriptor { uint64_t magic; - uint32_t flags; + uint32_t size_and_flags; uint32_t aux_data_count; - char *data_descriptor; + uint32_t descriptor_size; + uint32_t reserved; + char *descriptor; uint64_t *aux_data; - char *compatible_contracts; }; + +struct DotNetRuntimeContractDescriptorList +{ + struct DotNetRuntimeContractDescriptor descriptor; + struct DotNetRuntimeContractDescriptorList *next_runtime; +} ``` The `magic` is `0x44_4e_43_43_44_41_43_00` ("DNCCDAC\0") stored using the target architecture -endianness. (N.B. this is sufficient to discover the target arhcitecture endianness by comparing the +endianness. (N.B. this is sufficient to discover the target architecture endianness by comparing the value in memory to `0x44_4e_43_43_44_41_43_00` and to `0x00_43_41_44_43_43_4e_44`) Flags. The following bits are defined: -| Bits 31-2 | Bit 1 | Bit 0 | -| --------- | ------- | ----- | -| Reserved | ptrSize | 1 | +| Bits 31-3 | Bit 2 | Bit 1 | Bit 0 | +| --------- | ------ | ------- | ----- | +| Reserved | isList | ptrSize | 1 | If `ptrSize` is 0, the architecture is 64-bit. 
If it is 1, the architecture is 32-bit. The reserved bits should be written as zero. Diagnostic tooling may ignore non-zero reserved bits. -The `data_descriptor` is a pointer to a json string described in [data descriptor physical layout](./data_descriptor.md#Physical_JSON_descriptor). +If `isList` is 1, the descriptor is actually a `DotNetRuntimeContractDescriptorList` (that is, it +has a `next_runtime` field at the end. See "Unix symbol", below.) If `isList` is 0, the descriptor +does not have a `next_runtime` field. + +The `descriptor` is a pointer to a json string described in [data descriptor physical layout](./data_descriptor.md#Physical_JSON_descriptor). The total length (including nul terminator character) is given by `descriptor_size`. The auxiliary data for the JSON descriptor is stored at the location `aux_data` in `aux_data_count` 64-bit slots. -The `compatible_contracts` are a json string giving the [compatible contracts](./datacontracts_design.md#Compatible_Contract). The compatible contracts are given as a json array where each element is a dictionary. The dictionary will have a `c` key giving the name of the compatible contract as a string, and a `v` key giving the contract version as an integer. For example: +The `next_runtime` field is used to support multiple .NET runtimes in a single process. See below. + +### Compatible contracts + +The `descriptor` is a JSON dictionary that is used for storing the [in-memory data descriptor](./data_descriptor.md#Physical_JSON_Descriptor) +and the [compatible contracts](./datacontracts_design.md#Compatible_Contract). + +The compatible contracts are stored in the toplevel key `"contracts"`. The value will be a +dictionary that contains each contract name as a key. Each value is the version of the contract as +a JSON integer constant. + +**Contract example**: ``` jsonc -[{"c":"Thread","v":1},{"c":"GCHandle","v":1},...] 
+{"Thread":1,"GCHandle":1,...} ``` - + +**Complete in-memory data descriptor example**: + +``` jsonc +{ + "version": "0", + "baseline": "example-64", + "types": + { + "Thread": { "ThreadId": 32, "ThreadState": 0, "Next": 128 }, + "ThreadStore": { "ThreadCount": 32, "ThreadList": 8 } + }, + "globals": + { + "FEATURE_COMINTEROP": 0, + "s_pThreadStore": [ 0 ] // indirect from aux data offset 0 + } + "contracts": {"Thread": 1,"GCHandle": 1, "ThreadStore": 1} +} +``` + ## Contract symbol To aid in the discovery of the contract descriptor, the contract should be exported by the target -process with the name `DotNetRuntimeContractDescriptor`. (Using the C symbol conventions of the -target platform. That is, on platforms where such symbols typicall have an `_` prepended, this -symbol should be exported as `_DotNetRuntimeContractDescriptor`) +process with the name `DotNetRuntimeContractDescriptor`. + +The meaning of the symbol differs on Windows and non-Windows platforms. + +### Windows + +Multiple DLLs loaded by a process may host a single .NET runtime. Each DLL shall export the symbol +`DotNetRuntimeContractDescriptor` pointing to a `struct DotNetRuntimeContractDescriptor`. It is +expected that `isList` will be 0. + +### Non-Windows + +In a process, each shared object containing a .NET runtime shall weakly-export the symbol +`DotNetRuntimeContractDescriptor` (Using the C symbol conventions of the target platform. That is, +on platforms where such symbols typically have an `_` prepended, this symbol should be exported as +`_DotNetRuntimeContractDescriptor`) with a null initial value. 
As each .NET runtime in the process starts +up, it shall atomically store a pointer to a `struct DotNetRuntimeContractDescriptorList` in +`DotNetRuntimeContractDescriptor` where `next_runtime` points to the previous value of +`DotNetRuntimeContractDescriptor` as if by the following C code: + +``` c +typedef struct DotNetRuntimeContractDescriptorList* DescPtr; +typedef _Atomic(DescPtr) AtomicDescPtr; + +// global weak symbol +AtomicDescPtr __attribute__((weak)) DotNetRuntimeContractDescriptor; + +static const struct DotNetRuntimeContractDescriptor g_private_descriptor = { ... }; // predefined descriptor for current runtime + +// to be called at startup +void +install_descriptor(void) +{ + DescPtr descriptor = malloc(sizeof(struct DotNetRuntimeContractDescriptorList)); + assert (descriptor != NULL); + descriptor->descriptor = g_private_descriptor; // copy the constant values + descriptor->next_runtime = NULL; + + DescPtr prev = atomic_load(&DotNetRuntimeContractDescriptor); + do + { + descriptor->next_runtime = prev; + } while (!atomic_compare_exchange_weak(&DotNetRuntimeConctractDescriptor, &prev, descriptor)); +} +``` -**FIXME** What about scenarios such as a NativeAOT library hosted inside a native process? What if -there are two such libraries? diff --git a/docs/design/datacontracts/data_descriptor.md b/docs/design/datacontracts/data_descriptor.md index fcadbd403ac1ea..1338e1ae87aa60 100644 --- a/docs/design/datacontracts/data_descriptor.md +++ b/docs/design/datacontracts/data_descriptor.md @@ -130,6 +130,10 @@ The toplevel dictionary will contain: * `"types": TYPES_DESCRIPTOR` see below * `"globals": GLOBALS_DESCRIPTOR` see below +Additional toplevel keys may be present. For example, the in-memory data descriptor will contain a +`"contracts"` key (see [contract descriptor](./contract_descriptor.md#Compatible_contracts)) for the +set of compatible contracts. 
+ ### Baseline data descriptor identifier The in-memory descriptor may contain an optional string identifying a well-known baseline From 5af68699ac8a76bd112fd5081db08fa81f2fce76 Mon Sep 17 00:00:00 2001 From: Aleksey Kliger Date: Thu, 28 Mar 2024 13:13:51 -0400 Subject: [PATCH 03/11] add a runtime name --- docs/design/datacontracts/contract-descriptor.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md index 5b6eb0bd5b6991..34cb8136d4cf5c 100644 --- a/docs/design/datacontracts/contract-descriptor.md +++ b/docs/design/datacontracts/contract-descriptor.md @@ -32,6 +32,7 @@ struct DotNetRuntimeContractDescriptor struct DotNetRuntimeContractDescriptorList { struct DotNetRuntimeContractDescriptor descriptor; + const char *runtime_name; struct DotNetRuntimeContractDescriptorList *next_runtime; } ``` @@ -117,6 +118,10 @@ up, it shall atomically store a pointer to a `struct DotNetRuntimeContractDescri `DotNetRuntimeContractDescriptor` where `next_runtime` points to the previous value of `DotNetRuntimeContractDescriptor` as if by the following C code: +The `runtime_name` is an arbitrary identifier to aid diagnostic tooling in identifying the current +runtime. (For example hosted runtimes may want to embed the name of the host; a desktop runtime may +use just the runtime flavor and version) + ``` c typedef struct DotNetRuntimeContractDescriptorList* DescPtr; typedef _Atomic(DescPtr) AtomicDescPtr; @@ -128,11 +133,12 @@ static const struct DotNetRuntimeContractDescriptor g_private_descriptor = { ... 
// to be called at startup void -install_descriptor(void) +install_descriptor(const char *runtime_name) { DescPtr descriptor = malloc(sizeof(struct DotNetRuntimeContractDescriptorList)); assert (descriptor != NULL); descriptor->descriptor = g_private_descriptor; // copy the constant values + descriptor->runtime_name = runtime_name; descriptor->next_runtime = NULL; DescPtr prev = atomic_load(&DotNetRuntimeContractDescriptor); From 4a3a0aef571cc75d7e9c9ac790243a143c2e9b46 Mon Sep 17 00:00:00 2001 From: Aleksey Kliger Date: Thu, 28 Mar 2024 13:17:36 -0400 Subject: [PATCH 04/11] reword and constness --- docs/design/datacontracts/contract-descriptor.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md index 34cb8136d4cf5c..a04d07e28c1659 100644 --- a/docs/design/datacontracts/contract-descriptor.md +++ b/docs/design/datacontracts/contract-descriptor.md @@ -25,7 +25,7 @@ struct DotNetRuntimeContractDescriptor uint32_t aux_data_count; uint32_t descriptor_size; uint32_t reserved; - char *descriptor; + const char *descriptor; uint64_t *aux_data; }; @@ -51,7 +51,7 @@ If `ptrSize` is 0, the architecture is 64-bit. If it is 1, the architecture is reserved bits should be written as zero. Diagnostic tooling may ignore non-zero reserved bits. If `isList` is 1, the descriptor is actually a `DotNetRuntimeContractDescriptorList` (that is, it -has a `next_runtime` field at the end. See "Unix symbol", below.) If `isList` is 0, the descriptor +has a `next_runtime` field at the end. See "Non-Windows", below.) If `isList` is 0, the descriptor does not have a `next_runtime` field. The `descriptor` is a pointer to a json string described in [data descriptor physical layout](./data_descriptor.md#Physical_JSON_descriptor). The total length (including nul terminator character) is given by `descriptor_size`. @@ -90,7 +90,7 @@ a JSON integer constant. 
{ "FEATURE_COMINTEROP": 0, "s_pThreadStore": [ 0 ] // indirect from aux data offset 0 - } + }, "contracts": {"Thread": 1,"GCHandle": 1, "ThreadStore": 1} } ``` @@ -116,7 +116,7 @@ on platforms where such symbols typically have an `_` prepended, this symbol sho `_DotNetRuntimeContractDescriptor`) with a null initial value. As each .NET runtime in the process starts up, it shall atomically store a pointer to a `struct DotNetRuntimeContractDescriptorList` in `DotNetRuntimeContractDescriptor` where `next_runtime` points to the previous value of -`DotNetRuntimeContractDescriptor` as if by the following C code: +`DotNetRuntimeContractDescriptor` as by the C code below. The `runtime_name` is an arbitrary identifier to aid diagnostic tooling in identifying the current runtime. (For example hosted runtimes may want to embed the name of the host; a desktop runtime may From 77afeac47e2b5a032fbfa77b0aa901669cf49c0e Mon Sep 17 00:00:00 2001 From: Aleksey Kliger Date: Fri, 29 Mar 2024 09:29:16 -0400 Subject: [PATCH 05/11] fixup example --- .../datacontracts/contract-descriptor.md | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md index a04d07e28c1659..4f725083361471 100644 --- a/docs/design/datacontracts/contract-descriptor.md +++ b/docs/design/datacontracts/contract-descriptor.md @@ -31,7 +31,7 @@ struct DotNetRuntimeContractDescriptor struct DotNetRuntimeContractDescriptorList { - struct DotNetRuntimeContractDescriptor descriptor; + const struct DotNetRuntimeContractDescriptor *descriptor; const char *runtime_name; struct DotNetRuntimeContractDescriptorList *next_runtime; } @@ -129,15 +129,21 @@ typedef _Atomic(DescPtr) AtomicDescPtr; // global weak symbol AtomicDescPtr __attribute__((weak)) DotNetRuntimeContractDescriptor; -static const struct DotNetRuntimeContractDescriptor g_private_descriptor = { ... 
}; // predefined descriptor for current runtime +// predefined descriptor for current runtime +static const struct DotNetRuntimeContractDescriptor g_private_descriptor = { ... }; + +// install_descriptor will try to assign the address of s_runtime_descriptor to the global symbol +static struct DotNetRuntimeContractDescriptorList s_runtime_descriptor = { + .descriptor = &g_private_descriptor, + .runtime_name = NULL, + .next_runtime = NULL +}; // to be called at startup void install_descriptor(const char *runtime_name) { - DescPtr descriptor = malloc(sizeof(struct DotNetRuntimeContractDescriptorList)); - assert (descriptor != NULL); - descriptor->descriptor = g_private_descriptor; // copy the constant values + DescPtr descriptor = &s_runtime_descriptor; descriptor->runtime_name = runtime_name; descriptor->next_runtime = NULL; @@ -145,7 +151,8 @@ install_descriptor(const char *runtime_name) do { descriptor->next_runtime = prev; - } while (!atomic_compare_exchange_weak(&DotNetRuntimeConctractDescriptor, &prev, descriptor)); + } + while (!atomic_compare_exchange_weak(&DotNetRuntimeContractDescriptor, &prev, descriptor)); } ``` From 7640997ab213b3857a35b718d264fb5eab42744b Mon Sep 17 00:00:00 2001 From: Aleksey Kliger Date: Fri, 29 Mar 2024 09:37:52 -0400 Subject: [PATCH 06/11] markdownlint --- docs/design/datacontracts/contract-descriptor.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md index 4f725083361471..1a7421ef7a312d 100644 --- a/docs/design/datacontracts/contract-descriptor.md +++ b/docs/design/datacontracts/contract-descriptor.md @@ -8,7 +8,7 @@ scenario, a diagnostic tool such as a debugger may have access to a target .NET dump of such a process) from which it may request to read and write certain regions of memory. 
This document describes a mechanism by which a diagnostic tool may acquire the following information: -* some details about the target process' architecture +* some details about the target process' architecture * a collection of types and their sizes and/or the offsets of certain fields within each type * a collection of global values * a collection of /algorithmic contracts/ that are satisfied by the target process @@ -16,7 +16,7 @@ This document describes a mechanism by which a diagnostic tool may acquire the f ## Contract descriptor The contract descriptor consists of the follow structure. All multi-byte values are in target architecture endianness. - + ```c struct DotNetRuntimeContractDescriptor { From a63782d3c872d1ba5d402ebbae9893111612f9b6 Mon Sep 17 00:00:00 2001 From: Aleksey Kliger Date: Fri, 29 Mar 2024 09:41:53 -0400 Subject: [PATCH 07/11] don't use a pointer the issue is that we can't dereference pointers until we know the target endianness and pointer size. which we can only discover by reading the actual descriptor magic and flags --- docs/design/datacontracts/contract-descriptor.md | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md index 1a7421ef7a312d..6414fb48880ce3 100644 --- a/docs/design/datacontracts/contract-descriptor.md +++ b/docs/design/datacontracts/contract-descriptor.md @@ -31,7 +31,7 @@ struct DotNetRuntimeContractDescriptor struct DotNetRuntimeContractDescriptorList { - const struct DotNetRuntimeContractDescriptor *descriptor; + struct DotNetRuntimeContractDescriptor descriptor; const char *runtime_name; struct DotNetRuntimeContractDescriptorList *next_runtime; } @@ -133,17 +133,15 @@ AtomicDescPtr __attribute__((weak)) DotNetRuntimeContractDescriptor; static const struct DotNetRuntimeContractDescriptor g_private_descriptor = { ... 
}; // install_descriptor will try to assign the address of s_runtime_descriptor to the global symbol -static struct DotNetRuntimeContractDescriptorList s_runtime_descriptor = { - .descriptor = &g_private_descriptor, - .runtime_name = NULL, - .next_runtime = NULL -}; +static struct DotNetRuntimeContractDescriptorList s_runtime_descriptor = {0,}; // to be called at startup void install_descriptor(const char *runtime_name) { DescPtr descriptor = &s_runtime_descriptor; + // initialize with a copy of the predefined descriptor data + descriptor->descriptor = g_private_descriptor; descriptor->runtime_name = runtime_name; descriptor->next_runtime = NULL; From 2a23d12d361a619cba7dc0e9d13c577769785b38 Mon Sep 17 00:00:00 2001 From: Aleksey Kliger Date: Fri, 29 Mar 2024 09:43:49 -0400 Subject: [PATCH 08/11] flags not size_and_flags --- docs/design/datacontracts/contract-descriptor.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md index 6414fb48880ce3..bc8f47ecc7138e 100644 --- a/docs/design/datacontracts/contract-descriptor.md +++ b/docs/design/datacontracts/contract-descriptor.md @@ -21,7 +21,7 @@ The contract descriptor consists of the follow structure. All multi-byte values struct DotNetRuntimeContractDescriptor { uint64_t magic; - uint32_t size_and_flags; + uint32_t flags; uint32_t aux_data_count; uint32_t descriptor_size; uint32_t reserved; @@ -41,7 +41,7 @@ The `magic` is `0x44_4e_43_43_44_41_43_00` ("DNCCDAC\0") stored using the target endianness. (N.B. this is sufficient to discover the target architecture endianness by comparing the value in memory to `0x44_4e_43_43_44_41_43_00` and to `0x00_43_41_44_43_43_4e_44`) -Flags. 
The following bits are defined: +The following `flags` bits are defined: | Bits 31-3 | Bit 2 | Bit 1 | Bit 0 | | --------- | ------ | ------- | ----- | @@ -58,7 +58,7 @@ The `descriptor` is a pointer to a json string described in [data descriptor phy The auxiliary data for the JSON descriptor is stored at the location `aux_data` in `aux_data_count` 64-bit slots. -The `next_runtime` field is used to support multiple .NET runtimes in a single process. See below. +The `runtime_name` and `next_runtime` fields are used to support multiple .NET runtimes in a single process. See below. ### Compatible contracts From e3d41c600222d805383142137c497877cafac494 Mon Sep 17 00:00:00 2001 From: Aleksey Kliger Date: Mon, 1 Apr 2024 13:29:15 -0400 Subject: [PATCH 09/11] remove DotNetRuntimeContractDescriptorList - one runtime per module if there are multiple hosted runtimes, diagnostic tooling should look in each loaded module to discover the contract descriptor --- .../datacontracts/contract-descriptor.md | 87 ++++--------------- 1 file changed, 16 insertions(+), 71 deletions(-) diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md index bc8f47ecc7138e..c4f1c6279391a1 100644 --- a/docs/design/datacontracts/contract-descriptor.md +++ b/docs/design/datacontracts/contract-descriptor.md @@ -28,13 +28,6 @@ struct DotNetRuntimeContractDescriptor const char *descriptor; uint64_t *aux_data; }; - -struct DotNetRuntimeContractDescriptorList -{ - struct DotNetRuntimeContractDescriptor descriptor; - const char *runtime_name; - struct DotNetRuntimeContractDescriptorList *next_runtime; -} ``` The `magic` is `0x44_4e_43_43_44_41_43_00` ("DNCCDAC\0") stored using the target architecture @@ -43,22 +36,23 @@ value in memory to `0x44_4e_43_43_44_41_43_00` and to `0x00_43_41_44_43_43_4e_44 The following `flags` bits are defined: -| Bits 31-3 | Bit 2 | Bit 1 | Bit 0 | -| --------- | ------ | ------- | ----- | -| Reserved | isList | ptrSize | 1 
| +| Bits 31-2 | Bit 1 | Bit 0 | +| --------- | ------- | ----- | +| Reserved | ptrSize | 1 | If `ptrSize` is 0, the architecture is 64-bit. If it is 1, the architecture is 32-bit. The reserved bits should be written as zero. Diagnostic tooling may ignore non-zero reserved bits. -If `isList` is 1, the descriptor is actually a `DotNetRuntimeContractDescriptorList` (that is, it -has a `next_runtime` field at the end. See "Non-Windows", below.) If `isList` is 0, the descriptor -does not have a `next_runtime` field. - -The `descriptor` is a pointer to a json string described in [data descriptor physical layout](./data_descriptor.md#Physical_JSON_descriptor). The total length (including nul terminator character) is given by `descriptor_size`. +The `descriptor` is a pointer to a JSON string described in [data descriptor physical layout](./data_descriptor.md#Physical_JSON_descriptor). The total length (including nul terminator character) is given by `descriptor_size`. The auxiliary data for the JSON descriptor is stored at the location `aux_data` in `aux_data_count` 64-bit slots. -The `runtime_name` and `next_runtime` fields are used to support multiple .NET runtimes in a single process. See below. +### Architecture properties + +Although `DotNetRuntimeContractDescriptor` contains enough information to discover the target +architecture endianness and pointer size, it is expected that in all scenarios diagnostic tooling will +already have this information available through other channels. Diagnostic tools may use the +information derived from `DotNetRuntimeContractDescriptor` for validation. ### Compatible contracts @@ -97,60 +91,11 @@ a JSON integer constant. ## Contract symbol -To aid in the discovery of the contract descriptor, the contract should be exported by the target -process with the name `DotNetRuntimeContractDescriptor`. - -The meaning of the symbol differs on Windows and non-Windows platforms. 
- -### Windows - -Multiple DLLs loaded by a process may host a single .NET runtime. Each DLL shall export the symbol -`DotNetRuntimeContractDescriptor` pointing to a `struct DotNetRuntimeContractDescriptor`. It is -expected that `isList` will be 0. - -### Non-Windows +To aid in the discovery the contract descriptor should be exported by the module hosting the .NET +runtime with the name `DotNetRuntimeContractDescriptor`. (Using the C symbol conventions of the +target platform. That is, on platforms where such symbols typically have an `_` prepended, this +symbol should be exported as `_DotNetRuntimeContractDescriptor`). -In a process, each shared object containing a .NET runtime shall weakly-export the symbol -`DotNetRuntimeContractDescriptor` (Using the C symbol conventions of the target platform. That is, -on platforms where such symbols typically have an `_` prepended, this symbol should be exported as -`_DotNetRuntimeContractDescriptor`) with a null initial value. As each .NET runtime in the process starts -up, it shall atomically store a pointer to a `struct DotNetRuntimeContractDescriptorList` in -`DotNetRuntimeContractDescriptor` where `next_runtime` points to the previous value of -`DotNetRuntimeContractDescriptor` as by the C code below. - -The `runtime_name` is an arbitrary identifier to aid diagnostic tooling in identifying the current -runtime. (For example hosted runtimes may want to embed the name of the host; a desktop runtime may -use just the runtime flavor and version) - -``` c -typedef struct DotNetRuntimeContractDescriptorList* DescPtr; -typedef _Atomic(DescPtr) AtomicDescPtr; - -// global weak symbol -AtomicDescPtr __attribute__((weak)) DotNetRuntimeContractDescriptor; - -// predefined descriptor for current runtime -static const struct DotNetRuntimeContractDescriptor g_private_descriptor = { ... 
}; - -// install_descriptor will try to assign the address of s_runtime_descriptor to the global symbol -static struct DotNetRuntimeContractDescriptorList s_runtime_descriptor = {0,}; - -// to be called at startup -void -install_descriptor(const char *runtime_name) -{ - DescPtr descriptor = &s_runtime_descriptor; - // initialize with a copy of the predefined descriptor data - descriptor->descriptor = g_private_descriptor; - descriptor->runtime_name = runtime_name; - descriptor->next_runtime = NULL; - - DescPtr prev = atomic_load(&DotNetRuntimeContractDescriptor); - do - { - descriptor->next_runtime = prev; - } - while (!atomic_compare_exchange_weak(&DotNetRuntimeContractDescriptor, &prev, descriptor)); -} -``` +In scenarios where multiple .NET runtimes may be present in a single process, diagnostic tooling +should look for the symbol in each loaded module to discover all the runtimes. From 8c275931479bb898668bcd35027aeae853c512a0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Aleksey=20Kliger=20=28=CE=BBgeek=29?= Date: Thu, 4 Apr 2024 10:29:31 -0400 Subject: [PATCH 10/11] Apply suggestions from code review Co-authored-by: Elinor Fung --- docs/design/datacontracts/contract-descriptor.md | 14 +++++++------- docs/design/datacontracts/datacontracts_design.md | 2 +- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md index c4f1c6279391a1..77add170d90b64 100644 --- a/docs/design/datacontracts/contract-descriptor.md +++ b/docs/design/datacontracts/contract-descriptor.md @@ -31,8 +31,8 @@ struct DotNetRuntimeContractDescriptor ``` The `magic` is `0x44_4e_43_43_44_41_43_00` ("DNCCDAC\0") stored using the target architecture -endianness. (N.B. this is sufficient to discover the target architecture endianness by comparing the -value in memory to `0x44_4e_43_43_44_41_43_00` and to `0x00_43_41_44_43_43_4e_44`) +endianness. 
This is sufficient to discover the target architecture endianness by comparing the +value in memory to `0x44_4e_43_43_44_41_43_00` and to `0x00_43_41_44_43_43_4e_44`. The following `flags` bits are defined: @@ -59,7 +59,7 @@ information derived from `DotNetRuntimeContractDescriptor` for validation. The `descriptor` is a JSON dictionary that is used for storing the [in-memory data descriptor](./data_descriptor.md#Physical_JSON_Descriptor) and the [compatible contracts](./datacontracts_design.md#Compatible_Contract). -The compatible contracts are stored in the toplevel key `"contracts"`. The value will be a +The compatible contracts are stored in the top-level key `"contracts"`. The value will be a dictionary that contains each contract name as a key. Each value is the version of the contract as a JSON integer constant. @@ -85,16 +85,16 @@ a JSON integer constant. "FEATURE_COMINTEROP": 0, "s_pThreadStore": [ 0 ] // indirect from aux data offset 0 }, - "contracts": {"Thread": 1,"GCHandle": 1, "ThreadStore": 1} + "contracts": {"Thread": 1, "GCHandle": 1, "ThreadStore": 1} } ``` ## Contract symbol -To aid in the discovery the contract descriptor should be exported by the module hosting the .NET -runtime with the name `DotNetRuntimeContractDescriptor`. (Using the C symbol conventions of the +To aid in discovery, the contract descriptor should be exported by the module hosting the .NET +runtime with the name `DotNetRuntimeContractDescriptor` using the C symbol conventions of the target platform. That is, on platforms where such symbols typically have an `_` prepended, this -symbol should be exported as `_DotNetRuntimeContractDescriptor`). +symbol should be exported as `_DotNetRuntimeContractDescriptor`. In scenarios where multiple .NET runtimes may be present in a single process, diagnostic tooling should look for the symbol in each loaded module to discover all the runtimes. 
diff --git a/docs/design/datacontracts/datacontracts_design.md b/docs/design/datacontracts/datacontracts_design.md
index 1c131844c6e194..630dc9fc5639e1 100644
--- a/docs/design/datacontracts/datacontracts_design.md
+++ b/docs/design/datacontracts/datacontracts_design.md
@@ -12,7 +12,7 @@ Diagnostic data contract addressed these challenges by eliminating the need for
 Data contracts represent the manner in which a tool which is not the runtime can reliably understand and observe the behavior of the runtime. Contracts are defined by their documentation, and the runtime describes what contracts are applicable to understanding that runtime.

 ## Data Contract Descriptor
-The physical layout of this data is defined in [the contract descriptor](./contract_descriptor.md) doc, its practical effects are discussed here.
+The physical layout of this data is defined in [the contract descriptor](./contract_descriptor.md) doc. Its practical effects are discussed here.

 The Data Contract Descriptor has a set of records of the following forms.

From 1553aa5dbe441465e5eda7567f6e33096b668d76 Mon Sep 17 00:00:00 2001
From: Aleksey Kliger
Date: Thu, 4 Apr 2024 10:35:01 -0400
Subject: [PATCH 11/11] Review feedback

- put the aux data and descriptor sizes closer to the pointers
- Don't include trailing nul `descriptor_size`. Clarify it is counting bytes and that `descriptor` is in UTF-8
- Simplify `DotNetRuntimeContractDescriptor` naming discussion
---
 docs/design/datacontracts/contract-descriptor.md | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/docs/design/datacontracts/contract-descriptor.md b/docs/design/datacontracts/contract-descriptor.md
index 77add170d90b64..1e3ddabd6dd735 100644
--- a/docs/design/datacontracts/contract-descriptor.md
+++ b/docs/design/datacontracts/contract-descriptor.md
@@ -22,11 +22,11 @@ struct DotNetRuntimeContractDescriptor
 {
     uint64_t magic;
     uint32_t flags;
-    uint32_t aux_data_count;
     uint32_t descriptor_size;
-    uint32_t reserved;
     const char *descriptor;
-    uint64_t *aux_data;
+    uint32_t aux_data_count;
+    uint32_t pad0;
+    uintptr_t *aux_data;
 };
 ```

@@ -43,9 +43,9 @@ The following `flags` bits are defined:
 If `ptrSize` is 0, the architecture is 64-bit. If it is 1, the architecture is 32-bit. The
 reserved bits should be written as zero. Diagnostic tooling may ignore non-zero reserved bits.

-The `descriptor` is a pointer to a JSON string described in [data descriptor physical layout](./data_descriptor.md#Physical_JSON_descriptor). The total length (including nul terminator character) is given by `descriptor_size`.
+The `descriptor` is a pointer to a UTF-8 JSON string described in [data descriptor physical layout](./data_descriptor.md#Physical_JSON_descriptor). The total number of bytes is given by `descriptor_size`.

-The auxiliary data for the JSON descriptor is stored at the location `aux_data` in `aux_data_count` 64-bit slots.
+The auxiliary data for the JSON descriptor is stored at the location `aux_data` in `aux_data_count` pointer-sized slots.

 ### Architecture properties

@@ -92,9 +92,8 @@ a JSON integer constant.
 ## Contract symbol

 To aid in discovery, the contract descriptor should be exported by the module hosting the .NET
-runtime with the name `DotNetRuntimeContractDescriptor` using the C symbol conventions of the
-target platform. That is, on platforms where such symbols typically have an `_` prepended, this
-symbol should be exported as `_DotNetRuntimeContractDescriptor`.
+runtime with the name `DotNetRuntimeContractDescriptor` using the C symbol naming conventions of the
+target platform.

 In scenarios where multiple .NET runtimes may be present in a single process, diagnostic tooling
 should look for the symbol in each loaded module to discover all the runtimes.
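
Reviewer note: the `magic` and `flags` rules described in these patches can be exercised from the reader side with a short C sketch. This is only an illustration under stated assumptions, not code from the runtime or from any diagnostic tool. All identifiers prefixed `cd_`/`CD_` are hypothetical. The sketch assumes the bit layout from the first patch in this series still holds (bit 0 is always 1, bit 1 is `ptrSize`), and treating a clear bit 0 as "invalid" is a choice made here, not something the spec states.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names: none of these identifiers come from the spec. */
typedef enum { CD_ENDIAN_UNKNOWN, CD_ENDIAN_LITTLE, CD_ENDIAN_BIG } cd_endian;

/* 0x44_4e_43_43_44_41_43_00 as it appears in big-endian target memory. */
static const uint8_t CD_MAGIC_BE[8] = {0x44, 0x4e, 0x43, 0x43, 0x44, 0x41, 0x43, 0x00};
/* The same value as it appears in little-endian target memory. */
static const uint8_t CD_MAGIC_LE[8] = {0x00, 0x43, 0x41, 0x44, 0x43, 0x43, 0x4e, 0x44};

/* Classify the first 8 raw bytes read from a candidate descriptor address. */
static cd_endian cd_detect_endian(const uint8_t magic[8])
{
    assert(magic != NULL);
    if (memcmp(magic, CD_MAGIC_LE, 8) == 0) return CD_ENDIAN_LITTLE;
    if (memcmp(magic, CD_MAGIC_BE, 8) == 0) return CD_ENDIAN_BIG;
    return CD_ENDIAN_UNKNOWN;
}

/* Assemble a 32-bit field from raw target bytes, independent of host endianness. */
static uint32_t cd_read_u32(const uint8_t b[4], cd_endian e)
{
    if (e == CD_ENDIAN_LITTLE)
        return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
               ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return (uint32_t)b[3] | ((uint32_t)b[2] << 8) |
           ((uint32_t)b[1] << 16) | ((uint32_t)b[0] << 24);
}

/* Target pointer size in bytes, or 0 if bit 0 of flags is not set
 * (assumption: a clear bit 0 means the descriptor should be rejected). */
static unsigned cd_pointer_size(uint32_t flags)
{
    if ((flags & 0x1u) == 0) return 0;
    return (flags & 0x2u) ? 4u : 8u; /* ptrSize: 1 means 32-bit, 0 means 64-bit */
}
```

A tool would read the first 8 bytes at the exported symbol's address, call `cd_detect_endian`, then decode the following 4 bytes with `cd_read_u32` and pass the result to `cd_pointer_size` before interpreting the pointer-sized fields that follow.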