diff --git a/docs/datasheet/cpu_dual_core.adoc b/docs/datasheet/cpu_dual_core.adoc index 1e6acec4a..a70e0ed0f 100644 --- a/docs/datasheet/cpu_dual_core.adoc +++ b/docs/datasheet/cpu_dual_core.adoc @@ -1,9 +1,9 @@ :sectnums: === Dual-Core Configuration -.Hardware Requirements -[IMPORTANT] -The SMP dual-core configuration requires the <<_core_local_interruptor_clint>> to be implemented. +.Dual-Core Example +[TIP] +A simple dual-core example program can be found in `sw/example/demo_dual_core`. Optionally, the CPU core can be implemented as **symmetric multiprocessing (SMP) dual-core** system. This dual-core configuration is enabled by the `DUAL_CORE_EN` <<_processor_top_entity_generics, top generic>>. @@ -14,10 +14,10 @@ of both core complexes are further switched into a single system bus using a rou image::smp_system.png[align=center] -Both CPU cores are fully identical and use the same configuration provided by the according -<<_processor_top_entity_generics, top generics>>. However, each core can be identified by the according -"hart ID" that can be retrieved from the <<_mhartid>> CSR. CPU core 0 (the _primary_ core) has `mhartid = 0` -while core 1 (the _secondary_ core) has `mhartid = 1`. +Both CPU cores are fully identical and use the same ISA, tuning and cache configurations provided by the +according <<_processor_top_entity_generics, top generics>>. However, each core can be identified by the +according "hart ID" that can be retrieved from the <<_mhartid>> CSR. CPU core 0 (the _primary_ core) has +`mhartid = 0` while core 1 (the _secondary_ core) has `mhartid = 1`. The following table summarizes the most important aspects when using the dual-core configuration. @@ -41,32 +41,63 @@ while the top of stack of core 1 has to be explicitly defined by core 0 (see <<_ cores share the same heap, `.data` and `.bss` sections. | **Constructors and destructors** | Constructors and destructors are executed on core 0 only. (see ) +| **Core communication** | See section <<_inter_core_communication_icc>>. | **Bootloader** | Only core 0 will boot and execute the bootloader while core 1 is held in standby. -| **Booting** | See next section <<_dual_core_boot>>. +| **Booting** | See section <<_dual_core_boot>>. |======================= -.Dual-Core Example + +==== Inter-Core Communication (ICC) + +Both cores can communicate with each other via a direct point-to-point connection based on FIFO-like message +queues. These direct communication links are faster (in terms of latency) compared to a memory-mapped or +shared-memory communication. Additionally, communication using these links is guaranteed to be atomic. + +The inter-core communication (ICC) module is implemented as dedicated hardware module within each CPU core +(VHDL file `rtl/core/neorv32_cpu_icc.vhd`). This module is automatically included if the dual-core option +is enabled. Each core provides a 32-bit wide and 4 entries deep FIFO for sending data to the other core. +Hence, there are two FIFOs: one for sending data from core 0 to core 1 and another one for sending data the +opposite way. + +The ICC communication links are accessed via NEORV32-specific CSRs. Hence, those FIFOs are accessible only +by the CPU core itself and cannot be accessed by the DMA or any other CPU core. In total, three CSRs are +provided to handle communications: + +The <<_mxiccsr>> is used to select the core with which to communicate. In the dual-core configuration core 1 +can only select core 0 and vice versa. The core selection in this register allows access to the according +message FIFOs via the two other CSRs. Additionally, the CSR provides status flags (TX FIFO data available; +RX FIFO free space) related to the selected communication link. + +The <<_mxiccrxd>> and <<_mxicctxd>> CSRs are used for the actual data read and write operations. Writing data +to <<<<_mxicctxd>>> will send to the message queue of the core selected by <<_mxiccsr>>. Conversely, reading +data from <<_mxiccrxd>> will return data received from the core selected by <<_mxiccsr>>. + +The ICC FIFOs do not provide any interrupt capabilities. Software is expected to use the machine-software +interrupt of the receiving core (provided by the <<_core_local_interruptor_clint>>) to inform it about +available messages. + +.ICC Software API [TIP] -A simple dual-core example setup / test program can be found in `sw/example/demo_dual_core`. +The NEORV32 software framework provides API wrappers to abstract inter-core communication: +`sw/lib/include/noevr32_smp.h` ==== Dual-Core Boot -After reset both cores start booting. However, core 1 will always (regardless of the boot configuration) enter -sleep mode inside the default <<_start_up_code_crt0>> that is linked with any compiled application. The primary -core (core 0) will continue booting executing either the <<_bootloader>> or the pre-installed image in the -internal instruction memory (depending on the <<_boot_configuration>>). +After reset, both cores start booting. However, core 1 will - regardless of the <<_boot_configuration>> - always +enter <<_sleep_mode>> right inside the default <<_start_up_code_crt0>> that is linked with any compiled +application. The primary core (core 0) will continue booting, executing either the <<_bootloader>> or the +pre-installed image from the internal instruction memory (depending on the boot configuration). -To boot-up core 1 the primary core has to use a special library function provided by the NEORV32 runtime -environment (RTE): +To boot-up core 1, the primary core has to use a special library function provided by the NEORV32 software framework: .CPU Core 1 launch function prototype (note that this function can only be executed on core 0) [source,c] ---- -int neorv32_rte_smp_launch(void (*entry_point)(void), uint8_t* stack_memory, size_t stack_size_bytes); +int neorv32_smp_launch(int hart_id, void (*entry_point)(void), uint8_t* stack_memory, size_t stack_size_bytes); ---- -When executed, core 0 will populate a configuration structure in main memory that contain the entry point +When executed, core 0 use the <<_inter_core_communication_icc>> to send launch data that includes the entry point for core 1 (via `entry_point`) and the actual stack configuration (via `stack_memory` and `stack_size_bytes`). .Core 1 Stack Memory @@ -78,5 +109,5 @@ boundary.ยด After that, the primary core triggers the _machine software interrupt_ of core 1 using the <<_core_local_interruptor_clint>>. Core 1 wakes up from sleep mode, consumes the configuration structure and -finally starts executing at the provided entry point. When `neorv32_rte_smp_launch()` returns (with no error +finally starts executing at the provided entry point. When `neorv32_smp_launch()` returns (with no error code) the secondary core is online and running.