[docs] update dual-core section
- add inter-core communication
- update core 1 boot procedure
stnolting committed Jan 4, 2025
1 parent 7e6a570 commit 2c07c53
Showing 1 changed file with 50 additions and 19 deletions.
69 changes: 50 additions & 19 deletions docs/datasheet/cpu_dual_core.adoc
@@ -1,9 +1,9 @@
:sectnums:
=== Dual-Core Configuration

.Hardware Requirements
[IMPORTANT]
The SMP dual-core configuration requires the <<_core_local_interruptor_clint>> to be implemented.
.Dual-Core Example
[TIP]
A simple dual-core example program can be found in `sw/example/demo_dual_core`.

Optionally, the CPU core can be implemented as a **symmetric multiprocessing (SMP) dual-core** system.
This dual-core configuration is enabled by the `DUAL_CORE_EN` <<_processor_top_entity_generics, top generic>>.
@@ -14,10 +14,10 @@ of both core complexes are further switched into a single system bus using a rou

image::smp_system.png[align=center]

Both CPU cores are fully identical and use the same configuration provided by the according
<<_processor_top_entity_generics, top generics>>. However, each core can be identified by the according
"hart ID" that can be retrieved from the <<_mhartid>> CSR. CPU core 0 (the _primary_ core) has `mhartid = 0`
while core 1 (the _secondary_ core) has `mhartid = 1`.
Both CPU cores are fully identical and use the same ISA, tuning and cache configurations provided by the
corresponding <<_processor_top_entity_generics, top generics>>. However, each core can be identified by its
"hart ID", which can be retrieved from the <<_mhartid>> CSR. CPU core 0 (the _primary_ core) has
`mhartid = 0` while core 1 (the _secondary_ core) has `mhartid = 1`.
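
As an illustration of the hart ID lookup described above, the following is a minimal sketch and not part of the NEORV32 software framework. The helper names `get_hart_id` and `core_role` are hypothetical. The CSR read uses plain RISC-V inline assembly, and a host fallback that always returns 0 is included only so the sketch compiles off-target.

```c
#include <stdint.h>

/* Hypothetical helper: read the hart ID of the executing core.
 * On a RISC-V target this reads the mhartid CSR (machine mode only).
 * On any other host the fallback returns 0 so the sketch still compiles. */
static inline uint32_t get_hart_id(void) {
#if defined(__riscv)
  uint32_t id;
  __asm__ volatile ("csrr %0, mhartid" : "=r" (id));
  return id;
#else
  return 0u; /* host fallback, for illustration only */
#endif
}

/* Hypothetical helper: map a hart ID to the role names used in this section. */
static const char *core_role(uint32_t hart_id) {
  switch (hart_id) {
    case 0u: return "primary";
    case 1u: return "secondary";
    default: return "unknown"; /* not expected in the dual-core configuration */
  }
}
```

Code that is shared by both cores could branch on `get_hart_id()` to decide which work to perform.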

The following table summarizes the most important aspects when using the dual-core configuration.

@@ -41,32 +41,63 @@ while the top of stack of core 1 has to be explicitly defined by core 0 (see <<_
cores share the same heap, `.data` and `.bss` sections.
| **Constructors and destructors** | Constructors and destructors are executed on core 0 only.
(see )
| **Core communication** | See section <<_inter_core_communication_icc>>.
| **Bootloader** | Only core 0 will boot and execute the bootloader while core 1 is held in standby.
| **Booting** | See next section <<_dual_core_boot>>.
| **Booting** | See section <<_dual_core_boot>>.
|=======================

.Dual-Core Example

==== Inter-Core Communication (ICC)

Both cores can communicate with each other via a direct point-to-point connection based on FIFO-like message
queues. These direct communication links have lower latency than memory-mapped or shared-memory
communication. Additionally, communication using these links is guaranteed to be atomic.

The inter-core communication (ICC) module is implemented as a dedicated hardware module within each CPU core
(VHDL file `rtl/core/neorv32_cpu_icc.vhd`). This module is automatically included if the dual-core option
is enabled. Each core provides a 32-bit-wide, 4-entry-deep FIFO for sending data to the other core.
Hence, there are two FIFOs: one for sending data from core 0 to core 1 and another one for sending data in
the opposite direction.

The ICC communication links are accessed via NEORV32-specific CSRs. Hence, those FIFOs are accessible only
by the CPU core itself and cannot be accessed by the DMA or any other CPU core. In total, three CSRs are
provided to handle communications:

The <<_mxiccsr>> is used to select the core with which to communicate. In the dual-core configuration core 1
can only select core 0 and vice versa. The core selection in this register allows access to the corresponding
message FIFOs via the two other CSRs. Additionally, the CSR provides status flags (RX FIFO data available;
TX FIFO free space) related to the selected communication link.

The <<_mxiccrxd>> and <<_mxicctxd>> CSRs are used for the actual data read and write operations. Writing data
to <<_mxicctxd>> sends it to the message queue of the core selected by <<_mxiccsr>>. Conversely, reading
data from <<_mxiccrxd>> returns data received from the core selected by <<_mxiccsr>>.

The ICC FIFOs do not provide any interrupt capabilities. Software is expected to use the machine-software
interrupt of the receiving core (provided by the <<_core_local_interruptor_clint>>) to inform it about
available messages.
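
To make the FIFO semantics above more concrete, here is a small host-side software model of one ICC link direction. It is only a sketch: all `icc_model_*` names are hypothetical, the model does not touch any CSR, and it is not the NEORV32 hardware or API. In particular, rejecting a write to a full queue is an assumption of this model, since the text above does not state how the hardware behaves in that case.

```c
#include <stdbool.h>
#include <stdint.h>

#define ICC_MODEL_DEPTH 4u /* 4 entries of 32 bits each, as stated above */

/* Hypothetical software model of a single one-directional ICC message queue. */
typedef struct {
  uint32_t data[ICC_MODEL_DEPTH];
  unsigned head;  /* index of the oldest entry */
  unsigned count; /* number of valid entries */
} icc_fifo_model_t;

/* Models the "TX FIFO free space" status flag seen by the sender. */
static bool icc_model_tx_free(const icc_fifo_model_t *f) {
  return f->count < ICC_MODEL_DEPTH;
}

/* Models the "RX FIFO data available" status flag seen by the receiver. */
static bool icc_model_rx_avail(const icc_fifo_model_t *f) {
  return f->count > 0u;
}

/* Sender side: enqueue one word. Returns false if the queue is full
 * (assumption of this model, not documented hardware behavior). */
static bool icc_model_put(icc_fifo_model_t *f, uint32_t word) {
  if (!icc_model_tx_free(f)) {
    return false;
  }
  f->data[(f->head + f->count) % ICC_MODEL_DEPTH] = word;
  f->count++;
  return true;
}

/* Receiver side: dequeue the oldest word. Returns false if the queue is empty. */
static bool icc_model_get(icc_fifo_model_t *f, uint32_t *word) {
  if (!icc_model_rx_avail(f)) {
    return false;
  }
  *word = f->data[f->head];
  f->head = (f->head + 1u) % ICC_MODEL_DEPTH;
  f->count--;
  return true;
}
```

In this model a sender would poll `icc_model_tx_free()` before writing, and a receiver would drain the queue with `icc_model_get()` after being notified, which mirrors the polling and software-interrupt pattern described above.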

.ICC Software API
[TIP]
A simple dual-core example setup / test program can be found in `sw/example/demo_dual_core`.
The NEORV32 software framework provides API wrappers to abstract inter-core communication:
`sw/lib/include/neorv32_smp.h`


==== Dual-Core Boot

After reset both cores start booting. However, core 1 will always (regardless of the boot configuration) enter
sleep mode inside the default <<_start_up_code_crt0>> that is linked with any compiled application. The primary
core (core 0) will continue booting executing either the <<_bootloader>> or the pre-installed image in the
internal instruction memory (depending on the <<_boot_configuration>>).
After reset, both cores start booting. However, regardless of the <<_boot_configuration>>, core 1 will always
enter <<_sleep_mode>> right inside the default <<_start_up_code_crt0>> that is linked with any compiled
application. The primary core (core 0) will continue booting, executing either the <<_bootloader>> or the
pre-installed image from the internal instruction memory (depending on the boot configuration).

To boot-up core 1 the primary core has to use a special library function provided by the NEORV32 runtime
environment (RTE):
To boot up core 1, the primary core has to use a special library function provided by the NEORV32 software framework:

.CPU Core 1 launch function prototype (note that this function can only be executed on core 0)
[source,c]
----
int neorv32_rte_smp_launch(void (*entry_point)(void), uint8_t* stack_memory, size_t stack_size_bytes);
int neorv32_smp_launch(int hart_id, void (*entry_point)(void), uint8_t* stack_memory, size_t stack_size_bytes);
----

When executed, core 0 will populate a configuration structure in main memory that contains the entry point
When executed, core 0 uses the <<_inter_core_communication_icc>> to send launch data that includes the entry point
for core 1 (via `entry_point`) and the actual stack configuration (via `stack_memory` and `stack_size_bytes`).

.Core 1 Stack Memory
@@ -78,5 +109,5 @@ boundary.

After that, the primary core triggers the _machine software interrupt_ of core 1 using the
<<_core_local_interruptor_clint>>. Core 1 wakes up from sleep mode, consumes the configuration structure and
finally starts executing at the provided entry point. When `neorv32_rte_smp_launch()` returns (with no error
finally starts executing at the provided entry point. When `neorv32_smp_launch()` returns (with no error
code) the secondary core is online and running.
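
The launch sequence described above might be used roughly as follows. This is a sketch under several assumptions: the prototype is copied from the listing above, while `core1_main`, `core1_stack`, `start_core1` and the stack size are hypothetical names and values. A return value of 0 is assumed to mean "no error". The 16-byte alignment follows the RISC-V calling convention, because the stack alignment note of this section is truncated in this view. The host-only stand-in exists solely so the sketch links without the NEORV32 library; it is not the real implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Prototype as given in the listing above. */
int neorv32_smp_launch(int hart_id, void (*entry_point)(void),
                       uint8_t *stack_memory, size_t stack_size_bytes);

#if !defined(__riscv)
/* Host-only stand-in (hypothetical): checks the arguments and returns 0,
 * which this sketch assumes to mean "no error". */
int neorv32_smp_launch(int hart_id, void (*entry_point)(void),
                       uint8_t *stack_memory, size_t stack_size_bytes) {
  if ((hart_id != 1) || (entry_point == NULL) ||
      (stack_memory == NULL) || (stack_size_bytes == 0u)) {
    return -1;
  }
  return 0;
}
#endif

#define CORE1_STACK_BYTES 2048u /* illustrative size, not a recommendation */

/* Stack memory for core 1; 16-byte alignment is an assumption of this sketch. */
static _Alignas(16) uint8_t core1_stack[CORE1_STACK_BYTES];

/* Hypothetical entry point executed by core 1 after it wakes up. */
static void core1_main(void) {
  /* work for the secondary core goes here */
}

/* To be called on core 0 only: hands entry point and stack to core 1. */
static int start_core1(void) {
  return neorv32_smp_launch(1, core1_main, core1_stack, sizeof(core1_stack));
}
```

Core 0 would call `start_core1()` once and treat any non-zero result as a failed launch, under the return-value assumption stated above.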
