diff --git a/CHANGELOG.md b/CHANGELOG.md index 1ae253459..a23a992e9 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -30,6 +30,7 @@ mimpid = 0x01040312 -> Version 01.04.03.12 -> v1.4.3.12 | Date | Version | Comment | Link | |:----:|:-------:|:--------|:----:| +| 14.03.2024 | 1.9.6.5 | :sparkles: add optional external bus interface cache (XCACHE) | [#849](https://github.com/stnolting/neorv32/pull/849) | | 12.03.2024 | 1.9.6.4 | :warning: :warning: rename external bus/memory interface and according generics ("WISHBONE/MEM_EXT" -> "XBUS"); also rename bus interface ports (`wb_* -> xbus_*`) | [#846](https://github.com/stnolting/neorv32/pull/846) | | 11.03.2024 | 1.9.6.3 | :warning: remove Wishbone tag signal; minor rtl edits and optimizations | [#845](https://github.com/stnolting/neorv32/pull/845) | | 10.03.2024 | 1.9.6.2 | minor rtl clean-ups, optimizations and fixes | [#843](https://github.com/stnolting/neorv32/pull/843) | diff --git a/README.md b/README.md index d0e451052..537fca1fb 100644 --- a/README.md +++ b/README.md @@ -149,7 +149,7 @@ allows booting application code via UART or from external SPI flash * standard serial interfaces ([UART](https://stnolting.github.io/neorv32/#_primary_universal_asynchronous_receiver_and_transmitter_uart0), -[SPI](https://stnolting.github.io/neorv32/#_serial_peripheral_interface_controller_spi) (host), +[SPI](https://stnolting.github.io/neorv32/#_serial_peripheral_interface_controller_spi) (SPI host), [SDI](https://stnolting.github.io/neorv32/#_serial_data_interface_controller_sdi) (SPI device), [TWI/I²C](https://stnolting.github.io/neorv32/#_two_wire_serial_interface_controller_twi), [ONEWIRE/1-Wire](https://stnolting.github.io/neorv32/#_one_wire_serial_interface_controller_onewire)) @@ -160,7 +160,7 @@ allows booting application code via UART or from external SPI flash **SoC Connectivity** * 32-bit external bus interface - Wishbone b4 compatible 
-([XBUS](https://stnolting.github.io/neorv32/#_processor_external_bus_interface_xbus)); +([XBUS](https://stnolting.github.io/neorv32/#_processor_external_bus_interface_xbus)) with optional cache (XCACHE); [wrappers](https://github.com/stnolting/neorv32/blob/main/rtl/system_integration) for AXI4-Lite and Avalon-MM host interfaces * stream link interface with independent RX and TX channels - AXI4-Stream compatible ([SLINK](https://stnolting.github.io/neorv32/#_stream_link_interface_slink)) diff --git a/docs/datasheet/overview.adoc b/docs/datasheet/overview.adoc index 5762e2565..f12e855bc 100644 --- a/docs/datasheet/overview.adoc +++ b/docs/datasheet/overview.adoc @@ -210,6 +210,7 @@ All core VHDL files from the list below have to be assigned to a **new library** │ ├neorv32_cfs.vhd - Custom functions subsystem ├neorv32_crc.vhd - Cyclic redundancy check unit +├neorv32_cache.vhd - Generic cache module ├neorv32_dcache.vhd - Processor-internal data cache ├neorv32_debug_dm.vhd - on-chip debugger: debug module ├neorv32_debug_dtm.vhd - on-chip debugger: debug transfer module @@ -231,7 +232,7 @@ All core VHDL files from the list below have to be assigned to a **new library** ├neorv32_twi.vhd - Two wire serial interface controller ├neorv32_uart.vhd - Universal async. 
receiver/transmitter ├neorv32_wdt.vhd - Watchdog timer -├neorv32_wishbone.vhd - External (Wishbone) bus interface +├neorv32_xbus.vhd - External (Wishbone) bus interface gateways ├neorv32_xip.vhd - Execute in place module ├neorv32_xirq.vhd - External interrupt controller │ diff --git a/docs/datasheet/soc.adoc b/docs/datasheet/soc.adoc index 008f3d0f5..068ef020f 100644 --- a/docs/datasheet/soc.adoc +++ b/docs/datasheet/soc.adoc @@ -20,7 +20,7 @@ image::neorv32_processor.png[align=center] **Key Features** * _optional_ processor-internal data and instruction memories (<<_data_memory_dmem,**DMEM**>>/<<_instruction_memory_imem,**IMEM**>>) -* _optional_ caches (<<_processor_internal_instruction_cache_icache,**iCACHE**>>/<<_processor_internal_data_cache_dcache,**dCACHE**>>) +* _optional_ caches (<<_processor_internal_instruction_cache_icache,**iCACHE**>>, <<_processor_internal_data_cache_dcache,**dCACHE**>>, <<_execute_in_place_module_xip,**xipCACHE**>>, <<_processor_external_bus_interface_xbus,**xCACHE**>>) * _optional_ internal bootloader (<<_bootloader_rom_bootrom,**BOOTROM**>>) with UART console & SPI flash boot option * _optional_ machine system timer (<<_machine_system_timer_mtime,**MTIME**>>), RISC-V-compatible * _optional_ two independent universal asynchronous receivers and transmitters (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0,**UART0**>>, @@ -242,12 +242,12 @@ The generic type "`suv(x:y)`" is an abbreviation for "`std_ulogic_vector(x downt | `MEM_INT_DMEM_SIZE` | natural | 8*1024 | Size in bytes of the processor-internal data memory (use a power of 2). 4+^| **<<_processor_internal_instruction_cache_icache>>** | `ICACHE_EN` | boolean | false | Implement the instruction cache. -| `ICACHE_NUM_BLOCKS` | natural | 4 | Number of blocks ("pages" or "lines") Has to be a power of two. +| `ICACHE_NUM_BLOCKS` | natural | 4 | Number of blocks ("lines"). Has to be a power of two. | `ICACHE_BLOCK_SIZE` | natural | 64 | Size in bytes of each block. 
Has to be a power of two. | `ICACHE_ASSOCIATIVITY` | natural | 1 | Associativity (number of sets). Allowed configurations: `1` = 1 set, direct mapped; `2` = 2-way set-associative. 4+^| **<<_processor_internal_data_cache_dcache>>** | `DCACHE_EN` | boolean | false | Implement the data cache. -| `DCACHE_NUM_BLOCKS` | natural | 4 | Number of blocks ("pages" or "lines"). Has to be a power of two. +| `DCACHE_NUM_BLOCKS` | natural | 4 | Number of blocks ("lines"). Has to be a power of two. | `DCACHE_BLOCK_SIZE` | natural | 64 | Size in bytes of each block. Has to be a power of two. 4+^| **<<_processor_external_bus_interface_xbus>> (Wishbone b4 protocol)** | `XBUS_EN` | boolean | false | Implement the external bus interface. @@ -256,6 +256,9 @@ The generic type "`suv(x:y)`" is an abbreviation for "`std_ulogic_vector(x downt | `XBUS_BIG_ENDIAN` | boolean | false | Use BIG endian data order interface for external bus. | `XBUS_ASYNC_RX` | boolean | false | Disable input registers when true. | `XBUS_ASYNC_TX` | boolean | false | Disable output registers when true. +| `XBUS_CACHE_EN` | boolean | false | Implement the external bus cache. +| `XBUS_CACHE_NUM_BLOCKS` | natural | 64 | Number of blocks ("lines"). Has to be a power of two. +| `XBUS_CACHE_BLOCK_SIZE` | natural | 32 | Size in bytes of each block. Has to be a power of two. 4+^| **<<_execute_in_place_module_xip>>** | `XIP_EN` | boolean | false | Implement the execute in-place module. | `XIP_CACHE_EN` | boolean | false | Implement XIP cache. @@ -558,7 +561,6 @@ Accesses that are delegated to the external bus interface have a different maxim explicit specific processor generic. See section <<_processor_external_bus_interface_xbus>> for more information. 
- :sectnums: ==== Reservation Set Controller diff --git a/docs/datasheet/soc_xbus.adoc b/docs/datasheet/soc_xbus.adoc index 5edf6c0ae..e55389053 100644 --- a/docs/datasheet/soc_xbus.adoc +++ b/docs/datasheet/soc_xbus.adoc @@ -5,23 +5,27 @@ [cols="<3,<3,<4"] [frame="topbot",grid="none"] |======================= -| Hardware source file(s): | neorv32_xbus.vhd | -| Software driver file(s): | none | _implicitly used_ -| Top entity port: | `xbus_adr_o` | address output (32-bit) -| | `xbus_dat_i` | data input (32-bit) -| | `xbus_dat_o` | data output (32-bit) -| | `xbus_we_o` | write enable (1-bit) -| | `xbus_sel_o` | byte enable (4-bit) -| | `xbus_stb_o` | strobe (1-bit) -| | `xbus_cyc_o` | valid cycle (1-bit) -| | `xbus_ack_i` | acknowledge (1-bit) -| | `xbus_err_i` | bus error (1-bit) -| Configuration generics: | `XBUS_EN` | enable external bus interface when `true` -| | `XBUS_TIMEOUT` | number of clock cycles after which an unacknowledged external bus access will auto-terminate (0 = disabled) -| | `XBUS_PIPE_MODE` | when `false` (default): classic/standard Wishbone protocol; when `true`: pipelined Wishbone protocol -| | `XBUS_BIG_ENDIAN` | byte-order (Endianness) of external bus interface; `true`=BIG, `false`=little (default) -| | `XBUS_ASYNC_RX` | use registered RX path when `false` (default); use async/direct RX path when `true` -| | `XBUS_ASYNC_TX` | use registered TX path when `false` (default); use async/direct TX path when `true` +| Hardware source file(s): | neorv32_xbus.vhd | External bus gateway +| | neorv32_cache.vhd | External bus cache instance +| Software driver file(s): | none | _implicitly used_ +| Top entity port(s): | `xbus_adr_o` | address output (32-bit) +| | `xbus_dat_i` | data input (32-bit) +| | `xbus_dat_o` | data output (32-bit) +| | `xbus_we_o` | write enable (1-bit) +| | `xbus_sel_o` | byte enable (4-bit) +| | `xbus_stb_o` | strobe (1-bit) +| | `xbus_cyc_o` | valid cycle (1-bit) +| | `xbus_ack_i` | acknowledge (1-bit) +| | `xbus_err_i` | bus 
error (1-bit) +| Configuration generics: | `XBUS_EN` | enable external bus interface when `true` +| | `XBUS_TIMEOUT` | number of clock cycles after which an unacknowledged external bus access will auto-terminate (0 = disabled) +| | `XBUS_PIPE_MODE` | when `false` (default): classic/standard Wishbone protocol; when `true`: pipelined Wishbone protocol +| | `XBUS_BIG_ENDIAN` | byte-order (Endianness) of external bus interface; `true`=BIG, `false`=little (default) +| | `XBUS_ASYNC_RX` | use registered RX path when `false` (default); use async/direct RX path when `true` +| | `XBUS_ASYNC_TX` | use registered TX path when `false` (default); use async/direct TX path when `true` +| | `XBUS_CACHE_EN` | implement the external bus cache +| | `XBUS_CACHE_NUM_BLOCKS` | number of blocks ("lines"), has to be a power of two. +| | `XBUS_CACHE_BLOCK_SIZE` | size in bytes of each block, has to be a power of two. | CPU interrupts: | none | |======================= @@ -29,6 +33,7 @@ The external bus interface provides a Wishbone b4-compatible on-chip bus interface that is implemented if the `XBUS_EN` generic is `true`. This bus interface can be used to attach external memories, custom hardware accelerators, additional peripheral devices or all other kinds of IP blocks. +An optional cache module ("XCACHE") can be enabled to improve memory access latency. The external interface is **not** mapped to a specific address space. Instead, all CPU memory accesses that do not target a specific (and actually implemented) processor-internal address region (hence, accessing the "void"; @@ -95,14 +100,14 @@ SYSINFO module (see section <<_system_configuration_information_memory_sysinfo>> **Access Latency** -By default, the XBUS gateway introduces two additional latency cycles: processor-outgoing (`*_o`) and +By default, the XBUS gateway introduces two additional latency cycles since processor-outgoing (`*_o`) and processor-incoming (`*_i`) signals are fully registered. 
Thus, any access from the CPU to a processor-external devices -via Wishbone requires 2 additional clock cycles. This can ease timing closure when using large (combinatorial) Wishbone -interconnection networks. +via the XBUS interface requires 2 additional clock cycles. This can ease timing closure when using large (combinatorial) +processor-external interconnection networks. -Optionally, the latency of the XBUS gateway can be reduced by removing the input and output register stages. +Optionally, the latency of the XBUS gateway can be reduced by removing the input and/or output register stages. Enabling the `XBUS_ASYNC_RX` option will remove the input register stage; enabling `XBUS_ASYNC_TX` option will -remove the output register stages. Each enabled option reduces access latency by 1 cycle. +remove the output register stages. Note that using those "async" options might impact timing closure. .Output Gating [NOTE] @@ -110,3 +115,36 @@ All outgoing Wishbone signals use a "gating mechanism" so they only change if th progress. This can reduce dynamic switching activity in the external bus system and also simplifies simulation-based inspection of the Wishbone transactions. Note that this output gating is only available if the output register buffer is not disabled (`XBUS_ASYNC_TX` = `false`). + + +**External Bus Cache (X-CACHE)** + +[source,asciiart] +--------------------------------------- +Simplified cache architecture ("->" = direction of access requests): + + Direct Access +----------+ + /|-------------------------->| Register |------------------------->|\ + | | +----------+ | | +Core ---->| | | |----> XBUS + | | +--------------+ +--------------+ +-------------+ | | + \|--->| Host Arbiter |---->| Cache Memory |<----| Bus Arbiter |--->|/ + +--------------+ +--------------+ +-------------+ +--------------------------------------- + +The XBUS interface provides an optional cache module that can be used to buffer and improve processor-external accesses. 
+The cache uses a direct-mapped architecture that implements "write-allocate" and "write-back" strategies. + +The **write-allocate** strategy will fetch the entire referenced block from main memory when encountering +a cache write-miss. The **write-back** strategy will gather all writes locally inside the cache until the according +cache block is about to be replaced. In this case, the entire modified cache block is written back to main memory. + +The x-cache is enabled via the `XBUS_CACHE_EN` generic. The total size of the cache is split into the number of cache lines +or cache blocks (`XBUS_CACHE_NUM_BLOCKS` generic) and the line or block size in bytes (`XBUS_CACHE_BLOCK_SIZE` generic). + +The x-cache also provides "direct accesses" that bypass the cache. For example, this can be used to access processor-external +memory-mapped IO. All accesses that target the address range from `0xF0000000` to `0xFFFFFFFF` will always bypass the cache +(see section <<_address_space>>). Furthermore, load-reserved and store-conditional <<_atomic_accesses>> will also always bypass the +cache **regardless of the accessed address**. 
+ + diff --git a/docs/figures/neorv32_bus.png b/docs/figures/neorv32_bus.png index 85e831ffa..dfc907239 100644 Binary files a/docs/figures/neorv32_bus.png and b/docs/figures/neorv32_bus.png differ diff --git a/rtl/core/neorv32_package.vhd b/rtl/core/neorv32_package.vhd index 42c8caeb7..bc286e447 100644 --- a/rtl/core/neorv32_package.vhd +++ b/rtl/core/neorv32_package.vhd @@ -53,7 +53,7 @@ package neorv32_package is -- Architecture Constants ----------------------------------------------------------------- -- ------------------------------------------------------------------------------------------- - constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01090604"; -- hardware version + constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01090605"; -- hardware version constant archid_c : natural := 19; -- official RISC-V architecture ID constant XLEN : natural := 32; -- native data path width @@ -791,6 +791,9 @@ package neorv32_package is XBUS_BIG_ENDIAN : boolean := false; XBUS_ASYNC_RX : boolean := false; XBUS_ASYNC_TX : boolean := false; + XBUS_CACHE_EN : boolean := false; + XBUS_CACHE_NUM_BLOCKS : natural := 64; + XBUS_CACHE_BLOCK_SIZE : natural := 32; -- Execute in-place module (XIP) -- XIP_EN : boolean := false; XIP_CACHE_EN : boolean := false; diff --git a/rtl/core/neorv32_sysinfo.vhd b/rtl/core/neorv32_sysinfo.vhd index 3061ab683..cd26e4ad2 100644 --- a/rtl/core/neorv32_sysinfo.vhd +++ b/rtl/core/neorv32_sysinfo.vhd @@ -1,8 +1,8 @@ -- ################################################################################################# -- # << NEORV32 - System/Processor Configuration Information Memory (SYSINFO) >> # -- # ********************************************************************************************* # --- # This unit provides information regarding the NEORV32 processor system configuration - # --- # mostly derived from the top's configuration generics. 
# +-- # This unit provides information regarding the NEORV32 processor system configuration derived # +-- # mainly from the top's configuration generics. # -- # ********************************************************************************************* # -- # BSD 3-Clause License # -- # # @@ -43,33 +43,25 @@ use neorv32.neorv32_package.all; entity neorv32_sysinfo is generic ( - -- General -- CLOCK_FREQUENCY : natural; -- clock frequency of clk_i in Hz CLOCK_GATING_EN : boolean; -- enable clock gating when in sleep mode INT_BOOTLOADER_EN : boolean; -- boot configuration: true = boot explicit bootloader; false = boot from int/ext (I)MEM - -- Internal instruction memory -- MEM_INT_IMEM_EN : boolean; -- implement processor-internal instruction memory MEM_INT_IMEM_SIZE : natural; -- size of processor-internal instruction memory in bytes - -- Internal data memory -- MEM_INT_DMEM_EN : boolean; -- implement processor-internal data memory MEM_INT_DMEM_SIZE : natural; -- size of processor-internal data memory in bytes - -- Reservation Set Granularity -- AMO_RVS_GRANULARITY : natural; -- size in bytes, has to be a power of 2, min 4 - -- Instruction cache -- ICACHE_EN : boolean; -- implement instruction cache ICACHE_NUM_BLOCKS : natural; -- i-cache: number of blocks (min 2), has to be a power of 2 ICACHE_BLOCK_SIZE : natural; -- i-cache: block size in bytes (min 4), has to be a power of 2 ICACHE_ASSOCIATIVITY : natural; -- i-cache: associativity (min 1), has to be a power 2 - -- Data cache -- DCACHE_EN : boolean; -- implement data cache DCACHE_NUM_BLOCKS : natural; -- d-cache: number of blocks (min 2), has to be a power of 2 DCACHE_BLOCK_SIZE : natural; -- d-cache: block size in bytes (min 4), has to be a power of 2 - -- External bus interface -- XBUS_EN : boolean; -- implement external memory bus interface? 
XBUS_BIG_ENDIAN : boolean; -- byte order: true=big-endian, false=little-endian - -- On-chip debugger -- + XBUS_CACHE_EN : boolean; -- implement external bus cache ON_CHIP_DEBUGGER_EN : boolean; -- implement OCD? - -- Processor peripherals -- IO_GPIO_EN : boolean; -- implement general purpose IO port (GPIO)? IO_MTIME_EN : boolean; -- implement machine system timer (MTIME)? IO_UART0_EN : boolean; -- implement primary universal asynchronous receiver/transmitter (UART0)? @@ -103,6 +95,7 @@ architecture neorv32_sysinfo_rtl of neorv32_sysinfo is -- helpers -- constant int_imem_en_c : boolean := MEM_INT_IMEM_EN and boolean(MEM_INT_IMEM_SIZE > 0); constant int_dmem_en_c : boolean := MEM_INT_DMEM_EN and boolean(MEM_INT_DMEM_SIZE > 0); + constant xcache_en_c : boolean := XBUS_EN and XBUS_CACHE_EN; -- system information ROM -- type sysinfo_t is array (0 to 3) of std_ulogic_vector(31 downto 0); @@ -123,14 +116,14 @@ begin -- SYSINFO(2): SoC Configuration -- sysinfo(2)(00) <= '1' when INT_BOOTLOADER_EN else '0'; -- processor-internal bootloader implemented? - sysinfo(2)(01) <= '1' when XBUS_EN else '0'; -- external memory bus interface implemented? + sysinfo(2)(01) <= '1' when XBUS_EN else '0'; -- external bus interface implemented? sysinfo(2)(02) <= '1' when int_imem_en_c else '0'; -- processor-internal instruction memory implemented? sysinfo(2)(03) <= '1' when int_dmem_en_c else '0'; -- processor-internal data memory implemented? sysinfo(2)(04) <= '1' when XBUS_BIG_ENDIAN else '0'; -- is external memory bus interface using BIG-endian byte-order? sysinfo(2)(05) <= '1' when ICACHE_EN else '0'; -- processor-internal instruction cache implemented? sysinfo(2)(06) <= '1' when DCACHE_EN else '0'; -- processor-internal data cache implemented? sysinfo(2)(07) <= '1' when CLOCK_GATING_EN else '0'; -- enable clock gating when in sleep mode - sysinfo(2)(08) <= '0'; -- reserved + sysinfo(2)(08) <= '1' when xcache_en_c else '0'; -- external bus interface cache implemented? 
sysinfo(2)(09) <= '0'; -- reserved sysinfo(2)(10) <= '0'; -- reserved sysinfo(2)(11) <= '0'; -- reserved diff --git a/rtl/core/neorv32_top.vhd b/rtl/core/neorv32_top.vhd index 5ebe8f2fa..5c116a6e0 100644 --- a/rtl/core/neorv32_top.vhd +++ b/rtl/core/neorv32_top.vhd @@ -114,6 +114,9 @@ entity neorv32_top is XBUS_BIG_ENDIAN : boolean := false; -- byte order: true=big-endian, false=little-endian XBUS_ASYNC_RX : boolean := false; -- use register buffer for RX data when false XBUS_ASYNC_TX : boolean := false; -- use register buffer for TX data when false + XBUS_CACHE_EN : boolean := false; -- enable external bus cache (x-cache) + XBUS_CACHE_NUM_BLOCKS : natural := 64; -- x-cache: number of blocks (min 1), has to be a power of 2 + XBUS_CACHE_BLOCK_SIZE : natural := 32; -- x-cache: block size in bytes (min 4), has to be a power of 2 -- Execute in-place module (XIP) -- XIP_EN : boolean := false; -- implement execute in place module (XIP)? @@ -320,8 +323,8 @@ architecture neorv32_top_rtl of neorv32_top is signal main_rsp, main2_rsp, dma_rsp : bus_rsp_t; -- core complex (CPU + caches + DMA) -- bus: main sections -- - signal imem_req, dmem_req, xip_req, boot_req, io_req, xbus_req : bus_req_t; - signal imem_rsp, dmem_rsp, xip_rsp, boot_rsp, io_rsp, xbus_rsp : bus_rsp_t; + signal imem_req, dmem_req, xip_req, boot_req, io_req, xcache_req, xbus_req : bus_req_t; + signal imem_rsp, dmem_rsp, xip_rsp, boot_rsp, io_rsp, xcache_rsp, xbus_rsp : bus_rsp_t; -- bus: IO devices -- type io_devices_enum_t is ( @@ -364,33 +367,34 @@ begin -- show main SoC configuration -- assert false report "[NEORV32] Processor Configuration: " & - cond_sel_string_f(MEM_INT_IMEM_EN, "IMEM ", "") & - cond_sel_string_f(MEM_INT_DMEM_EN, "DMEM ", "") & - cond_sel_string_f(INT_BOOTLOADER_EN, "BOOTROM ", "") & - cond_sel_string_f(ICACHE_EN, "I-CACHE ", "") & - cond_sel_string_f(DCACHE_EN, "D-CACHE ", "") & - cond_sel_string_f(XBUS_EN, "XBUS ", "") & - cond_sel_string_f(io_gpio_en_c, "GPIO ", "") & - 
cond_sel_string_f(IO_MTIME_EN, "MTIME ", "") & - cond_sel_string_f(IO_UART0_EN, "UART0 ", "") & - cond_sel_string_f(IO_UART1_EN, "UART1 ", "") & - cond_sel_string_f(IO_SPI_EN, "SPI ", "") & - cond_sel_string_f(IO_SDI_EN, "SDI ", "") & - cond_sel_string_f(IO_TWI_EN, "TWI ", "") & - cond_sel_string_f(io_pwm_en_c, "PWM ", "") & - cond_sel_string_f(IO_WDT_EN, "WDT ", "") & - cond_sel_string_f(IO_TRNG_EN, "TRNG ", "") & - cond_sel_string_f(IO_CFS_EN, "CFS ", "") & - cond_sel_string_f(IO_NEOLED_EN, "NEOLED ", "") & - cond_sel_string_f(io_xirq_en_c, "XIRQ ", "") & - cond_sel_string_f(IO_GPTMR_EN, "GPTMR ", "") & - cond_sel_string_f(XIP_EN, "XIP ", "") & - cond_sel_string_f(IO_ONEWIRE_EN, "ONEWIRE ", "") & - cond_sel_string_f(IO_DMA_EN, "DMA ", "") & - cond_sel_string_f(IO_SLINK_EN, "SLINK ", "") & - cond_sel_string_f(IO_CRC_EN, "CRC ", "") & - cond_sel_string_f(true, "SYSINFO ", "") & -- always enabled - cond_sel_string_f(ON_CHIP_DEBUGGER_EN, "OCD ", "") & + cond_sel_string_f(MEM_INT_IMEM_EN, "IMEM ", "") & + cond_sel_string_f(MEM_INT_DMEM_EN, "DMEM ", "") & + cond_sel_string_f(INT_BOOTLOADER_EN, "BOOTROM ", "") & + cond_sel_string_f(ICACHE_EN, "I-CACHE ", "") & + cond_sel_string_f(DCACHE_EN, "D-CACHE ", "") & + cond_sel_string_f(XBUS_EN, "XBUS ", "") & + cond_sel_string_f(XBUS_EN and XBUS_CACHE_EN, "XCACHE ", "") & + cond_sel_string_f(io_gpio_en_c, "GPIO ", "") & + cond_sel_string_f(IO_MTIME_EN, "MTIME ", "") & + cond_sel_string_f(IO_UART0_EN, "UART0 ", "") & + cond_sel_string_f(IO_UART1_EN, "UART1 ", "") & + cond_sel_string_f(IO_SPI_EN, "SPI ", "") & + cond_sel_string_f(IO_SDI_EN, "SDI ", "") & + cond_sel_string_f(IO_TWI_EN, "TWI ", "") & + cond_sel_string_f(io_pwm_en_c, "PWM ", "") & + cond_sel_string_f(IO_WDT_EN, "WDT ", "") & + cond_sel_string_f(IO_TRNG_EN, "TRNG ", "") & + cond_sel_string_f(IO_CFS_EN, "CFS ", "") & + cond_sel_string_f(IO_NEOLED_EN, "NEOLED ", "") & + cond_sel_string_f(io_xirq_en_c, "XIRQ ", "") & + cond_sel_string_f(IO_GPTMR_EN, "GPTMR ", "") & + 
cond_sel_string_f(XIP_EN, "XIP ", "") & + cond_sel_string_f(IO_ONEWIRE_EN, "ONEWIRE ", "") & + cond_sel_string_f(IO_DMA_EN, "DMA ", "") & + cond_sel_string_f(IO_SLINK_EN, "SLINK ", "") & + cond_sel_string_f(IO_CRC_EN, "CRC ", "") & + cond_sel_string_f(true, "SYSINFO ", "") & -- always enabled + cond_sel_string_f(ON_CHIP_DEBUGGER_EN, "OCD ", "") & "" severity note; @@ -909,6 +913,8 @@ begin -- ------------------------------------------------------------------------------------------- neorv32_xbus_inst_true: if XBUS_EN generate + + -- bus gateway (Wishbone) -- neorv32_xbus_inst: entity neorv32.neorv32_xbus generic map ( BUS_TIMEOUT => XBUS_TIMEOUT, @@ -920,8 +926,8 @@ begin port map ( clk_i => clk_i, rstn_i => rstn_sys, - bus_req_i => xbus_req, - bus_rsp_o => xbus_rsp, + bus_req_i => xcache_req, + bus_rsp_o => xcache_rsp, -- xbus_adr_o => xbus_adr_o, xbus_dat_i => xbus_dat_i, @@ -933,7 +939,33 @@ begin xbus_ack_i => xbus_ack_i, xbus_err_i => xbus_err_i ); - end generate; + + -- external bus cache (XCACHE) -- + neorv32_xcache_inst_true: + if XBUS_CACHE_EN generate + neorv32_xcache_inst: entity neorv32.neorv32_cache + generic map ( + NUM_BLOCKS => XBUS_CACHE_NUM_BLOCKS, + BLOCK_SIZE => XBUS_CACHE_BLOCK_SIZE, + UC_BEGIN => uncached_begin_c(31 downto 28) + ) + port map ( + clk_i => clk_i, + rstn_i => rstn_sys, + host_req_i => xbus_req, + host_rsp_o => xbus_rsp, + bus_req_o => xcache_req, + bus_rsp_i => xcache_rsp + ); + end generate; + + neorv32_xcache_inst_false: + if not XBUS_CACHE_EN generate + xcache_req <= xbus_req; + xbus_rsp <= xcache_rsp; + end generate; + + end generate; -- /neorv32_xbus_inst_true neorv32_xbus_inst_false: if not XBUS_EN generate @@ -1529,29 +1561,22 @@ begin CLOCK_FREQUENCY => CLOCK_FREQUENCY, CLOCK_GATING_EN => CLOCK_GATING_EN, INT_BOOTLOADER_EN => INT_BOOTLOADER_EN, - -- Internal Instruction memory -- MEM_INT_IMEM_EN => MEM_INT_IMEM_EN, MEM_INT_IMEM_SIZE => imem_size_c, - -- Internal Data memory -- MEM_INT_DMEM_EN => MEM_INT_DMEM_EN, 
MEM_INT_DMEM_SIZE => dmem_size_c, - -- Reservation Set Granularity -- AMO_RVS_GRANULARITY => AMO_RVS_GRANULARITY, - -- Instruction cache -- ICACHE_EN => ICACHE_EN, ICACHE_NUM_BLOCKS => ICACHE_NUM_BLOCKS, ICACHE_BLOCK_SIZE => ICACHE_BLOCK_SIZE, ICACHE_ASSOCIATIVITY => ICACHE_ASSOCIATIVITY, - -- Data cache -- DCACHE_EN => DCACHE_EN, DCACHE_NUM_BLOCKS => DCACHE_NUM_BLOCKS, DCACHE_BLOCK_SIZE => DCACHE_BLOCK_SIZE, - -- External bus interface -- XBUS_EN => XBUS_EN, XBUS_BIG_ENDIAN => XBUS_BIG_ENDIAN, - -- On-Chip Debugger -- + XBUS_CACHE_EN => XBUS_CACHE_EN, ON_CHIP_DEBUGGER_EN => ON_CHIP_DEBUGGER_EN, - -- Processor peripherals -- IO_GPIO_EN => io_gpio_en_c, IO_MTIME_EN => IO_MTIME_EN, IO_UART0_EN => IO_UART0_EN, diff --git a/sim/neorv32_tb.vhd b/sim/neorv32_tb.vhd index eb69be84e..81550510f 100644 --- a/sim/neorv32_tb.vhd +++ b/sim/neorv32_tb.vhd @@ -269,6 +269,9 @@ begin XBUS_BIG_ENDIAN => false, -- byte order: true=big-endian, false=little-endian XBUS_ASYNC_RX => true, -- use register buffer for RX data when false XBUS_ASYNC_TX => true, -- use register buffer for TX data when false + XBUS_CACHE_EN => true, -- enable external bus cache (x-cache) + XBUS_CACHE_NUM_BLOCKS => 64, -- x-cache: number of blocks (min 1), has to be a power of 2 + XBUS_CACHE_BLOCK_SIZE => 32, -- x-cache: block size in bytes (min 4), has to be a power of 2 -- Execute in-place module (XIP) -- XIP_EN => true, -- implement execute in place module (XIP)? XIP_CACHE_EN => true, -- implement XIP cache? 
diff --git a/sim/simple/neorv32_tb.simple.vhd b/sim/simple/neorv32_tb.simple.vhd index a5efcb6bc..6f27e773f 100644 --- a/sim/simple/neorv32_tb.simple.vhd +++ b/sim/simple/neorv32_tb.simple.vhd @@ -217,6 +217,9 @@ begin XBUS_BIG_ENDIAN => false, -- byte order: true=big-endian, false=little-endian XBUS_ASYNC_RX => false, -- use register buffer for RX data when false XBUS_ASYNC_TX => false, -- use register buffer for TX data when false + XBUS_CACHE_EN => true, -- enable external bus cache (x-cache) + XBUS_CACHE_NUM_BLOCKS => 4, -- x-cache: number of blocks (min 1), has to be a power of 2 + XBUS_CACHE_BLOCK_SIZE => 32, -- x-cache: block size in bytes (min 4), has to be a power of 2 -- Execute in-place module (XIP) -- XIP_EN => true, -- implement execute in place module (XIP)? XIP_CACHE_EN => true, -- implement XIP cache? diff --git a/sw/lib/include/neorv32_sysinfo.h b/sw/lib/include/neorv32_sysinfo.h index 020fe184b..ff8c654bb 100644 --- a/sw/lib/include/neorv32_sysinfo.h +++ b/sw/lib/include/neorv32_sysinfo.h @@ -73,7 +73,8 @@ enum NEORV32_SYSINFO_SOC_enum { SYSINFO_SOC_XBUS_ENDIAN = 4, /**< SYSINFO_SOC (4) (r/-): External bus interface uses BIG-endian byte-order when 1 (via XBUS_BIG_ENDIAN generic) */ SYSINFO_SOC_ICACHE = 5, /**< SYSINFO_SOC (5) (r/-): Processor-internal instruction cache implemented when 1 (via ICACHE_EN generic) */ SYSINFO_SOC_DCACHE = 6, /**< SYSINFO_SOC (6) (r/-): Processor-internal instruction cache implemented when 1 (via DCACHE_EN generic) */ - SYSINFO_SOC_CLOCK_GATING = 7, /**< SYSINFO_SOC (7) (r/-): Clock gating enabled when 1 (via CLOCK_GATING_EN generic) */ + SYSINFO_SOC_CLOCK_GATING = 7, /**< SYSINFO_SOC (7) (r/-): Clock gating implemented when 1 (via CLOCK_GATING_EN generic) */ + SYSINFO_SOC_XBUS_CACHE = 8, /**< SYSINFO_SOC (8) (r/-): External bus cache implemented when 1 (via XBUS_CACHE_EN generic) */ SYSINFO_SOC_IO_CRC = 12, /**< SYSINFO_SOC (12) (r/-): Cyclic redundancy check unit implemented when 1 (via IO_CRC_EN generic) */ 
SYSINFO_SOC_IO_SLINK = 13, /**< SYSINFO_SOC (13) (r/-): Stream link interface implemented when 1 (via IO_SLINK_EN generic) */ diff --git a/sw/lib/source/neorv32_rte.c b/sw/lib/source/neorv32_rte.c index 59c0e5802..f60bd6384 100644 --- a/sw/lib/source/neorv32_rte.c +++ b/sw/lib/source/neorv32_rte.c @@ -683,12 +683,18 @@ void neorv32_rte_print_hw_config(void) { neorv32_uart0_printf("Ext. bus interface: "); tmp = NEORV32_SYSINFO->SOC; if (tmp & (1 << SYSINFO_SOC_XBUS)) { - neorv32_uart0_printf("Wishbone b4 "); + neorv32_uart0_printf("Wishbone-b4 "); if (tmp & (1 << SYSINFO_SOC_XBUS_ENDIAN)) { - neorv32_uart0_printf("big-endian\n"); + neorv32_uart0_printf("big-endian"); } else { - neorv32_uart0_printf("little-endian\n"); + neorv32_uart0_printf("little-endian"); + } + if (tmp & (1 << SYSINFO_SOC_XBUS_CACHE)) { + neorv32_uart0_printf(" x-cache\n"); + } + else { + neorv32_uart0_printf("\n"); } } else { diff --git a/sw/svd/neorv32.svd b/sw/svd/neorv32.svd index 3ae8892c9..8c3f9b3b0 100644 --- a/sw/svd/neorv32.svd +++ b/sw/svd/neorv32.svd @@ -1614,6 +1614,7 @@ SYSINFO_SOC_ICACHE[5:5]Processor-internal instruction cache implemented SYSINFO_SOC_DCACHE[6:6]Processor-internal data cache implemented SYSINFO_SOC_CLOCK_GATING[7:7]Clock gating implemented + SYSINFO_XBUS_CACHE_EN[8:8]External bus cache implemented SYSINFO_SOC_IO_CRC[12:12]Cyclic redundancy check unit implemented SYSINFO_SOC_IO_SLINK[13:13]Stream link interface implemented SYSINFO_SOC_IO_DMA[14:14]Direct memory access controller implemented