Merge main into explicit fsm compilation #2381

parthsarkar17 · 2024-12-18T21:13:04Z

No description provided.

…#2314) *Note: This PR is still a draft because I have some questions about the changes I made here that I wanted to ask about. Once those concerns are resolved I will turn this into a full PR.* ## This PR contains: - `fud2/scripts/profiler.rhai`: fud2 support for the profiler. I added a new state `flamegraph` (the `svg` file containing the flame graph) and a new operation called `profiler`, which takes a Calyx file and produces the profiled flame graph. - Updates to existing profiling scripts to read instrumented cell signals and allow toggling between profiling optimized and non-optimized Calyx programs. - Updated fud2 tests to reflect new state and operation. ## Usage First, clone https://github.com/brendangregg/FlameGraph and edit the `fud2` configuration file to specify the location of `flamegraph.pl`: ex) ``` [flamegraph] script = "/home/ayaka/projects/FlameGraph/flamegraph.pl" ``` To obtain a flame graph from a Calyx program, specify the output as a `svg` file. ex) ``` fud2 tests/correctness/while.futil -o while.svg -s sim.data=tests/correctness/while.futil ``` will produce the below flame graph. ![image](https://github.com/user-attachments/assets/b257ecb6-9f45-49b7-9555-f68e3212403e) To obtain a flame graph for the non-compiler-optimized version of the Calyx program, add `-s passes=no-opt`: ex) ``` fud2 tests/correctness/while.futil -o flame.svg -s sim.data=tests/correctness/while.futil.data -s passes=no-opt ```

…ogging (#2319) Sorry for the larger PR here, a bunch of stuff ended up getting all tangled together. This PR adds: - a check for undefined guards after convergence. This is currently disabled by default as the changes to `@control` mean that this will basically never be relevant for normal programs. - redoes internal error handling to make proper error messages more enforceable by the type system - Adds a configuration struct for runtime checks/behaviors - Adds optional debug logging via `--debug-logging` which prints out the assignments that fire and the results they propagate alongside the implicit zero assignments. Might be useful for some people under some circumstances? - Adjust the `@control` port process to match that used by the compiler. Consequently for most programs we simulate, every port is `@control`. This means some error cases are no longer errors and also reintroduces transient conflicts on some programs. - As a spot fix to the transient conflicts, when detected we will now check if the guard for the first assignment still evaluates to true and if not we take the later assignment and continue onward, otherwise an error is raised. In all likelihood this won't mitigate every instance of the transient conflict problem but seems to address it for our current programs.

This reduces the size of BitVecValue from 32 bytes to 24 bytes.

Closes #2321

…#2323) A quick patch for the `fud2` flow which makes it possible to run without supplying a `sim.data` value which would otherwise block things. This hopefully makes testing small programs more straightforward as it no longer requires generating an empty data file or using direct invocations of the tools.

Extract floating-point support for `fud` from #1928

Add a `calyx.lib_path` argument for the `fud2` stage.

This PR removes ops from `lib.rs` which are duplicated in Rhai scripts.

Changes from #2266 since author has not responded. Fixes #2253.

Removes all the old btor code that is no longer used.

Quick set of renaming to reduce the confusion with the `flags` stuff which should make it clear where things belong

@EclecticGriffin

Adding `jq` stage to `fud2` but I've been having trouble getting the stage to print out the value from the command. It seems like the `jq ... > $out` call gobbles up the output. Not sure if I'm making a mistake. Any thoughts @EclecticGriffin @ekiwi or @sampsyo?

@ekiwi

Fixes #2336. Currently running into the same problem as #2335 (comment) @ekiwi how did you fix it on the Rust side?

Moving all tests in the root `runt.toml` to use `fud2`.

Sorry, I overlooked Rachit's comment on the previous PR.

This PR adds a documentation page for the `calyx-pass-explorer` tool, serving as a gentle introduction to its usage as well as an overview of its code. - I've added `mdbook-callouts` as a way to get Obsidian-style quote blocks. Specifically, I'm using this for the `[!TIP]` environment. I can remove this if we don't want another dependency. - I did some unrelated things (to `docs/contributors.md`, `docs/github.md`, and `docs/intro.md`) that I can move into another PR if that would be better.

This allows us to remove most uses of `jq` in the tests.

I did some silly hacking late last week that might get us marginally closer to unifying all the data marshaling under a single tool. This adds the ability to specify a `--to dat` (or one of the aliases) and an output directory which will generate the hex encoded files that verilator/icarus expect. There are some minor differences with the dat files generated by the python flow. Mainly, the python flow truncates leading zeroes in the encoding while I've elected to retain them in the interest of keeping things simple. Python generates: ``` 4B 53 21 5D 1E 5E 2B B 3C 60 ``` while the data-converter generates ``` 0000004B 00000053 00000021 0000005D 0000001E 0000005E 0000002B 0000000B 0000003C 00000060 ``` This also adds the ability to deserialize this style of data dump but is slightly brittle at the moment since it expect the following: - the data header is exactly that used by the tool and is cbor encoded in a file named `header`. The python currently json encodes things in a file named `shape` - the input data includes all leading zeroes Both these assumptions can probably be relaxed in the future in the interest of robustness

Creates a new `BaseSimulator` object that can be used independently of `Simulator`. Also, things are cloneable now. The rationale is we can now write a model checker using cider by cloning cider contexts at forks.

Configured various files and edited the parser to print out nodes in the format we talked about. Biggest function implemented was string_path in **program_counter.rs**

I spent some more time hacking on this instead of spending my time in a more productive way. This does the following: - switch the `dat` deserializing from custom matching stuff to a proper `nom` parser that accounts for comments and leading `0x` tags. - `dat` files can now parse values with leading zeroes truncated (though we continue to generate `dat` files with the leading zeroes included) - Output `dat` files will be generated by default as `MEMNAME.dat` though this can be customized with `-e` flag. I.e. `-e out` will generate `MEMNAME.out` - Similarly, when reading in a `dat` directory, the tool will look for `MEMNAME.dat` which can be retargeted via `-e` flag. - The tool will also infer the `--to json` target when given a directory as input

Closes #2344 since I think it's a relative paths issue.

Actually closes #2344; see #2361 for the failed previous attempt.

Created a parser to process the path we've described such as: ".-0-1-b"

What is says on the tin (as discussed on this [slack thread](https://cucapra.slack.com/archives/C06CV424G94/p1727223025447109)) Due to a new [release](https://releases.rs/docs/1.83.0/) for rust, CI catches a few errors (unrelated to this PR). For now, we simply pin to an older version: - remove toolchain argument for `setup-rust-toolchain@v1` in `rust.yml` - add `rust-toolchain.toml`

Adding functions to check if values are special IEEE values (Nan, infinity, denormalized)

Finally, rename the cider library to be `cider` instead of `interp`.

…#2377) A quick touch up of the `GlobalPositionTable` to avoid UB without getting too fancy. Main changes are use of `boxcar::Vec` instead of the standard library Vec, so that we can now have lock-free concurrent appends and accesses. This fits our use-case since we never de-allocate files & positions and frees us from all the issues associated with using locks. Since we're not using locks, we also don't need any unsafe to extend lifetimes, and since `boxcar::Vec` doesn't reallocate stored elements, we don't need to box the stored stuff. I also use `LazyLock` instead of `lazy_static!` which is easier to deal with. The end result is a global position table that can be accessed through associated functions on the `GlobalPositionTable` type, rather than methods on an instance. I also changed the `String`s stored by `File` to `Box<str>`, because we don't ever mutate them. --- In the process of this, I ran face first in CI issues. Turns out that because some of the tests run in the docker container, they aren't actually building using the version of rust in `rust-toolchain.toml` and instead were using the version of rust from the container (1.76). This broke things since `LazyLock` was only stabilized in 1.80. I had a real fight with the CI because it turns out that using the standard actions in the docker container is not a straightforward as one would like. I'll skip a recount of the entire ordeal---peruse the commits to witness my pain---but I resolved things and made these tests consistent with the rest of CI by specifically pulling the toolchain file before running the rust install action. The end result is that even the docker container tests do respect the indicated rust version, and we should be able to bump the version of rust used by _EVERYTHING_ (if this isn't true I may scream) by updating the file.

The WIP [PR that tries to get an `yxi` powered end to end xilinx tools workflow up and running](#2267) ballooned into a beast and has gotten pretty stale. This is me trying to break things down into parts and slowly get everything merged. This changes the `*axi_generator.py` files to use underscores, which is needed for proper importing afaict. Also snuck in formatting changes to `dynamic_axi_generator.py`. These are pretty minor changes so going to merge when tests pass.

The existing instructions did not work to add `fud2` to my path (locally on macOS). FWICT `cargo install --path fud2` did. Maybe we also want to get rid of the previous instructions? Not sure if they are still valid.

Another part of #2267 that makes sense to live on it's own. Sorry for so many small PRs! But whittling things down like this is helping me better remember where everything stands

AMD decided to bork the old links to a bunch of useful docs, this updates that. Just comment changes so going to auto merge

Tiny patch to replace `OnceLock` since `LazyLock` works in this case and is slightly simpler

Turns out there was no reset input port to `comb_mem_d4` in `comb.futil`! 🤣

This PR contains a series of minor changes I made over the past couple of weeks. I made some small changes to non-profiler parts of the repo: - Adding `*.svg` and `*.folded` to the gitignore - Adding a profiler-specific pass alias (`-p profiler`) - Using `-p profiler` in fud2 so there's no inconsistency in what passes we use. In profiler-specific code, I fixed a bug where the flame graph recorded one extra cycle for `main`, added colors to categorize nodes in the aggregate tree picture (between cells, groups, and primitives), and have the profiler emit a timeline JSON involving all cells by default.

Closes #2288 I've checked that this grabs the correct values. Say our `main_utilization_placed.rpt` file had the following BRAM table ![nine](https://github.com/user-attachments/assets/9f84ee86-deb2-427e-9c27-0f35db4ed39c) (not a real utilization file, I changed the highlighted entry from 0 to 9 to make sure I'm grabbing the correct value) Running `fud e --from synth-files --to resource-estimate report` yields the correct BRAM count of `9`: ``` { "lut": 169, "dsp": 0, "brams": 9, "registers": 60, "muxes": 30, "clb_registers": 210, "carry8": 10, "f7_muxes": 0, "f8_muxes": 0, "f9_muxes": 0, "clb": 389, "meet_timing": 1, "worst_slack": 4.758, "period": 7.0, "frequency": 142.857, "uram": -1, "cell_lut1": 5, "cell_lut2": 110, "cell_lut3": 38, "cell_lut4": 17, "cell_lut5": 18, "cell_lut6": 49, "cell_fdre": 210 } ```

Closing #2304, we can generate data files for queues directly from packet captures using [`parse_pcap.py`](https://github.com/calyxir/calyx/blob/80b0ef49fc5ce0b7277bffd12a7596e588b72d53/frontends/queues/evaluation/parse_pcap.py)! Changes: - add `parse_pcap.py`, a PCAP parser for generating data files - move synthesis, parsing, and plotting scripts into `evaluation/` - add `pcap_sim.py`, a harness similar to `queue_call` for iterating through commands in our data file - add `insert_tuple_flow_inference` to `flow_inference.py`, flow inference to be used with `pcap_sim.py` - remove `generate` functions from `binheap/round_robin.py`, `binheap/strict.py`, and `strict_or_rr.py` - cleaner seperation of queue components and tests - tweak round robin and strict tests to take command line argument `sim-pcap` - if `sim-pcap` is True, use `pcap_sim.py` instead of `queue_call.py` - add `calyx_py-to-calyx.rhai` to `fud2/scripts/` - creates a state for CalyxPy and an operation from CalyxPy to Calyx - add associated `fud2` tests to pass CI - add `plot_pcap_sim.py`, a script for generating Gantt charts from output memories ## Example Say you want a data file of the format used by our [Testing Harness](https://docs.calyxir.org/frontends/queues.html#shared-testing-harness), with timing based on packets 10 through 20 of `kachow.pcap`. Simply run ``` python3 parse_pcap.py kachow.pcap kachow.data --start 10 --end 20 ``` This produces two outputs - `kachow.data`, the data file - various statistics about the data file and PCAP: namely, `num_cmds`, the size of the `commands` memory in `kachow.data` We can run `kachow.data` using `sim_pcap.py` on any of our round robin or strict queues: 1. pick your favorite round robin or strict test in `tests/` 2. generate Calyx with `python3 <num_cmds> --sim-pcap <test_name>.py` 3. run the produced Calyx with data file `kachow.data` and your simulator of choice For example, we can combining steps 2 and 3 into a single `fud2` command ``` fud2 rr_2flow_test.py -o rr_2flow_test.json \ --through verilator \ -s sim.data=kachow.data \ -s py.args="<num_cmds> --sim-pcap" \ ``` which produces the output memories `rr_2flow_test.json`. We can visualize these memories with graphs, in the style of _Formal Abstractions_, by feeding them to `plot_pcap_sim.py`: ``` python3 plot_pcap_sim.py rr_2flow_test.json ``` This generates the Gantt chart `rr_2flow_test.png`. --------- Co-authored-by: Anshuman Mohan <10830208+anshumanmohan@users.noreply.github.com>

rachitnigam · 2025-01-24T03:30:49Z

@parthsarkar17 did you mean to open this? In general, when you need to merge main into your working branch, you can just do:

# from your branch
git fetch --all
git checkout main && git pull
git checkout - # '-' goes back to the last branch
git merge main

You don't have to open PRs for this; usually PRs are opened when you're merging into main.

@evanmwilliams

This patch is the first half of #1928 We use Morty: https://github.com/pulp-platform/morty to pickle Verilog/System Verilog files needed as libraries. Acknowledgement to @evanmwilliams for the original implementation.

@ayakayorihiro

An initial redesign of the metadata table. This PR ended up a little chunky, so to break things down it does the following: - Introduces a `SetAttribute` to Calyx which stores multiple unique numbers rather than only one. The syntax is `@Attribute_name{ n1, n2, n3 }` and `"attribute_name"={ n1, n2, n3 }` depending on the context. - Move `pos` from `Attribute` to `SetAttribute`. If a `pos` attribute is written with the standard syntax, the compiler will emit a warning and ignore the attribute. E.g. `@pos(1)` is incorrect and will be ignored since it should be `@pos{1}`. - Also adds an `Unknown` set attribute, which works the same as its corresponding variant in the standard attributes - Adds a new `MetadataTable` construct which is parsed in a `fileinfo #{ .. }#` block and made available in the `Workspace` and `Context` structures as an optional field. - At present, the table maps file identifiers to paths and positions to pairs of files and line numbers. We will likely want to extend this in the future. - The actual syntax of the table is a very basic placeholder which we can iterate on later as needed - There is currently a limited metadata lookup construct which will read the file contents into memory and return the given lines. It's pretty brittle right now - We may in the future want a special signifier for the current file / input stream as calyx->calyx mappings will want this. Though perhaps there is a better way of approaching this - Also moves the cider index implementation into a separate crate `cider-idx` as we were initially going to use that with the new table, but we moved away from that approach for the moment, so these changes are largely irrelevant. Didn't want to do git magic since it was already pushed and the branch is being used by @ayakayorihiro.

Another PR whittling down the changes present in #2267. This PR changes the toplevel clock signal from `clk` to `ap_clk`. This should allow us to reuse the existing [`gen_xo.tcl`](https://github.com/calyxir/calyx/blob/main/fud2/rsrc/gen_xo.tcl#L34-L37) and maintain backwards compatibility with the old verilog-axi-wrapper, without requiring a distinction between the two in the `gen_xo.tcl` file. Also fixes a bsd vs gnu `sed` syntax error. The new command should work on both. (bsd sed require terminating commands to end with a `;`, gnu does not) I'll also note that this PR touches a bunch of runt cocotb tests in `tests/axi`. I'm aware that there is a desire to get rid of large snapshots as they tend to get ignored. For now this setup is the best thing I have to make sure breaking changes aren't introduced to either axi-wrapper, so choosing to update them for now.

A quick and dirty integration of the newer sourceinfo into Cider such that requesting the program counter will display the source filename & line when available, similar to how it used to work with the embedded metadata lines.

A quick PR that modifies the `fud2` execution of the debugger and some of the debugger internals to make it possible to run tests on its behavior. Using this, it adds a quick test for the display of the new source info.

Older targets, such as the zynq 7020 have different table names for LUT counts, namely that they group LUTs by slice rather than CLB. Usually, the extraction script errors here as it is unable to find the right tables and rows. The `try catch` method is sort of janky, maybe someone has a better idea? Here is an example on the zynq: ``` 1. Slice Logic -------------- +----------------------------+-------+-------+-----------+-------+ | Site Type | Used | Fixed | Available | Util% | +----------------------------+-------+-------+-----------+-------+ | Slice LUTs | 48018 | 0 | 53200 | 90.26 | | LUT as Logic | 45075 | 0 | 53200 | 84.73 | | LUT as Memory | 2943 | 0 | 17400 | 16.91 | | LUT as Distributed RAM | 0 | 0 | | | | LUT as Shift Register | 2943 | 0 | | | | Slice Registers | 70984 | 0 | 106400 | 66.71 | | Register as Flip Flop | 70984 | 0 | 106400 | 66.71 | | Register as Latch | 0 | 0 | 106400 | 0.00 | | F7 Muxes | 0 | 0 | 26600 | 0.00 | | F8 Muxes | 0 | 0 | 13300 | 0.00 | +----------------------------+-------+-------+-----------+-------+ ``` --------- Co-authored-by: Rachit Nigam <rachit.nigam12@gmail.com>

rachitnigam · 2025-02-03T02:48:05Z

@parthsarkar17 do you need this PR still? It's getting stale.

@EclecticGriffin

The main page for the repo continues to become unmanageably long. This is an attempt to reorg things to make the `calyx` stuff be under an umbrella. I suspect `cider` stuff (`cider-dap` and `cider-idx`) should be similarly organized but didn't want to move without @EclecticGriffin's permission. Fixes #1941

parthsarkar17 · 2025-02-03T15:45:49Z

Agh sorry Rachit, thank you for closing this.

ayakayorihiro and others added 30 commits October 29, 2024 18:42

bump baa to 0.14.0 (#2320)

2f1af6f

This reduces the size of BitVecValue from 32 bytes to 24 bytes.

Add wait and break to reserved names (#2322)

b084974

Closes #2321

Add bypass_reg and skid_buffer primitives to core (#2324)

c455733

Floating-point support in fud (#2318)

720e287

Extract floating-point support for `fud` from #1928

Separate flag for specifying lib path in fud2 (#2325)

d1f43c6

Add a `calyx.lib_path` argument for the `fud2` stage.

add calyx.lib_path argument for fud stage (#2326)

8294be8

Remove duplicate ops from lib.rs. (#2272)

3687b07

This PR removes ops from `lib.rs` which are duplicated in Rhai scripts.

Remove use of always_comb from binary_operators.sv (#2330)

95fb141

Changes from #2266 since author has not responded. Fixes #2253.

remove btor code (#2332)

48f5002

Removes all the old btor code that is no longer used.

[Cider | Fud2] Rename fud2 stuff to reduce confusion (#2334)

d87b1a2

Quick set of renaming to reduce the confusion with the `flags` stuff which should make it clear where things belong

support quiet mode execution for fud2 (#2339)

351af02

Disable rounding for float conversion (#2337)

f44cf8f

Fixes #2336. Currently running into the same problem as #2335 (comment) @ekiwi how did you fix it on the Rust side?

fud2 Migrate main test suite (#2340)

1e9e3b8

Moving all tests in the root `runt.toml` to use `fud2`.

add floating point support to cider (#2335)

42880f1

cider: use jq stage (#2342)

1acc612

Sorry, I overlooked Rachit's comment on the previous PR.

data-converted: always sort keys in output JSON (#2343)

fd43d16

This allows us to remove most uses of `jq` in the tests.

[cider2] Make minimally-sized clonable simulator (#2345)

33255f9

Creates a new `BaseSimulator` object that can be used independently of `Simulator`. Also, things are cloneable now. The rationale is we can now write a model checker using cider by cloning cider contexts at forks.

Seq (#2346)

dd55f4a

Configured various files and edited the parser to print out nodes in the format we talked about. Biggest function implemented was string_path in **program_counter.rs**

[docs] Fix images in docs for calyx-pass-explorer (#2361)

097f1df

Closes #2344 since I think it's a relative paths issue.

[docs] Fix images in docs for calyx-pass-explorer (#2362)

cc24c34

Actually closes #2344; see #2361 for the failed previous attempt.

Parser (#2353)

154da35

Created a parser to process the path we've described such as: ".-0-1-b"

Ieee special values (#2367)

23d78cd

Adding functions to check if values are special IEEE values (Nan, infinity, denormalized)

EclecticGriffin and others added 13 commits December 13, 2024 18:45

[Cider] rename interp to cider (#2376)

727912a

Finally, rename the cider library to be `cider` instead of `interp`.

quick doc fixes

1505e5d

upgrade baa, fst-writer and fraction dependencies (#2382)

e0b02ed

Add alternative fud2 installation instructions (#2383)

23d466e

The existing instructions did not work to add `fud2` to my path (locally on macOS). FWICT `cargo install --path fud2` did. Maybe we also want to get rid of the previous instructions? Not sure if they are still valid.

Remove redundant output_with_attributes from python builder (#2385)

7e05ef9

Another part of #2267 that makes sense to live on it's own. Sorry for so many small PRs! But whittling things down like this is helping me better remember where everything stands

Update xilinx doc links (#2387)

687fbb3

AMD decided to bork the old links to a bunch of useful docs, this updates that. Just comment changes so going to auto merge

[Cider] replace OnceLock with LazyLock (#2390)

cbf9be0

Tiny patch to replace `OnceLock` since `LazyLock` works in this case and is slightly simpler

Add reset to comb_mem_d4 (#2389)

5f5bb70

Turns out there was no reset input port to `comb_mem_d4` in `comb.futil`! 🤣

rachitnigam added the S: Stale label Jan 24, 2025

jiahanxie353 and others added 2 commits January 24, 2025 10:47

github-actions bot removed the S: Stale label Jan 25, 2025

nathanielnrn and others added 4 commits January 26, 2025 18:29

[Cider] Add test for debugger use of source info (#2401)

9b12564

A quick PR that modifies the `fud2` execution of the debugger and some of the debugger internals to make it possible to run tests on its behavior. Using this, it adds a quick test for the display of the new source info.

rachitnigam added the S: Stale label Feb 3, 2025

Update cargo packages (#2408)

03d4191

github-actions bot removed the S: Stale label Feb 3, 2025

rachitnigam added 2 commits February 3, 2025 10:23

Remove data-conversion folder (#2406)

528aebf

rachitnigam closed this Feb 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge main into explicit fsm compilation #2381

Merge main into explicit fsm compilation #2381

parthsarkar17 commented Dec 18, 2024

rachitnigam commented Jan 24, 2025

rachitnigam commented Feb 3, 2025

parthsarkar17 commented Feb 3, 2025

Merge main into explicit fsm compilation #2381

Merge main into explicit fsm compilation #2381

Conversation

parthsarkar17 commented Dec 18, 2024

rachitnigam commented Jan 24, 2025

rachitnigam commented Feb 3, 2025

parthsarkar17 commented Feb 3, 2025