Skip to content

[Pack][Timing] Abstracted How Timing is Used in the Packer #2965

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Conversation

AlexandreSinger
Copy link
Contributor

Timing was intermixed into the packer. It appears as though the code originally was designed to recalculate the timing information every so often in the packer, but the idea was abandoned. This left timing code in disperse locations around the Packer and the timing was being recomputed every time clustering was restarted which was unecessary.

Collecting all of the timing information from the Packer into a single object called PreClusterTimingManager which abstracts all of the timing info in the Packer.

The ultimate goal is to bring this Manager class into the AP flow to be used together with the Global Placer. By sharing this manager class, the AP flow may be able to update the timing info with flat placement information to make the timing more accurate.

@github-actions github-actions bot added VPR VPR FPGA Placement & Routing Tool lang-cpp C/C++ code labels Apr 6, 2025
@AlexandreSinger AlexandreSinger force-pushed the feature-pack-timing-manager branch from 7a1bfb4 to 509d529 Compare April 6, 2025 14:04
@vaughnbetz
Copy link
Contributor

Should do a QoR run to check it is all good (run Titan and/or VTR and confirm equivalent QoR or no change). Linking a spreadsheet would be best.

class PreClusterTimingManager {
public:
/**
* @brief Constructor for the manager class. Builds the timing graph.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really build the timing graph? Or is it already built?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say what this constructor does: it performs a setup timing analysis with a pre-clustering delay model, and caches some criticality information for later queries. The delay model used for the timing analysis is in ?? and uses the primitive delays specified in the architecture file and a simple estimate of routing (a typical general routing delay based on the wire delays found in the architecture, and more specific delays for direct connections like carry chains whose use we already know from the pre-packing).

(Confirm what I said is true).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the comment above could go in the delay calculator -- probably best to split it.

Copy link
Contributor Author

@AlexandreSinger AlexandreSinger Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vaughnbetz After reviewing the code, I am still not sure if the carry-chains are being used in the timing analyzer directly. When computing the max edge delay, the code performs the following:
image
If the edge between two atoms in a carry chain are not labelled at combinational, then the inter_cluster_net_delay will be used instead.

The code that labels these edges is (I think) created by the TimingGraphBuilder, which builds the timing graph as part of initializing VPR (prior to packing). Thus, it could not know that the edge between two atoms within the same molecules are combinational (also, based on the name, it wouldn't make sense for them to be labelled combinational anyways). This implies to me that the delays within molecules are not being observed when they should. Someone should check the timing reports to ensure that this is correct.

If this is true, a simple fix to this would be to check if the edge exists within a molecule, and if so use the delay of the edge within the molecule instead of the interconnect delay (or at least the worst case of the molecule. This is because, if an atom was packed into a molecule, it must be packed together (as far as I am aware).

Should I open an issue on this? It's not directly in my path, and could be a good thing for someone else to look into.

@AlexandreSinger AlexandreSinger force-pushed the feature-pack-timing-manager branch from 509d529 to 9a5af54 Compare April 10, 2025 01:46
Timing was intermixed into the packer. It appears as though the code
originally was designed to recalculate the timing information every so
often in the packer, but the idea was abandoned. This left timing code
in disperse locations around the Packer and the timing was being
recomputed every time clustering was restarted which was unecessary.

Collecting all of the timing information from the Packer into a single
object called PreClusterTimingManager which abstracts all of the timing
info in the Packer.

The ultimate goal is to bring this Manager class into the AP flow to be
used together with the Global Placer. By sharing this manager class, the
AP flow may be able to update the timing info with flat placement
information to make the timing more accurate.
@AlexandreSinger AlexandreSinger force-pushed the feature-pack-timing-manager branch from 9a5af54 to 150f634 Compare April 10, 2025 02:40
@AlexandreSinger
Copy link
Contributor Author

QoR results on Titan:

  baseline pk_timing_mngr
vtr_flow_elapsed_time 1 0.997409372882633
num_LAB 1 1
num_DSP 1 1
num_M9K 1 1
num_M144K 1 1
max_vpr_mem 1 1.00241342168808
num_pre_packed_blocks 1 1
num_post_packed_blocks 1 1
device_grid_tiles 1 1
pack_time 1 1.00468192080581
placed_wirelength_est 1 1
place_time 1 1.00447348452275
placed_CPD_est 1 1
routed_wirelength 1 1
critical_path_delay 1 1
geomean_nonvirtual_intradomain_critical_path_delay 1 1
crit_path_route_time 1 1.00336920535485

There was absolutely no change in quality, and the change in memory / runtime appear to be just noise from the machine.

Full script:
comparison_output.xlsx

Looks good to merge.

@AlexandreSinger AlexandreSinger merged commit 433a917 into verilog-to-routing:master Apr 10, 2025
36 checks passed
@AlexandreSinger AlexandreSinger deleted the feature-pack-timing-manager branch April 10, 2025 18:54
@vaughnb-cerebras
Copy link

Thanks! The hard data makes me feel more at ease :).

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
lang-cpp C/C++ code VPR VPR FPGA Placement & Routing Tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants