Hydra is a collection of benchmarks and tools to test the ability of different techniques to predict the hottest spot in programs. Each benchmark consists of a single compilable C file that runs with one or more different inputs. We provide execution counts for all the edges of each program, a table that we call "the Ground Truth", plus scripts to extract and display these counts.
You can regenerate the ground truth (in JSON) by running the script nisse_all.sh. The following dependencies are required:
- Clang 17 or newer
- A build of the Nisse profiler
Next, adjust the following parameters to suit your environment:
- In the config.sh configuration file:
  - LLVM_INSTALL_DIR (line 3): must point to your LLVM installation directory
  - NISSE_SOURCE_DIR (line 4): must point to your NISSE source directory
  - NISSE_BUILD_DIR (line 5): must point to your NISSE build directory
- In the nisse_all.sh script:
  - BASE_DIR (line 3): must point to your hydra (this repository) source directory
With these settings in place, running the script nisse_all.sh should generate a file named jotaiMerlinResults2.json in the folder JSON Files. You can compare it with jotaiMerlinResults.json using diff.
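A textual diff also reports differences in key order or whitespace. As a complement, the following minimal sketch compares the two files as parsed JSON values. It is not part of the repository: the helper names `load_json` and `same_json_content` are hypothetical, and the paths assume the script is run from the directory that contains the JSON Files folder.

```python
import json
from pathlib import Path


def load_json(path):
    """Parse a JSON file and return the resulting Python object."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def same_json_content(path_a, path_b):
    """Return True when both files hold equal JSON values.

    Key order and whitespace are ignored, unlike a plain textual diff.
    """
    return load_json(path_a) == load_json(path_b)


if __name__ == "__main__":
    # Assumed location: the "JSON Files" folder relative to the current directory.
    folder = Path("JSON Files")
    reference = folder / "jotaiMerlinResults.json"
    regenerated = folder / "jotaiMerlinResults2.json"
    if reference.exists() and regenerated.exists():
        print("identical" if same_json_content(reference, regenerated) else "different")
    else:
        print("expected files not found; adjust the paths above")
```

The same helper can be pointed at any other pair of reference and regenerated JSON files.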
Three heuristics (two of them trivial) are implemented to guess the hottest block:
- Random block: a random block from the program is considered the hottest block
- Most nested block: a loop header picked at random among the most deeply nested loops of the program is considered the hottest block
- LLVM-Predictor: the LLVM analyses LoopInfo and BranchProbabilityInfo are used to predict the frequency of each basic block, assuming the entry block executes exactly once. The block with the highest estimated frequency is considered the hottest block
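The selection step of each heuristic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the repository's implementation (which works on LLVM IR): the function names, the `header_depth` mapping from loop header to nesting depth, and the `estimated_freq` mapping from block to estimated frequency are all hypothetical inputs that a real tool would derive from LoopInfo and BranchProbabilityInfo.

```python
import random


def random_block(blocks, rng):
    """Random block heuristic: any block may be reported as the hottest."""
    return rng.choice(sorted(blocks))


def most_nested_block(header_depth, rng):
    """Most nested block heuristic: pick a header among the deepest loops.

    header_depth maps each loop header to its nesting depth (1 = outermost).
    Programs without loops are not handled in this sketch (max() would raise).
    """
    deepest = max(header_depth.values())
    candidates = sorted(h for h, d in header_depth.items() if d == deepest)
    return rng.choice(candidates)


def highest_estimated_block(estimated_freq):
    """Predictor-style choice: the block with the largest estimated frequency.

    Ties are broken by taking the smallest block name.
    """
    return max(sorted(estimated_freq), key=estimated_freq.get)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    blocks = ["entry", "outer", "inner", "exit"]
    header_depth = {"outer": 1, "inner": 2}
    estimated_freq = {"entry": 1.0, "outer": 10.0, "inner": 100.0, "exit": 1.0}
    print(random_block(blocks, rng))
    print(most_nested_block(header_depth, rng))
    print(highest_estimated_block(estimated_freq))
```

In this toy example the last two heuristics agree on the block named inner, since it is both the only header at the deepest level and the block with the highest assumed frequency.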
To run them, you need:
- CMake version 3.20 or newer
- Clang version 17 or newer
Also, there are some parameters to adjust:
- In the build.sh script:
  - LLVM_INSTALL_DIR (line 3): must point to your LLVM installation directory
- In the run.sh script:
  - LLVM_INSTALL_DIR (line 3): must point to your LLVM installation directory
  - BASE_DIR (line 4): must point to your hydra (this repository) source directory
With these settings in place, run the scripts build.sh and run.sh, in this order. They should generate the files jotaiRandomBlock2.json, jotaiNestedBlock2.json and jotaiPredictorBlock2.json in the folder JSON Files. These can also be compared with their respective original files using diff.
With the jotaiMerlinResults.json, jotaiRandomBlock.json, jotaiNestedBlock.json and jotaiPredictorBlock.json files, you can generate a CSV file containing the detailed results by executing the genCsv.py Python script.
The script print_jotai_json.py receives the path to a JSON file as a parameter and pretty-prints that file. For more options, run python3 print_jotai_json.py --help.
The output format is as follows:
Each file in the benchmark begins with its name, followed by a line with the number of executions of this file.
Then, for each execution, the following structure appears:
- A line indicating the number of edges, denoted as N
- N subsequent lines, each containing information about an edge. The format of each line is: u -> v : count
- Here, u and v represent the origin and destination blocks of the edge, respectively
- count represents the number of times this edge is traversed during the execution.
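To make the per-execution layout concrete, here is a small sketch that renders a list of edges in the shape described above. It is not taken from print_jotai_json.py: the function name `format_execution` and the in-memory representation (a list of `(u, v, count)` tuples) are assumptions, and the exact spacing around `->` and `:` in the real output may differ.

```python
def format_execution(edges):
    """Render one execution: the edge count N, then one line per edge."""
    lines = [str(len(edges))]
    for origin, destination, count in edges:
        # One edge per line: origin block, destination block, traversal count.
        lines.append(f"{origin} -> {destination} : {count}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up edges, for illustration only.
    print(format_execution([(0, 1, 5), (1, 2, 5)]))
```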
There are also two other Python scripts that are similar to print_jotai_json.py:
- The script get_block_frequencies.py takes the JSON input and computes the block frequencies from the edge frequencies. The output is very similar to that of print_jotai_json.py, but for each block the output is only u : count. Critical-edge blocks are omitted from the output.
- The script get_hottest_block.py not only computes the frequencies, but also determines the hottest blocks among all blocks in one execution. The output for one execution is a line indicating the number N of hot blocks, followed by N lines, each containing the ID of a hot block in that execution.
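The idea behind both scripts can be sketched as follows. This is an illustrative reconstruction, not the code of get_block_frequencies.py or get_hottest_block.py: the function names and the `(u, v, count)` tuple representation are hypothetical, and the sketch assumes that a block's frequency is the larger of its total incoming and total outgoing edge counts (so that entry blocks, which have no incoming edge, and exit blocks, which have no outgoing edge, are still counted). The omission of critical-edge blocks is not modeled.

```python
from collections import defaultdict


def block_frequencies(edges):
    """Derive per-block execution counts from (u, v, count) edge triples."""
    incoming = defaultdict(int)
    outgoing = defaultdict(int)
    for origin, destination, count in edges:
        outgoing[origin] += count
        incoming[destination] += count
    blocks = set(incoming) | set(outgoing)
    # Assumption: take the larger side so entry and exit blocks are covered.
    return {block: max(incoming[block], outgoing[block]) for block in blocks}


def hottest_blocks(frequencies):
    """Return every block whose frequency equals the maximum, sorted by ID."""
    if not frequencies:
        return []
    top = max(frequencies.values())
    return sorted(block for block, freq in frequencies.items() if freq == top)


if __name__ == "__main__":
    # Made-up execution: block 0 enters a loop between blocks 1 and 2, then exits to 3.
    edges = [(0, 1, 1), (1, 2, 9), (2, 1, 9), (1, 3, 1)]
    freqs = block_frequencies(edges)
    hot = hottest_blocks(freqs)
    print(len(hot))  # number N of hot blocks
    for block in hot:
        print(block)  # one hot block ID per line
```

Returning a list rather than a single block matches the description above, where one execution may report several equally hot blocks.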