GitHub - lac-dcc/hydra: A benchmark game for guessing the hottest point of a program

Hydra is a collection of benchmarks and tools for testing how well different techniques predict the hottest spot in a program. Each benchmark consists of a single compilable C file that runs with one or more different inputs. We provide execution counts for all the edges of each program, a table that we call "the Ground Truth", plus scripts to extract and display these counts.

How to produce the ground truth

You can regenerate the ground truth (in JSON) by running the script nisse_all.sh. The following dependencies are required:

Clang 17 or newer
A build of the Nisse profiler

Next, adjust the following parameters to suit your environment:

In the config.sh configuration file:
- LLVM_INSTALL_DIR (line 3): must point to your LLVM installation directory
- NISSE_SOURCE_DIR (line 4): must point to your NISSE source directory
- NISSE_BUILD_DIR (line 5): must point to your NISSE build directory
In the nisse_all.sh script:
- BASE_DIR (line 3): must point to your hydra (this repository) source directory

With these settings in place, running the script nisse_all.sh should generate a file named jotaiMerlinResults2.json in the folder JSON Files. You can compare it with the provided jotaiMerlinResults.json using diff.
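Since diff compares text line by line, it can report differences that come only from key order or whitespace. A minimal sketch of a structural comparison follows, assuming both files are valid JSON; the helper names load_json and same_json are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def load_json(path):
    """Decode one JSON file into Python objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def same_json(path_a, path_b):
    """Return True when both files decode to equal JSON values.

    Key order and whitespace are ignored; list order is not.
    """
    return load_json(path_a) == load_json(path_b)


if __name__ == "__main__":
    folder = Path("JSON Files")
    reference = folder / "jotaiMerlinResults.json"
    regenerated = folder / "jotaiMerlinResults2.json"
    if reference.exists() and regenerated.exists():
        print("identical" if same_json(reference, regenerated) else "different")
    else:
        print("one of the files is missing; run nisse_all.sh first")
```

Run it from the repository root so that the relative path to the JSON Files folder resolves.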

How to get the heuristics results

Three heuristics, two of which are trivial, are implemented to guess the hottest block:

Random block: a random block from the program is considered the hottest block
Most nested block: a loop header chosen at random among the most deeply nested loops of the program is considered the hottest block
LLVM-Predictor: the LLVM analyses LoopInfo and BranchProbabilityInfo are used to predict the frequency of each basic block, assuming the entry block executes exactly once. The block with the highest estimated frequency is considered the hottest block
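The two trivial heuristics can be sketched in a few lines. This is an illustration only, independent of how the repository implements them: the inputs (a collection of block IDs, and a hypothetical mapping from loop header to nesting depth) are assumptions, as is returning None for a program with no loops. The LLVM-Predictor heuristic depends on LLVM analyses and is not reproduced here.

```python
import random


def random_block(blocks, rng):
    """Random block heuristic: pick any block of the program."""
    return rng.choice(sorted(blocks))


def most_nested_block(header_depth, rng):
    """Most nested block heuristic.

    header_depth maps each loop header to its nesting depth
    (hypothetical input format). Ties among the deepest headers
    are broken at random. Returns None when there are no loops,
    which is an assumption of this sketch.
    """
    if not header_depth:
        return None
    deepest = max(header_depth.values())
    candidates = sorted(h for h, d in header_depth.items() if d == deepest)
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(0)
    print(random_block({0, 1, 2, 3}, rng))
    print(most_nested_block({1: 1, 3: 2, 5: 2}, rng))
```

Passing an explicit random.Random instance keeps runs reproducible when a seed is fixed.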

To run them, you need:

CMake version 3.20 or newer
Clang version 17 or newer

Also, there are some parameters to adjust:

In the build.sh script:
- LLVM_INSTALL_DIR (line 3): must point to your LLVM installation directory
In the run.sh script:
- LLVM_INSTALL_DIR (line 3): must point to your LLVM installation directory
- BASE_DIR (line 4): must point to your hydra (this repository) source directory

With these settings in place, run the scripts build.sh and run.sh, in this order. They should generate the files jotaiRandomBlock2.json, jotaiNestedBlock2.json and jotaiPredictorBlock2.json in the folder JSON Files. Each can be compared with its original counterpart using diff.

How to get the CSV table

With the jotaiMerlinResults.json, jotaiRandomBlock.json, jotaiNestedBlock.json and jotaiPredictorBlock.json files, you can generate a CSV file containing the detailed results by executing the genCsv.py Python script.

How to pretty print a JSON file

The script print_jotai_json.py takes the path to a JSON file as a parameter and pretty-prints that file. For more options, run python3 print_jotai_json.py --help.

The output format is as follows:

Each file in the benchmark begins with its name, followed by a line with the number of executions of this file.

Then, for each execution, the following structure appears:

A line indicating the number of edges, denoted as N
N subsequent lines, each containing information about an edge. The format of each line is:
- u -> v : count
- Here, u and v represent the origin and destination blocks of the edge, respectively
- count represents the number of times this edge is traversed during the execution.
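The layout described above can be illustrated with a small formatter. This is a sketch, not the code of print_jotai_json.py: the function name and the in-memory representation (a list of executions, each a list of (u, v, count) tuples) are assumptions, since the JSON schema is not described here.

```python
def format_benchmark(name, executions):
    """Render one benchmark file in the described text layout.

    executions is a list with one entry per execution; each entry
    is a list of (u, v, count) edge tuples (assumed representation).
    """
    lines = [name, str(len(executions))]
    for edges in executions:
        lines.append(str(len(edges)))  # N, the number of edges
        for u, v, count in edges:
            lines.append(f"{u} -> {v} : {count}")
    return "\n".join(lines)


if __name__ == "__main__":
    # One execution with two edges; the file name is made up.
    print(format_benchmark("example.c", [[(0, 1, 1), (1, 1, 9)]]))
```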

There are also two other scripts in Python that are similar to print_jotai_json.py:

The script get_block_frequencies.py takes the JSON input and computes the block frequencies from the edge frequencies. The output is very similar to that of print_jotai_json.py, but each line has the form u : count, one per block. Critical-edge blocks are omitted from the output.
The script get_hottest_block.py not only computes the frequencies, but also determines the hottest blocks among all blocks in one execution. The output for one execution is a line indicating the number N of hot blocks, followed by N lines, each containing the ID of a hot block in that execution.
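The idea behind both scripts can be sketched as follows. The rule used here, taking a block's frequency as the larger of its total incoming and total outgoing edge counts, is an assumption of this sketch and may differ from what the scripts do; it is chosen because an entry block has no incoming edges and an exit block has no outgoing ones. Critical-edge blocks are not filtered, because how they are identified is not described here.

```python
from collections import defaultdict


def block_frequencies(edges):
    """Estimate block frequencies from (u, v, count) edge tuples.

    Assumed rule: frequency = max(incoming total, outgoing total).
    """
    incoming = defaultdict(int)
    outgoing = defaultdict(int)
    for u, v, count in edges:
        outgoing[u] += count
        incoming[v] += count
    blocks = set(incoming) | set(outgoing)
    return {b: max(incoming[b], outgoing[b]) for b in sorted(blocks)}


def hottest_blocks(frequencies):
    """Return every block whose frequency equals the maximum."""
    if not frequencies:
        return []
    top = max(frequencies.values())
    return sorted(b for b, f in frequencies.items() if f == top)


if __name__ == "__main__":
    # Toy execution: block 1 is a loop header, block 2 the loop body.
    edges = [(0, 1, 1), (1, 2, 9), (2, 1, 9), (1, 3, 1)]
    freqs = block_frequencies(edges)
    for block, count in freqs.items():
        print(f"{block} : {count}")
    hot = hottest_blocks(freqs)
    print(len(hot))
    for block in hot:
        print(block)
```

In the toy execution, block 1 is entered once from block 0 and nine times from block 2, so it runs ten times and is the single hottest block.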

