Kernel cache #336

vosen · 2025-02-20T01:21:08Z

Expected outcome:
After a PTX module has been compiled once, it is cached on disk in an SQLite database. Before a kernel is compiled, we look in the cache to check whether it has been compiled previously
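
A minimal sketch of that lookup-before-compile flow, in Python for brevity (ZLUDA itself is Rust). The table name `kernels`, the columns `cache_key` and `code`, and the helpers `open_cache` and `get_or_compile` are all hypothetical names chosen for illustration, and `compile_fn` stands in for the real compiler call:

```python
import sqlite3
from typing import Callable


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the cache database. Table and column names are hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS kernels ("
        "cache_key BLOB PRIMARY KEY, code BLOB NOT NULL)"
    )
    return db


def get_or_compile(
    db: sqlite3.Connection,
    key: bytes,
    ptx: str,
    compile_fn: Callable[[str], bytes],
) -> bytes:
    """Return the cached binary for `key`, compiling and storing it on a miss."""
    row = db.execute(
        "SELECT code FROM kernels WHERE cache_key = ?", (key,)
    ).fetchone()
    if row is not None:
        return row[0]  # cache hit: skip compilation entirely
    code = compile_fn(ptx)  # cache miss: compile once
    with db:  # commit the insert as a single transaction
        db.execute(
            "INSERT OR REPLACE INTO kernels (cache_key, code) VALUES (?, ?)",
            (key, code),
        )
    return code
```

Calling `get_or_compile` twice with the same key should invoke `compile_fn` only on the first call; pass a file path to `open_cache` to persist the cache across runs.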

Comments:

The key should be a hash (BLAKE3) of module text + compiler version + ZLUDA version + device (gfxXXXX) + flags (debug/release, windows/linux, compiler switches)
We will eventually want to support a mechanism similar to CUDA_CACHE_MAXSIZE. This does not have to be implemented yet, but the db should at least contain the information necessary for evicting cache entries: time of last use for each entry and total size of all the kernels
You can get the compiler version from comgr by running the preprocessor on a file containing a preprocessor directive that resolves to the full clang version. It contains the version and the ROCm LLVM hash
The ZLUDA version should be the current git commit hash. There are several crates for that; last time I used vergen
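
To make the key and eviction comments above concrete, here is a rough sketch in Python (not the actual ZLUDA code, which is Rust). Assumptions: Python's standard library has no BLAKE3, so BLAKE2b is used as a stand-in hash; the schema, the column names `size` and `last_used`, and the helpers `cache_key`, `store` and `evict` are hypothetical; timestamps are passed in explicitly so the example is deterministic.

```python
import hashlib
import sqlite3
from typing import Sequence

# Hypothetical schema: per-entry size and last-use time are the minimum
# needed to implement a CUDA_CACHE_MAXSIZE-style limit later.
SCHEMA = """
CREATE TABLE IF NOT EXISTS kernels (
    cache_key BLOB PRIMARY KEY,
    code      BLOB NOT NULL,
    size      INTEGER NOT NULL,  -- bytes; total cache size is SUM(size)
    last_used INTEGER NOT NULL   -- Unix time of last use
)
"""


def cache_key(ptx: str, compiler_version: str, zluda_version: str,
              device: str, flags: Sequence[str]) -> bytes:
    """Hash every input that can change the compiled output."""
    # Stand-in hash: BLAKE2b here, BLAKE3 in the proposal above.
    h = hashlib.blake2b(digest_size=32)
    for field in (ptx, compiler_version, zluda_version, device, *flags):
        data = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") differ.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.digest()


def store(db: sqlite3.Connection, key: bytes, code: bytes, now: int) -> None:
    """Insert or overwrite an entry, recording its size and time of use."""
    with db:
        db.execute(
            "INSERT OR REPLACE INTO kernels VALUES (?, ?, ?, ?)",
            (key, code, len(code), now),
        )


def evict(db: sqlite3.Connection, max_size: int) -> int:
    """Delete least recently used entries until the total fits max_size."""
    total = db.execute(
        "SELECT COALESCE(SUM(size), 0) FROM kernels"
    ).fetchone()[0]
    oldest_first = db.execute(
        "SELECT cache_key, size FROM kernels ORDER BY last_used ASC"
    ).fetchall()
    evicted = 0
    with db:
        for key, size in oldest_first:
            if total <= max_size:
                break
            db.execute("DELETE FROM kernels WHERE cache_key = ?", (key,))
            total -= size
            evicted += 1
    return evicted
```

Flag order is deliberately preserved in the hash, on the assumption that the order of compiler switches can matter; a lookup hit would also need to update `last_used`, which is omitted here.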

vosen added the help wanted and planned labels Feb 20, 2025

vosen added this to the llm.c (no FlashAttention) milestone Feb 20, 2025
