Kernel cache #336

Open
vosen opened this issue Feb 20, 2025 · 0 comments
Labels
help wanted (Extra attention is needed), planned (This is part of the roadmap)

Comments

@vosen
Owner

vosen commented Feb 20, 2025

Expected outcome:
After a PTX module has been compiled once, it is cached on disk in an SQLite database. Before a kernel is compiled, we check the cache to see whether it has been compiled previously.
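
The lookup-before-compile flow could be sketched roughly as follows. This is only an illustration in Python using the standard-library `sqlite3` module, not ZLUDA's implementation; the table `kernels`, its columns, and the names `open_cache`, `get_or_compile`, and `compile_fn` are all hypothetical.

```python
import sqlite3


def open_cache(path):
    """Open (or create) the on-disk cache. Table and column names are hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS kernels ("
        " cache_key TEXT PRIMARY KEY,"
        " code BLOB NOT NULL)"
    )
    return db


def get_or_compile(db, key, compile_fn):
    """Return the cached binary for `key`, compiling and storing it on a miss."""
    row = db.execute(
        "SELECT code FROM kernels WHERE cache_key = ?", (key,)
    ).fetchone()
    if row is not None:
        return bytes(row[0])  # cache hit: compilation is skipped
    code = compile_fn()  # cache miss: run the (expensive) compiler
    with db:  # commit the insert as a single transaction
        db.execute(
            "INSERT INTO kernels (cache_key, code) VALUES (?, ?)", (key, code)
        )
    return code
```

Here `key` is treated as an opaque string, and `compile_fn` is any zero-argument callable that produces the compiled binary as `bytes`.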

Comments:

  • The key should be a hash (BLAKE3) of module text + compiler version + ZLUDA version + device (gfxXXXX) + flags (debug/release, windows/linux, compiler switches)
  • We will eventually want to support a mechanism similar to CUDA_CACHE_MAXSIZE. It does not have to be implemented yet, but the db should at least contain the information necessary for evicting entries: the time of last use for each entry and the total size of all the kernels
  • You can get the compiler version from comgr by running the preprocessor on a file containing a preprocessor macro that expands to the full clang version (for example, clang's predefined `__clang_version__`). The result contains the version and the ROCm LLVM hash
  • The ZLUDA version should be the current git commit hash. There are several crates for this; last time I used vergen
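
The key derivation and the eviction bookkeeping described in the list above might look something like the sketch below. It is written in Python with the standard library only and is not ZLUDA's implementation. Two assumptions are worth calling out: BLAKE2b stands in for BLAKE3 (Python's standard library has no BLAKE3), and every function, table, and column name here is hypothetical.

```python
import hashlib
import sqlite3
import time


def cache_key(module_text, compiler_version, zluda_version, device, flags):
    """Hash every input that can change the compiled output.

    BLAKE2b is a stand-in here; the issue asks for BLAKE3.
    """
    h = hashlib.blake2b(digest_size=32)
    for part in (module_text, compiler_version, zluda_version, device, *flags):
        data = part.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()


def create_schema(db):
    """Create a table that carries the data a future size limit would need."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS kernels ("
        " cache_key TEXT PRIMARY KEY,"
        " code BLOB NOT NULL,"
        " size_bytes INTEGER NOT NULL,"
        " last_used INTEGER NOT NULL)"  # Unix timestamp of last use
    )


def store(db, key, code, now=None):
    """Insert or overwrite an entry, recording its size and time of use."""
    now = int(time.time()) if now is None else now
    with db:
        db.execute(
            "INSERT OR REPLACE INTO kernels VALUES (?, ?, ?, ?)",
            (key, code, len(code), now),
        )


def load(db, key, now=None):
    """Return the cached binary (refreshing last_used), or None on a miss."""
    now = int(time.time()) if now is None else now
    with db:
        cur = db.execute(
            "UPDATE kernels SET last_used = ? WHERE cache_key = ?", (now, key)
        )
        if cur.rowcount == 0:
            return None
        row = db.execute(
            "SELECT code FROM kernels WHERE cache_key = ?", (key,)
        ).fetchone()
    return bytes(row[0])


def total_size(db):
    """Total size in bytes of all cached kernels."""
    return db.execute(
        "SELECT COALESCE(SUM(size_bytes), 0) FROM kernels"
    ).fetchone()[0]
```

With `last_used` and `size_bytes` stored per entry, a later eviction pass could delete rows in ascending `last_used` order until `total_size` drops below the configured limit, without any schema change.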
@vosen added the help wanted and planned labels Feb 20, 2025
@vosen added this to the llm.c (no FlashAttention) milestone Feb 20, 2025