Why does vLLM store the fp8 KV cache as uint8_t rather than torch.float8_e4m3fn or torch.float8_e5m2? #10911

Unanswered
cyLi-Tiger asked this question in Q&A

Replies: 0 comments
