top_k_top_p_sampling_from_probs occasionally outputs invalid token_ids
#384
Comments
Even in the dummy input case, the error is hard to reproduce...
what's the output of
@yzh119 @hnyls2002 I got a reproducible example:
import torch
from flashinfer.sampling import top_k_top_p_sampling_from_probs
probs = torch.zeros([1, 32000]).cuda()
probs[0][1] = 1
max_top_k_round, batch_size = 32, probs.shape[0]
top_ks = torch.Tensor([1073741824]).to(torch.int32).cuda()
top_ps = torch.Tensor([1.]).cuda()
for i in range(10):
    uniform_samples = torch.rand((max_top_k_round, batch_size), device=probs.device)
    batch_next_token_ids, _ = top_k_top_p_sampling_from_probs(probs, uniform_samples, top_ks, top_ps)
    print(batch_next_token_ids)
output:
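For anyone trying to confirm a report like this, a small validity check on the sampled ids can help. The sketch below is not part of flashinfer: `find_invalid_samples` is a hypothetical helper name, and it assumes plain nested lists (for tensors, something like `probs.tolist()` and `batch_next_token_ids.tolist()` would be passed in). It flags any sampled id that falls outside the vocabulary or that has zero probability in its row.

```python
def find_invalid_samples(probs, token_ids):
    """Return (row, token_id) pairs that are out of range or have zero probability.

    probs:     list of rows, each a list of per-token probabilities
    token_ids: one sampled token id per row
    (Hypothetical helper for illustration, not a flashinfer API.)
    """
    bad = []
    for row, tok in enumerate(token_ids):
        vocab_size = len(probs[row])
        # The range check runs first, so an out-of-range id never indexes probs.
        if not (0 <= tok < vocab_size) or probs[row][tok] <= 0:
            bad.append((row, tok))
    return bad


# Tiny stand-in for the repro above: all probability mass sits on token 1.
probs = [[0.0, 1.0, 0.0, 0.0]]
valid_case = find_invalid_samples(probs, [1])
zero_prob_case = find_invalid_samples(probs, [3])
out_of_range_case = find_invalid_samples(probs, [7])
print(valid_case, zero_prob_case, out_of_range_case)
```

With the repro's one-hot distribution, any sampled id other than 1 would be reported by such a check.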
Thank you, I'll fix it soon.
@ispobock Unfortunately, I cannot reproduce your error locally. I always got
Could you find a fixed random seed (I tried a lot of them) that can reproduce this error?
btw, your example is kind of weird because we expect
@yzh119 The
The issue cannot be reproduced when I try it in a cuda 12.2 environment with
Okay, that might be the reason. For cu118 we use
Their semantics might be different; let me double-check.
I think I have figured out the reason, unlike
BlockAdjacentDifference<bool, BLOCK_THREADS>(temp_storage->block_prim.adj_diff)
    .FlagHeads<VEC_SIZE>(greater_than_u, greater_than_u, BoolDiffOp());
would result in undefined behavior, we should use different variables:
BlockAdjacentDifference<bool, BLOCK_THREADS>(temp_storage->block_prim.adj_diff)
    .FlagHeads<VEC_SIZE>(greater_than_u_out, greater_than_u, BoolDiffOp());
After this change, I don't observe weird output anymore.
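To illustrate why sharing one buffer for input and output can go wrong in an adjacent-difference step, here is a minimal pure-Python sketch. It is only a sequential model, not CUB's actual implementation: `flag_changes` is a hypothetical stand-in that writes `out[i] = (inp[i-1] != inp[i])` from left to right, and the element before index 0 is assumed to be `False`. When `out` and `inp` are the same list, later positions read values that were already overwritten.

```python
def flag_changes(out, inp):
    """Write out[i] = (inp[i-1] != inp[i]), scanning left to right.

    The element before index 0 is treated as False (an assumption of this sketch).
    """
    for i in range(len(inp)):
        prev = inp[i - 1] if i > 0 else False
        out[i] = prev != inp[i]
    return out


# Model of a "cumulative probability > u" mask: it flips to True exactly once.
greater_than_u = [False, False, True, True, True]

# Separate output buffer: exactly one position (index 2) is flagged.
separate = flag_changes([False] * len(greater_than_u), list(greater_than_u))

# Aliased buffer: index 3 is overwritten with False before index 4 reads it,
# so index 4 is flagged as well.
aliased_buf = list(greater_than_u)
aliased = flag_changes(aliased_buf, aliased_buf)

print(separate)
print(aliased)
```

In this model the aliased call marks two positions instead of one, which is the kind of spurious flag that could lead a sampler to pick a wrong index. Writing into a distinct output variable, as in the fix above, avoids the read-after-write hazard.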
I got this error on cuda 12.4
As observed in #384, we should use different variables for input and output for the `FlagHeads` API in cu118.
@ispobock The cu118 issue should have been fixed in the main branch. @hnyls2002 can you find a reproducible script?
@hnyls2002 we switched to a deterministic implementation (#417); maybe it can address the issue.
Lots of potential bugfix PRs have been merged recently. I'll close this for now; feel free to reopen it if you still observe such issues.
In some dummy input_ids, there are some unexpected sampled results