top_k_top_p_sampling_from_probs occasionally outputs invalid token_ids #384

Closed
hnyls2002 opened this issue Jul 20, 2024 · 15 comments

@hnyls2002
Contributor

With some dummy input_ids, I occasionally get unexpected sampled results:

tensor([     29919,        903,      30140, 1061800448,        903]
@hnyls2002
Contributor Author

Even in the dummy input case, the error is hard to reproduce...

@yzh119
Collaborator

yzh119 commented Jul 20, 2024

what's the output of success?

@ispobock
Contributor

ispobock commented Jul 20, 2024

@yzh119 @hnyls2002 I got a reproducible example:

import torch
from flashinfer.sampling import top_k_top_p_sampling_from_probs

probs = torch.zeros([1, 32000]).cuda()
probs[0][1] = 1
max_top_k_round, batch_size = 32, probs.shape[0]
top_ks = torch.Tensor([1073741824]).to(torch.int32).cuda()
top_ps = torch.Tensor([1.]).cuda()
for i in range(10):
    uniform_samples = torch.rand((max_top_k_round, batch_size), device=probs.device)
    batch_next_token_ids, _ = top_k_top_p_sampling_from_probs(probs, uniform_samples, top_ks, top_ps)
    print(batch_next_token_ids)

output:

tensor([0], device='cuda:0', dtype=torch.int32)
tensor([1062357376], device='cuda:0', dtype=torch.int32)
tensor([0], device='cuda:0', dtype=torch.int32)
tensor([1056581825], device='cuda:0', dtype=torch.int32)
tensor([1058078609], device='cuda:0', dtype=torch.int32)
tensor([1052083673], device='cuda:0', dtype=torch.int32)
tensor([1065198239], device='cuda:0', dtype=torch.int32)
tensor([1032416633], device='cuda:0', dtype=torch.int32)
tensor([1062065655], device='cuda:0', dtype=torch.int32)
tensor([1058903188], device='cuda:0', dtype=torch.int32)
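
Since probs puts all mass on token id 1 and top_p is 1.0, every draw should return id 1; the large values above fall far outside the 32000-token vocabulary. A minimal sanity-check sketch, reusing probs and batch_next_token_ids from the loop above:

# All probability mass is on token id 1, so every sample should be exactly 1,
# and anything outside [0, vocab_size) is invalid by construction.
vocab_size = probs.shape[1]  # 32000
assert (batch_next_token_ids >= 0).all() and (batch_next_token_ids < vocab_size).all()
assert (batch_next_token_ids == 1).all()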

@yzh119
Collaborator

yzh119 commented Jul 20, 2024

Thank you, I'll fix it soon.

@yzh119
Collaborator

yzh119 commented Jul 20, 2024

@ispobock Unfortunately I cannot reproduce your error locally; I always got 1 as the sampling output.

Could you find a fixed random seed (I tried a lot of them) that can reproduce this error?

@yzh119
Collaborator

yzh119 commented Jul 20, 2024

By the way, your example is a bit odd: we expect top_ks and top_ps to have shape (batch_size,), but in your example they have shape (1,), which mismatches batch_size (32).
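
To illustrate that convention, a minimal sketch of a shape-consistent batched call, using the same 0.1.0-era call signature as the repro above; the batch size, vocabulary size, and threshold values here are made up for illustration:

import torch
from flashinfer.sampling import top_k_top_p_sampling_from_probs

# Illustrative values only: one top_k / top_p entry per request in the batch.
batch_size, vocab_size, max_top_k_round = 4, 32000, 32
probs = torch.softmax(torch.randn(batch_size, vocab_size, device="cuda"), dim=-1)
uniform_samples = torch.rand((max_top_k_round, batch_size), device="cuda")
top_ks = torch.full((batch_size,), 40, dtype=torch.int32, device="cuda")  # shape (batch_size,)
top_ps = torch.full((batch_size,), 0.95, device="cuda")                   # shape (batch_size,)
next_token_ids, success = top_k_top_p_sampling_from_probs(probs, uniform_samples, top_ks, top_ps)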

@ispobock
Contributor

ispobock commented Jul 20, 2024

@yzh119 The batch_size is assigned by probs.shape[0], and probs.shape[0]=1 in this example.
It may be environment dependent; I can reproduce it with torch==2.3.0+cu118 and flashinfer==0.1.0+cu118torch2.3 on CUDA 11.8.

@ispobock
Contributor

The issue cannot be reproduced when I try it in a CUDA 12.2 environment with torch==2.3.0+cu121 and flashinfer==0.1.0+cu121torch2.3.

@yzh119
Collaborator

yzh119 commented Jul 20, 2024

Okay, that might be the reason: for cu118 we use FlagHeads rather than SubtractLeft because of the cub version (see #265).

Their semantics might be different; let me double check.

@yzh119
Collaborator

yzh119 commented Jul 21, 2024

I think I have figured out the reason: unlike SubtractLeft, the input and output arguments of FlagHeads must be different variables. Passing the same variable for both, as in

    BlockAdjacentDifference<bool, BLOCK_THREADS>(temp_storage->block_prim.adj_diff)
        .FlagHeads<VEC_SIZE>(greater_than_u, greater_than_u, BoolDiffOp());

results in undefined behavior; we should use distinct variables instead:

    BlockAdjacentDifference<bool, BLOCK_THREADS>(temp_storage->block_prim.adj_diff)
        .FlagHeads<VEC_SIZE>(greater_than_u_out, greater_than_u, BoolDiffOp());

After this change, I don't observe weird output anymore.

@hnyls2002
Contributor Author

I got this error on cuda 12.4

yzh119 added a commit that referenced this issue Jul 21, 2024
As observed in #384, we should use different variables for the input and output of the `FlagHeads` API in cu118.
@yzh119
Collaborator

yzh119 commented Jul 21, 2024

@ispobock The cu118 issue should have been fixed in the main branch.

@hnyls2002 can you find a reproducible script?

@ispobock
Contributor

> @ispobock The cu118 issue should have been fixed in the main branch.

@yzh119 It works for me now! Thanks!

@yzh119
Collaborator

yzh119 commented Aug 2, 2024

> I got this error on cuda 12.4

@hnyls2002 We switched to a deterministic implementation (#417); maybe it can address the issue.

@yzh119
Collaborator

yzh119 commented Aug 11, 2024

Lots of potential bugfix PRs have been merged recently. I'll close this for now; feel free to reopen it if you still observe such issues.

yzh119 closed this as completed Aug 11, 2024