Invalid results of type 1 transform into (64, 64, 64) grid on A100 GPU #575

pavel-shmakov · 2024-10-15T10:30:12Z

We've encountered an issue where cufinufft.nufft3d1 outputs wildly incorrect results for very specific inputs and only on certain GPUs. This can be reproduced by running the following code on an A100 GPU:

import torch
import cufinufft

points = torch.load("points.pt")
values = torch.load("values.pt")
spectrum = cufinufft.nufft3d1(
    *points,
    values,
    (64, 64, 64),
    isign=-1,
    eps=1e-6,
)
print(torch.linalg.norm(spectrum).item())

Here's an archive with points.pt and values.pt: inputs.zip

The printed norm is many orders of magnitude larger than it should be, and it grows quickly as eps decreases.
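As a rough sanity check on what "should be" means here: by the triangle inequality, every output mode of a type 1 transform is bounded by the sum of the absolute strengths, so the norm over an N1 x N2 x N3 grid cannot exceed sqrt(N1*N2*N3) * sum|c_j| (up to the requested tolerance). The sketch below assumes the usual type 1 definition f_k = sum_j c_j * exp(i * isign * k.x_j); the helper name is hypothetical and not part of cufinufft.

```python
import math

def type1_norm_upper_bound(strengths, n_modes):
    """Upper bound on the 2-norm of an exact type 1 NUFFT output.

    Hypothetical helper, not part of cufinufft. Each mode satisfies
    |f_k| <= sum_j |c_j|, and there are prod(n_modes) modes in total.
    """
    total_strength = sum(abs(c) for c in strengths)
    return math.sqrt(math.prod(n_modes)) * total_strength

# One unit-strength point on a (64, 64, 64) grid: the norm can be at most 512.
print(type1_norm_upper_bound([1 + 0j], (64, 64, 64)))
```

With the tensors from the script above, values.abs().sum().item() would supply the total strength. A printed norm far above this bound points to a wrong result rather than a badly scaled input.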

Notes:

We reproduced this with both cufinufft 2.2.0 and 2.3.0.
Reproduced on A100, but not on A10G. We haven't tried other GPUs.
The "blow-up" happens only for specific grid sizes: 61 through 64 are affected, while 60 and 65 and beyond give normal results. This is for float32 inputs; for float64, we saw a "blow-up" at grid size 32.
We compiled cufinufft from source to investigate further but, surprisingly, couldn't reproduce the bug with our own builds. We tried master and v2.3.X with various compilation options. If you could point us to the compilation options used to build the release version of libcufinufft.so, that would help us investigate further!

pavel-shmakov · 2024-10-15T12:59:25Z

Smaller reproducer with just one point:

import torch
import cufinufft

batch_size = 32
# One unit-strength source per batch entry, all at the single point (0, 0, 0).
v = torch.tensor([[1] for i in range(batch_size)], dtype=torch.complex64, device="cuda")
p = torch.tensor([[0], [0], [0]], dtype=torch.float32, device="cuda")
spectrum = cufinufft.nufft3d1(*p, v, (64, 64, 64), eps=1e-6)

Every spectrum in the batch should equal 1 everywhere, and it does for batch_size < 16. For batch_size >= 16 the output becomes incorrect.
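For reference, the expected output can be checked without a GPU by evaluating the type 1 sum directly. This is only a brute-force sketch in pure Python, not how cufinufft computes the transform. The function names are hypothetical, and the mode ordering (ascending from -(N // 2)) is an assumption; the norm does not depend on the ordering.

```python
import cmath
import math

def mode_range(n):
    """Integer frequencies for n modes: -(n // 2), ..., (n - 1) // 2 (assumed ordering)."""
    return range(-(n // 2), (n + 1) // 2)

def direct_type1_3d(x, y, z, c, n_modes, isign=-1):
    """Brute-force f[k1][k2][k3] = sum_j c[j] * exp(1j*isign*(k1*x[j] + k2*y[j] + k3*z[j]))."""
    n1, n2, n3 = n_modes
    f = [[[0j] * n3 for _ in range(n2)] for _ in range(n1)]
    for xj, yj, zj, cj in zip(x, y, z, c):
        # The 3D phase factorizes into a product of three 1D phases.
        ex = [cmath.exp(1j * isign * k * xj) for k in mode_range(n1)]
        ey = [cmath.exp(1j * isign * k * yj) for k in mode_range(n2)]
        ez = [cmath.exp(1j * isign * k * zj) for k in mode_range(n3)]
        for i1 in range(n1):
            for i2 in range(n2):
                w = cj * ex[i1] * ey[i2]
                row = f[i1][i2]
                for i3 in range(n3):
                    row[i3] += w * ez[i3]
    return f

def norm(f):
    """2-norm over all modes of a nested-list spectrum."""
    return math.sqrt(sum(abs(v) ** 2 for plane in f for row in plane for v in row))

# Single unit-strength point at the origin, as in the reproducer above.
ref = direct_type1_3d([0.0], [0.0], [0.0], [1 + 0j], (64, 64, 64))
print(norm(ref))
```

Since every mode equals 1, the norm of one correct spectrum is sqrt(64**3) = 512, and the norm over a whole batch of 32 would be 512 * sqrt(32), roughly 2896. Comparing torch.linalg.norm of each batch entry of spectrum against 512 should show which entries are affected, assuming the batch is the leading dimension of the output.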
