[BUG] JSMA massive gpu memory consumption #187


Closed
Dontoronto opened this issue Jun 10, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@Dontoronto

✨ Short description of the bug [tl;dr]

Today I tried to run JSMA on an ImageNet sample with shape (1, 3, 224, 224). The JSMA code got stuck in the approximation for a bit, and then an error message popped up saying that JSMA needs to allocate 84.41 GiB of GPU memory, while my NVIDIA card only has 6 GB.
Looking into the code, I could see a lot of clones, inits, etc., which cost a lot of memory, plus device transfers and extra computation. I think someone smarter than me could optimize the code to run with lower memory consumption.

💬 Detailed code and results

Traceback (most recent call last):
  File "C:\Users\Domin\anaconda3\envs\NeuronalNetwork\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Domin\anaconda3\envs\NeuronalNetwork\lib\site-packages\torchattacks\attacks\jsma.py", line 116, in saliency_map
    alpha = target_tmp.view(-1, 1, nb_features) + target_tmp.view(
  File "C:\Users\Domin\anaconda3\envs\NeuronalNetwork\lib\site-packages\torch\utils\_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 84.41 GiB. GPU 0 has a total capacity of 6.00 GiB of which 4.21 GiB is free. Of the allocated memory 704.63 MiB is allocated by PyTorch, and 29.37 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

@Dontoronto Dontoronto added the bug Something isn't working label Jun 10, 2024
@rikonaka
Contributor

rikonaka commented Jun 11, 2024

Hi @Dontoronto, have you tested other attacks such as PGD or CW on your NVIDIA device? Given that you're trying to attack ImageNet on a device with only 6 GB, I'm not quite sure whether your GPU is just too small or whether it's a problem with the code.

@Dontoronto
Author

Yes, I tested it. I'm currently running DeepFool and PGD attacks without any problems. The problem occurs at this line:

File "C:\Users\Domin\anaconda3\envs\NeuronalNetwork\lib\site-packages\torchattacks\attacks\jsma.py", line 116, in saliency_map
    alpha = target_tmp.view(-1, 1, nb_features) + target_tmp.view(

The variable nb_features is 150528 because of the flattened ImageNet sample. I don't know whether this is really a bug or whether my setup is just too weak.
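
For reference, a back-of-envelope sketch of where the 84.41 GiB comes from, assuming the truncated second operand is reshaped to (-1, nb_features, 1) so the addition broadcasts to a pairwise matrix (that reshape is my assumption, not taken from the source):

# Hypothetical estimate of the memory needed by the broadcasted addition in
# saliency_map for one flattened ImageNet sample, assuming float32 elements.
nb_features = 3 * 224 * 224                  # 150528
# (1, 1, nb_features) + (1, nb_features, 1) broadcasts to (1, nb_features, nb_features)
alpha_bytes = nb_features * nb_features * 4
print(alpha_bytes / 2**30)                   # ~84.41 GiB, matching the traceback above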

@rikonaka
Contributor


Roger that, I'm going to do some testing and debugging to try to find the problem and fix it! 😘

@Dontoronto
Author

I would like to give more information, but my computer is currently generating a DeepFool dataset. Thank you very much! 😃

@rikonaka
Contributor

rikonaka commented Jun 12, 2024


It seems that I have found the cause of the problem: the dimension of the input tensor in the Jacobian matrix calculation is far too large.

import torch

def compute_jacobian(model, x):
    def model_forward(input):
        return model(input)
    # For x of shape (B, 3, 224, 224) and a C-class model, the result has shape
    # (B, C, B, 3, 224, 224), so its size grows roughly with B^2.
    jacobian = torch.autograd.functional.jacobian(model_forward, x)
    return jacobian

In the above code, even if I input just 3 images (from ImageNet), the GPU memory usage reaches 11 GB; with 5 images it reaches 16 GB, and with 6 images 36 GB.

[Screenshot: GPU memory usage with 5 images]

So even if batch_size is set to 10, it still requires close to 80+ GB of GPU memory on the ImageNet dataset.
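
As a rough sanity check (my own arithmetic, not taken from the library), the Jacobian tensor alone already grows quadratically with the batch size; the measured numbers above are higher still because of intermediate buffers, but the trend matches:

# Hypothetical size of the tensor returned by torch.autograd.functional.jacobian
# for a batch of ImageNet images and a 1000-class model, assuming float32.
def jacobian_gib(batch_size, num_classes=1000, features=3 * 224 * 224):
    # Result shape: (batch, num_classes, batch, 3, 224, 224)
    elements = batch_size * num_classes * batch_size * features
    return elements * 4 / 2**30

for b in (3, 5, 6, 10):
    print(b, round(jacobian_gib(b), 1))      # 3 -> ~5.0, 5 -> ~14.0, 6 -> ~20.2, 10 -> ~56.1 GiB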

I'll try to improve the algorithm and try to make it work on ImageNet!

@Dontoronto
Author

You are awesome! I really appreciate your effort :)

@rikonaka
Contributor

rikonaka commented Jun 23, 2024

Hi @Dontoronto, on a sad note: I've been trying to reduce memory consumption on ImageNet for a while now and have rewritten the whole JSMA attack code (8c065ec), but I've found that this seems to be an unattainable goal.

Here are my reasons why.

First, according to the original JSMA attack paper, Algorithm 2 and Algorithm 3:

[Figure: Algorithm 2 from the JSMA paper]

The JSMA attack will traverse all (p1, p2) pairs in the search domain, and on ImageNet that domain has 3 * 224 * 224 = 150,528 features, so there are (3 * 224 * 224)^2 combinations of p1 and p2 to look up. On a very small dataset this lookup is feasible, but on ImageNet the lookup matrix becomes unbelievably huge, leading to enormous GPU memory consumption.
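
For scale, a quick count of those candidate pairs on ImageNet (my own arithmetic, ignoring the p1 != p2 constraint and symmetry):

# Rough number of (p1, p2) feature pairs for a flattened ImageNet input.
n = 3 * 224 * 224
print(n, n ** 2)    # 150528 features, ~2.27e10 (p1, p2) combinations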

[Figure: Algorithm 3 from the JSMA paper]

Second, when computing the SM (saliency map), we need to run the addition once for each pair of elements in the matrix, an operation whose memory cost is O(n^2). You read that right, it really is O(n^2); I think that's an inherent disadvantage of JSMA. 10 ImageNet images go from shape (10, 150528) to (10, 150528, 150528), and backpropagating through such a large tensor is extremely memory intensive (see the sketch after the equation figures below).

[Figure: Equation 9 from the JSMA paper]

[Figure: Equation 10 from the JSMA paper]
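
To make the O(n^2) growth concrete, here is a small illustrative sketch of the broadcasted pairwise sum behind the saliency map (my own example; the names and sizes are hypothetical, only the (batch, n, n) broadcast pattern is taken from the traceback above):

import torch

# Adding a (B, 1, n) view to a (B, n, 1) view broadcasts to a (B, n, n) tensor,
# i.e. O(n^2) memory per image. With a small n this is still cheap:
B, n = 10, 28 * 28                      # e.g. flattened MNIST-sized inputs
target_tmp = torch.randn(B, n)
alpha = target_tmp.view(B, 1, n) + target_tmp.view(B, n, 1)
print(alpha.shape)                      # torch.Size([10, 784, 784])
# With n = 3 * 224 * 224 = 150528 (ImageNet), the same float32 tensor would hold
# 10 * 150528**2 elements, on the order of 840 GiB.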

In the end, this is actually not a bug. If you are planning to run JSMA attacks on ImageNet: my own experimental equipment is limited, and I tried to run it on a server with 150 GB of RAM but couldn't get these attacks to succeed, so you could try a server with more than 200 GB of RAM 😂. If you're successful, remember to get back to me on how much RAM you ended up using on the server!

@Dontoronto
Author

@rikonaka Sorry for causing you so much work. Everything you mentioned sounds plausible. I just stumbled over this while generating samples for my thesis. Unfortunately I only have a 6 GB GPU and can't use JSMA for the ImageNet case, so I'll try to use OnePixel to get L0 attacks.
Thank you very much! Do I need to close this issue, or will you close it? I don't know if you still have something in mind regarding this :)

@rikonaka
Contributor

You can close this issue; if I have an update I'll comment below! 👍
