
Add MPS and XPU devices #125

Closed
wants to merge 2 commits

Conversation

ElliottKasoar
Contributor

Adds device options for MPS (Apple GPUs) and XPU (Intel GPUs), similar to the existing addition of GPU support via CUDA.

In theory there are quite a few additional devices we could add (full list here / here), but these two are of most interest, based on discussions with @jatkinson1000.

I haven't been able to test the XPU device, but basic tests with MPS seem to suggest it's working as expected:

In example 2, resnet_infer_fortran, setting:

model = torch_module_load(args(1), device_type=torch_kMPS)

without changing the input tensor device throws an error:

RuntimeError: slow_conv2d_forward_mps: input(device='cpu') and weight(device='mps:0') must be on the same device

Similarly, setting the input tensor's device but not the model's

in_tensor(1) = torch_tensor_from_array(in_data, in_layout, torch_kMPS)

throws an error:

RuntimeError: Input type (MPSFloatType) and weight type (CPUFloatType) should be the same

Setting both works and the expected output is produced:

Samoyed (id=         259 ), : probability =  0.884624064
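For reference, the working configuration combines the two calls quoted above, with both the model and the input tensor placed on MPS (a sketch only; it assumes the surrounding declarations from example 2's resnet_infer_fortran, such as `model`, `args`, `in_tensor`, `in_data`, and `in_layout`):

```fortran
! Load both the model and the input tensor on the MPS device.
! Assumes the declarations and setup from example 2 (resnet_infer_fortran).
model = torch_module_load(args(1), device_type=torch_kMPS)
in_tensor(1) = torch_tensor_from_array(in_data, in_layout, torch_kMPS)
```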

I also see spikes in activity on my GPU (for the largest spikes, I added a loop around the example inference):

[Image: screenshot of GPU activity showing spikes during inference]

Note, when running 10,000 iterations of the inference, I got an error:

RuntimeError: MPS backend out of memory (MPS allocated: 45.89 GB, other allocations: 9.72 MB, max allowed: 45.90 GB). Tried to allocate 784.00 KB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

which might suggest a problem with cleanup.

I don't think this is specific to MPS, so might be worth checking on GPU too (you can reduce the CUDA memory to debug more easily, if it helps).
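As the error message itself suggests, the MPS allocator's upper limit can be lifted via an environment variable while debugging the suspected cleanup problem (a sketch; per the PyTorch warning, disabling the limit may cause system failure, and the example binary name below follows example 2):

```shell
# Disable the MPS allocator's high-watermark memory cap (PyTorch warns this
# may cause system failure); useful only for debugging the suspected leak.
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0
# Then run the example as usual, e.g.:
# ./resnet_infer_fortran saved_model.pt
```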

ElliottKasoar added the enhancement (New feature or request) label on May 6, 2024
@jatkinson1000
Member

Potentially closes #127 which is an issue opened in relation to this PR.

@jwallwork23
Contributor

Closing as superseded by #276.

ElliottKasoar deleted the add-devices branch on February 17, 2025
Successfully merging this pull request may close these issues: XPU and MPS support