
XPU and MPS take 3 #276

Open · wants to merge 30 commits into base: main
Conversation

@jwallwork23 (Contributor) commented Feb 6, 2025:

Closes #127.
Builds upon #125 and #209.
(Contains changes from #268 so that will need to be merged first.)

This PR adds support for XPU and MPS. Unfortunately, it ended up requiring an overhaul of the pt2ts scripts, too.

Notable changes:

  • Switching from ENABLE_CUDA to the more general and extensible GPU_DEVICE=<NONE/CUDA/XPU/MPS>.
  • Pre-processor directives for handling different GPU types.
  • Support for XPU under GPU device code 12.
  • Support for MPS under GPU device code 13.
  • Use of argparse, rather than sys.argv, for reading command line arguments in the Python scripts.
  • Updates to docs.
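The preprocessor-directive approach in the bullets above can be sketched as follows. This is a minimal illustration, not FTorch's actual source: the device codes (NONE=0, CUDA=1, XPU=12, MPS=13) match the guard values quoted later in this thread, but in a real build `GPU_DEVICE` would be injected by CMake rather than defaulted in the source.

```c
#include <stdio.h>
#include <string.h>

/* Fallback definitions of the device codes discussed in this PR.
 * In FTorch these would normally come from the build system. */
#ifndef GPU_DEVICE_NONE
#define GPU_DEVICE_NONE 0
#endif
#ifndef GPU_DEVICE_CUDA
#define GPU_DEVICE_CUDA 1
#endif
#ifndef GPU_DEVICE_XPU
#define GPU_DEVICE_XPU 12
#endif
#ifndef GPU_DEVICE_MPS
#define GPU_DEVICE_MPS 13
#endif

/* Pretend CMake configured a CUDA build, e.g. via -DGPU_DEVICE=1. */
#ifndef GPU_DEVICE
#define GPU_DEVICE GPU_DEVICE_CUDA
#endif

/* Report which device type this translation unit was compiled for. */
const char *configured_device(void) {
#if GPU_DEVICE == GPU_DEVICE_CUDA
  return "CUDA";
#elif GPU_DEVICE == GPU_DEVICE_XPU
  return "XPU";
#elif GPU_DEVICE == GPU_DEVICE_MPS
  return "MPS";
#else
  return "NONE";
#endif
}

int main(void) {
  printf("Configured for %s\n", configured_device());
  return 0;
}
```

Because the selection happens at compile time, only the code path for the configured device is built, which is why a single `GPU_DEVICE=<NONE/CUDA/XPU/MPS>` option can replace the old boolean `ENABLE_CUDA`.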
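The argparse change can be sketched as follows. This is a hypothetical pt2ts-style interface for illustration only; the actual flag names and defaults in FTorch's scripts may differ.

```python
import argparse


def parse_args(argv=None):
    """Parse command line arguments with argparse, rather than indexing sys.argv."""
    parser = argparse.ArgumentParser(
        description="Convert a PyTorch model to TorchScript."
    )
    # Hypothetical flag name; the real script's options may be named differently.
    parser.add_argument(
        "--device_type",
        choices=["cpu", "cuda", "xpu", "mps"],
        default="cpu",
        help="Device type to target when saving the TorchScript model",
    )
    return parser.parse_args(argv)


args = parse_args(["--device_type", "xpu"])
print(args.device_type)
```

Compared with positional `sys.argv` indexing, this gives free `--help` output, validation of the allowed device types, and sensible defaults.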

Checklist

  • Test on 2 Nvidia GPUs
  • Test on 2 XPUs
  • Test on 1 MPS device

@jwallwork23 added the enhancement (New feature or request) and gpu (Related to building and running on GPU) labels on Feb 6, 2025
@jwallwork23 self-assigned this on Feb 6, 2025
```shell
. ftorch_venv/bin/activate  # Uses .clang-tidy config file if present
fortitude check src/
. ftorch_venv/bin/activate
fortitude check --ignore=E001,T041 src/ftorch.F90
```
@jwallwork23 (Contributor Author) commented:

To get the Fortran linting check to pass, I had to ignore errors of the form reported in https://github.com/Cambridge-ICCS/FTorch/actions/runs/13240946802/job/36956002877, as well as ones of the form

```
src/ftorch.F90:153:57: T041 'tensor_shape' has assumed size
    |
151 |       type(c_ptr), value, intent(in)    :: data
152 |       integer(c_int), value, intent(in) :: ndims
153 |       integer(c_int64_t), intent(in)    :: tensor_shape(*)
    |                                                         ^ T041
154 |       integer(c_int64_t), intent(in)    :: strides(*)
155 |       integer(c_int), value, intent(in) :: dtype
```
because we established that the cases where we use assumed size in ftorch.F90 are required.

@jwallwork23 marked this pull request as ready for review on February 10, 2025 at 16:21
@jwallwork23
Contributor Author

Offline testing for CUDA version of MultiGPU example with 2 devices passed on Ampere. In the queue for XPU testing on PVC.

@jwallwork23 requested a review from ma595 on February 10, 2025 at 16:22
This was referenced Feb 11, 2025
@jatkinson1000 (Member) left a comment

Thanks @jwallwork23 I have only done a quick pass of this so far and will need to schedule time for a closer look, but I suspect you know my first comment - can you update the docs at utils/README etc. to reflect the new args and usage of pt2ts?

I would also like to see it documented somewhere how the device enums are managed from CMake. Perhaps under the developer docs. As, whilst a very nifty solution, it's slightly abstract if you are not the one who came up with it 😉

@jatkinson1000
Member

Also looks like you may want a rebase after #268

@jwallwork23
Contributor Author

> Thanks @jwallwork23 I have only done a quick pass of this so far and will need to schedule time for a closer look, but I suspect you know my first comment - can you update the docs at utils/README etc. to reflect the new args and usage of pt2ts?

Oh, I'd forgotten about that README actually. Addressed in 50bd953.

> I would also like to see it documented somewhere how the device enums are managed from CMake. Perhaps under the developer docs. As, whilst a very nifty solution, it's slightly abstract if you are not the one who came up with it 😉

Ah yes... done in 838f3ae.

> Also looks like you may want a rebase after #268

Done!

@jwallwork23
Contributor Author

jwallwork23 commented Feb 12, 2025

I don't understand why the clang-tidy errors didn't crop up previously...?

For one of them, I can fix with:

```diff
+#include <stdbool.h>
```

For the device-related one, I could fix with:

```diff
+// NOTE: These need to be defined here for clang-tidy to pass
+#ifndef GPU_DEVICE_NONE
+#define GPU_DEVICE_NONE 0
+#endif
+#ifndef GPU_DEVICE_CUDA
+#define GPU_DEVICE_CUDA 1
+#endif
+#ifndef GPU_DEVICE_XPU
+#define GPU_DEVICE_XPU 12
+#endif
+#ifndef GPU_DEVICE_MPS
+#define GPU_DEVICE_MPS 13
+#endif
```

(although I would prefer to avoid duplicating these codes)

But the other torch/script one - is clang-tidy meant to be compiler-aware or something?

@jatkinson1000
Member

jatkinson1000 commented Feb 13, 2025

I have just run this on a Mac using MPS and made the relevant tweaks to get it to work.
@jwallwork23 if you can check the diff in c863cc7 to see if you are happy that would be great. Hopefully everything should still work OK on XPU/CUDA though I had to adjust how num_devices was set.

I'll look to schedule time to review the rest of this.

@jwallwork23 (Contributor Author) left a comment

A few very minor suggestions but otherwise those additions look good - thanks for testing on MPS @jatkinson1000. Still waiting for my XPU job to run.

@jwallwork23
Contributor Author

Good news - test passed on 2 XPU devices on Dawn! 🥳

@jwallwork23
Contributor Author

Re-tested on Dawn with latest version of branch - all good

```diff
@@ -165,7 +165,7 @@ To build and install the library:
 | [`CMAKE_INSTALL_PREFIX`](https://cmake.org/cmake/help/latest/variable/CMAKE_INSTALL_PREFIX.html) | `</path/to/install/lib/at/>` | Location at which the library files should be installed. By default this is `/usr/local` |
 | [`CMAKE_BUILD_TYPE`](https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html) | `Release` / `Debug` | Specifies build type. The default is `Debug`, use `Release` for production code|
 | `CMAKE_BUILD_TESTS` | `TRUE` / `FALSE` | Specifies whether to compile FTorch's [test suite](https://cambridge-iccs.github.io/FTorch/page/testing.html) as part of the build. |
-| `ENABLE_CUDA` | `TRUE` / `FALSE` | Specifies whether to check for and enable CUDA<sup>3</sup> |
+| `GPU_DEVICE` | `NONE` / `CUDA` / `XPU` / `MPS` | Specifies the target GPU architecture (if any) <sup>3</sup> |
```
A Contributor commented on this diff:
It's probably worth at least mentioning Intel/Mac GPUs in the GPU Support section of the README too?

@jwallwork23 (Contributor Author) replied:
Ah, good point! Addressed in 0897b20.

Successfully merging this pull request may close these issues: XPU and MPS support.

4 participants