Successfully Running **ck_version** and **sycl_version** of Soil Mechanics #715

Open
wants to merge 7 commits into
base: master

ShuangLi-1
Collaborator

Based on your previous guidance, I have re-debugged and corrected the ck_version and sycl_version of the soil mechanics module.

The issue was caused by a division by zero in the constitutive equation calculations, which I have now fixed. Thanks for your guidance.
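The fix itself is not shown in this conversation. As a minimal sketch of one common way to guard such a division, assuming the denominator is a non-negative quantity (for example a norm) that can be exactly zero, a small floor can be applied before dividing. The helper name `guardedDivide` and the choice of floor value are hypothetical and are not taken from the pull request:

```cpp
#include <limits>

using Real = double; // assumption: stands in for the project's Real type

// Hypothetical helper, not part of SPHinXsys.
// Assumes the denominator is non-negative; values below floor_value are
// replaced by floor_value so the division never sees an exact zero.
constexpr Real guardedDivide(Real numerator, Real denominator,
                             Real floor_value = std::numeric_limits<Real>::epsilon())
{
    return numerator / (denominator > floor_value ? denominator : floor_value);
}
```

For a zero numerator and zero denominator this returns zero instead of NaN; whether that is the physically appropriate limit depends on the constitutive model.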

Additionally, I have created two cases:

  • 2d_column_collapse_ck
  • 2d_column_collapse_sycl

On my computer, these cases can run successfully in Release mode, and the GPU is being utilized correctly. Please check them at your convenience.

Supplementary Notes

  1. For matrix transformation purposes, I copied the changes from pull request #703 (inlined vector up and degrade) into my current version.
  2. This is a new account created using my Google email. I will use this account moving forward.

Simulation Results and GPU Utilization

GPU Computation
Figure 1: GPU utilization during computation.

Simulation Results
Figure 2: Simulation results.

@ShuangLi-1
Collaborator Author

@Xiangyu-Hu
Thank you for your prompt response and guidance.
I have updated my code to the latest master branch, made the necessary adjustments, and tested it. The corrected version is consistent with the current master and successfully runs both the CK and SYCL versions of the soil mechanics module.
The pull request has been updated with these changes.

Thank you again for your support during the holiday season, and I wish you a Merry Christmas!

@ShuangLi-1 ShuangLi-1 marked this pull request as ready for review December 27, 2024 12:07
@ShuangLi-1 ShuangLi-1 marked this pull request as draft December 30, 2024 02:55
@Xiangyu-Hu
Owner

Xiangyu-Hu commented Jan 1, 2025

@ShuangLi-1 The Linux (g++) build is quite strict: it must complete without even a single warning. Please try building with g++ locally first.

…rors where the initialization order is different from the definition order
@ShuangLi-1
Collaborator Author

> @ShuangLi-1 Linux (g++) build is quite strict, and requires even no warning message. You can try to build using g++ locally first.

OK, I will try it. Thanks.

@ShuangLi-1
Collaborator Author

> @ShuangLi-1 Linux (g++) build is quite strict, and requires even no warning message. You can try to build using g++ locally first.

I have modified the code to avoid the issue where the initialization order of variables differs from the definition order, as mentioned in the error message in this PR.

Locally, I can successfully compile and run 2d_column_collapse_ck with g++ and the -Werror=reorder flag, as shown in the screenshot.
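For readers unfamiliar with this warning, here is a small sketch of what `-Wreorder` checks. C++ always initializes non-static data members in the order they are declared in the class, regardless of the order written in the constructor's initializer list, and g++ warns when the two orders differ. The class below is hypothetical and is not taken from the pull request:

```cpp
// Hypothetical class for illustration; not part of SPHinXsys.
class ColumnGeometry
{
  public:
    // The initializers are listed in declaration order: width_, then half_width_.
    // Listing them as ": half_width_(0.5 * width_), width_(width)" would trigger
    // -Wreorder, because the compiler ignores the written order and always
    // initializes members in declaration order.
    constexpr explicit ColumnGeometry(double width)
        : width_(width), half_width_(0.5 * width_) {}

    constexpr double halfWidth() const { return half_width_; }

  private:
    double width_;      // declared first, so initialized first
    double half_width_; // declared second, may safely read width_
};
```

Keeping the initializer list in declaration order makes any dependency between members visible and lets the code build cleanly with `-Werror=reorder`.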

Perhaps further testing can be performed.

Figure: The CMake preset and the results.

@ShuangLi-1 ShuangLi-1 marked this pull request as ready for review January 2, 2025 06:39
<< interval_computing_time_step.seconds() << "\n";
std::cout << std::fixed << std::setprecision(9) << "interval_updating_configuration = "
<< interval_updating_configuration.seconds() << "\n";

Owner

You need to add a regression test, like this:

if (sph_system.GenerateRegressionData())
{
    write_mechanical_energy.generateDataBase(1.0e-3);
}
else if (sph_system.RestartStep() == 0)
{
    write_mechanical_energy.testResult();
}

Collaborator Author

I have added it and pushed the change, but I don't know how to run the regression test on my own computer.

Owner

You need to copy the data files into this case from the CPU version of the case:

tests/2d_examples/test_2d_column_collapse/regression_test_tool

The regression test is then carried out automatically when you run the case.

@ShuangLi-1
Collaborator Author

ShuangLi-1 commented Jan 22, 2025

@Xiangyu-Hu

(1) I am testing the efficiency of the GPU version of Soil Dynamics.

The platform I am using is Ubuntu 20.04, with the following hardware specifications:

  • CPU: Intel(R) Xeon(R) Platinum 8377C CPU @ 3.00GHz (2 sockets, 32 cores per socket, with hyper-threading)
  • GPU: GTX 1030, 2GB VRAM

CUDA Version: 12.2

(2) I performed tests on the 2D example 2d_column_collapse.

For the same resolution (dp = 0.002 m, approximately 5000 particles), the results for the CK and SYCL versions are as follows:

CPU Version:

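  • The number of threads is limited through the third constructor argument, e.g. SPHSystem sph_system(system_domain_bounds, resolution_ref, 128);

| Threads | Computation Time (s) |
|---------|----------------------|
| 128     | 46.0                 |
| 64      | 29.7                 |
| 32      | 33.6                 |
| 16      | 45.2                 |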

GPU Version:

  • No pre-set threads: computation time = 72.0s

Considering the hardware differences, the GPU computation is relatively more efficient.
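To put the reported timings side by side, the sketch below computes the ratio of the GPU time to the fastest CPU time. The numbers are copied from this comment; the type and function names are made up for illustration, and no attempt is made to normalize by hardware capability:

```cpp
#include <cstddef>

// Timings copied from the comment above (2D column collapse, about 5000 particles).
struct CpuTiming
{
    int threads;
    double seconds;
};

constexpr CpuTiming cpu_runs[] = {{128, 46.0}, {64, 29.7}, {32, 33.6}, {16, 45.2}};
constexpr double gpu_seconds = 72.0;

// Returns the shortest wall-clock time among the CPU runs.
constexpr double fastestCpuSeconds()
{
    double best = cpu_runs[0].seconds;
    for (std::size_t i = 1; i < sizeof(cpu_runs) / sizeof(cpu_runs[0]); ++i)
    {
        if (cpu_runs[i].seconds < best)
        {
            best = cpu_runs[i].seconds;
        }
    }
    return best;
}

// Ratio of the GPU run time to the fastest CPU run time (values above 1 mean the GPU run was slower).
constexpr double gpuToBestCpuRatio()
{
    return gpu_seconds / fastestCpuSeconds();
}
```

With these measurements the GPU run takes roughly 2.4 times as long as the fastest CPU configuration (64 threads), so the efficiency statement rests entirely on the difference in hardware class.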

(3) For further efficiency testing, I modified the 3D example 3d_repose_angle for both CK and SYCL versions.

Currently, the CK version runs and tests normally, but the SYCL version throws the following error:


[build] [100%] Linking CXX executable bin/test_3d_repose_angle_sycl
[build] fatal error: error in backend: Cannot select: 0x5ebe70bdb9c0: f64 = fsin nnan ninf nsz arcp contract afn reassoc 0x5ebe702d0cb0
[build] 0x5ebe702d0cb0: f64,ch = CopyFromReg 0x5ebe70a36190, Register:f64 %10
[build] 0x5ebe70bdb870: f64 = Register %10
[build] In function: ZTSZZN3SPH12particle_forINS_7SPHBodyEZNS_21InteractionDynamicsCKIJNS_9execution20ParallelDevicePolicyENS_4BaseENS_18continuum_dynamics17StressDiffusionCKIJNS_5InnerIJEEEEEEEE14runInteractionEfEUlmE_EEvRKNS_11LoopRangeCKIJS4_T_EEERKT0_ENKUlRN4sycl3_V17handlerEE_clESO_EUlNSM_7nd_itemILi1EEEE
[build] llvm-foreach:
[build] icpx: error: clang frontend command failed with exit code 70 (use -v to see invocation)
[build] Intel(R) oneAPI DPC++/C++ Compiler 2024.0.0 (2024.0.0.20231017)
[build] Target: x86_64-unknown-linux-gnu
[build] Thread model: posix
[build] InstalledDir: /opt/intel/oneapi/compiler/2024.0/bin/compiler
[build] Configuration file: /opt/intel/oneapi/compiler/2024.0/bin/compiler/../icpx.cfg
[build] icpx: note: diagnostic msg: Error generating preprocessed source(s).
[build] gmake[3]: *** [tests/tests_sycl/3d_examples/test_3d_repose_angle_sycl/CMakeFiles/test_3d_repose_angle_sycl.dir/build.make:115:tests/tests_sycl/3d_examples/test_3d_repose_angle_sycl/bin/test_3d_repose_angle_sycl] Error 1
[build] gmake[2]: *** [CMakeFiles/Makefile2:5872:tests/tests_sycl/3d_examples/test_3d_repose_angle_sycl/CMakeFiles/test_3d_repose_angle_sycl.dir/all] Error 2
[build] gmake[1]: *** [CMakeFiles/Makefile2:5879:tests/tests_sycl/3d_examples/test_3d_repose_angle_sycl/CMakeFiles/test_3d_repose_angle_sycl.dir/rule] Error 2
[build] gmake: *** [Makefile:1474:test_3d_repose_angle_sycl] Error 2
[proc] The command: /usr/local/bin/cmake --build /home/li/Myfork/SPHinXsys-GPU_soil/build --parallel 130 --target test_3d_repose_angle_sycl -- exited with code: 2
[driver] Build completed: 00:00:54.286
[build] Build finished with exit code 2

From my research, the issue seems to be related to the inability of the sin() function to handle double precision calculations, but I have not yet found a solution.
I would like to ask for your suggestions on how to address this issue.

@Xiangyu-Hu
Owner

First, I think a performance test needs at least a million particles; 5000 particles give nothing meaningful.
Second, the error looks like a linking issue. Does the 3D test use the same code as the 2D case? I do not think the error is related to double-precision evaluation of the sin function: in the SYCL version we enforce a single-precision (float) Real type, so no double type should show up.
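Whether this applies to the failing kernel is not established in this thread, but as a general C++ point, a double-precision `sin` can still be selected when `Real` is `float` if the argument is promoted by an unsuffixed floating-point literal. The sketch below shows this with two hypothetical helper functions; only `Real` and `SPHINXSYS_USE_FLOAT` come from the conversation:

```cpp
#include <cmath>
#include <type_traits>

using Real = float; // assumption: mirrors the SPHINXSYS_USE_FLOAT branch

// 2.0 is a double literal, so the product is a double and the
// double overload of std::sin is selected.
inline auto sinPromoted(Real x) { return std::sin(2.0 * x); }

// Converting the literal to Real keeps the whole expression in single precision.
inline auto sinSinglePrecision(Real x) { return std::sin(Real(2.0) * x); }

static_assert(std::is_same<decltype(sinPromoted(Real(0))), double>::value,
              "an unsuffixed literal promotes the computation to double");
static_assert(std::is_same<decltype(sinSinglePrecision(Real(0))), Real>::value,
              "a Real-typed literal keeps the computation in float");
```

Searching the 3D case for unsuffixed literals or `double` variables in expressions passed to math functions is therefore one inexpensive check before looking further.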

@Xiangyu-Hu
Owner

Xiangyu-Hu commented Jan 24, 2025

@ShuangLi-1 you can find a solution at https://chatgpt.com by entering the following questions.

  1. In SYCL, can we directly use math functions such as sin()?
  2. Can I use a namespace alias to choose between std and sycl accordingly?

The namespace alias can be used here:

#if SPHINXSYS_USE_FLOAT
using Real = float;
using UnsignedInt = u_int32_t;
#else
using Real = double;
using UnsignedInt = size_t;
#endif // SPHINXSYS_USE_FLOAT

In case ChatGPT is not accessible, you can use https://chat.deepseek.com; the answer is almost the same.
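A minimal sketch of the namespace-alias idea mentioned above, under stated assumptions: the macro `EXAMPLE_USE_SYCL` and the alias name `math` are hypothetical placeholders, not names from SPHinXsys, and `horizontalComponent` exists only to show a call site. SYCL 2020 provides `sycl::sin` and `sycl::cos` for `float`, so the same call compiles against either namespace:

```cpp
#include <cmath>
#include <type_traits>

// Assumption: EXAMPLE_USE_SYCL stands for whatever macro the build
// defines when compiling for SYCL devices.
#if defined(EXAMPLE_USE_SYCL)
#include <sycl/sycl.hpp>
namespace math = sycl; // device-capable math functions
#else
namespace math = std; // standard library math functions on the host
#endif

using Real = float; // as in the SPHINXSYS_USE_FLOAT branch

// Hypothetical call site: the same source line resolves to sycl::cos or std::cos.
inline Real horizontalComponent(Real magnitude, Real angle)
{
    return magnitude * math::cos(angle);
}

// With a float argument, both std::sin and sycl::sin return float.
static_assert(std::is_same<decltype(math::sin(Real(0))), Real>::value,
              "math::sin keeps single precision for Real arguments");
```

Placing the alias next to the `Real` definition keeps the precision choice and the math-library choice in one spot, so kernel code only ever writes `math::sin(...)`.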

@Xiangyu-Hu Xiangyu-Hu added the sycl_ck for heterogenous parallelism label Jan 27, 2025