Add new benchmark: Nvidia HPCG #326

Open
1 of 5 tasks
giordano opened this issue Jun 3, 2024 · 6 comments

@giordano
Member

giordano commented Jun 3, 2024

The Nvidia HPCG benchmark used to be available only via containers, but it was open-sourced last week and is now available at https://github.com/NVIDIA/nvidia-hpcg. It works on both CPU (x86-64 and aarch64, with specific optimisations for Nvidia Grace, aka Neoverse V2) and GPU, and also in mixed CPU-GPU configurations (maybe only on Grace-Hopper/Blackwell systems?).

@christophermaynard you may be interested in this.

  • Build NVIDIA HPCG locally (with USE_CUDA=0). Running on x86 CPU is not supported (unless offloading to an Ampere GPU)
  • Build on locust (with CUDA)
  • Collect data on locust and cricket
  • Develop spack package (NCCL is its own package, nccl; cuSparse and cuBlas are included in the cuda package).
  • Develop ReFrame benchmark
@tkoskela
Member

This might be something @krishnakumarg1984 would be interested in

@tkoskela
Member

tkoskela commented Oct 4, 2024

@krishnakumarg1984 started working on this. I added tasks in the main comment

@giordano
Member Author

giordano commented Oct 4, 2024

How do we get cuSparse and cuBlas?

Those should presumably be in the cuda package? For example py-cupy claims to use those libraries

@tkoskela
Member

tkoskela commented Oct 4, 2024

Those should presumably be in the cuda package?

That'd be great! It wasn't clear to me when I looked at it yesterday

@giordano
Member Author

giordano commented Oct 4, 2024

https://github.com/spack/spack/blob/905e7b9b4559691fb1fdde3340f33ef0214e60dc/var/spack/repos/builtin/packages/kokkos-kernels/package.py#L147-L166: when the cusparse and cublas variants are enabled for this package, the cuda package is added as a dependency. This seems to settle it.

@tkoskela
Member

Achievements so far

  • Built on locust
    • Required lots of manual tweaking to link to CUDA and other Nvidia libraries
    • Both CPU and GPU version build
  • Ran CPU version on single core
    • By default problem size is huge, runs slowly
    • Did not run to completion
  • Ran GPU version on single GPU
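
One possible way to tame the default problem size for quick smoke tests is to generate a small input file. The sketch below is only an illustration: it assumes NVIDIA HPCG reads the same four-line `hpcg.dat` layout as reference HPCG (two free-form header lines, then `nx ny nz`, then the run time in seconds), and that the reference constraints on the local grid (each dimension at least 16 and a multiple of 8) also apply. The helper names and the 32x32x32 / 60 s values are hypothetical, not taken from the repository.

```python
from pathlib import Path

# Header lines as they appear in the reference HPCG input file.
# Assumption: they are free-form text and not interpreted by the benchmark.
HEADER = (
    "HPCG benchmark input file",
    "Sandia National Laboratories; University of Tennessee, Knoxville",
)


def render_hpcg_dat(nx, ny, nz, run_seconds):
    """Return the text of an hpcg.dat file for a local nx*ny*nz grid.

    Assumption (from reference HPCG): each dimension must be >= 16
    and a multiple of 8.
    """
    for name, n in (("nx", nx), ("ny", ny), ("nz", nz)):
        if n < 16 or n % 8 != 0:
            raise ValueError(f"{name}={n} must be >= 16 and a multiple of 8")
    if run_seconds < 0:
        raise ValueError("run_seconds must be non-negative")
    lines = [*HEADER, f"{nx} {ny} {nz}", str(run_seconds)]
    return "\n".join(lines) + "\n"


def write_hpcg_dat(directory, nx=32, ny=32, nz=32, run_seconds=60):
    """Write hpcg.dat into `directory` and return its path.

    The defaults are illustrative small values for a smoke test only.
    """
    path = Path(directory) / "hpcg.dat"
    path.write_text(render_hpcg_dat(nx, ny, nz, run_seconds))
    return path
```

Calling `write_hpcg_dat(".")` in the run directory before launching the benchmark would then give a much smaller problem than the default. Results from such a short run are only useful for checking that the build works, not for reporting performance.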

What we want to achieve short term

  • Run CPU version on whole node
  • Strip down MPI affinity options
  • Make a comparison with vanilla hpcg on Grace (and other CPUs)

What we want to achieve long term (next TI)

  • Integrate into ReFrame
