environment
$ module li
Currently Loaded Modules:
  1) craype-x86-milan
  2) libfabric/1.15.2.0
  3) craype-network-ofi
  4) xpmem/2.6.2-2.5_2.38__gd067c3f.shasta
  5) PrgEnv-gnu/8.5.0
  6) cray-dsmml/0.2.2
  7) cray-libsci/23.12.5
  8) cray-mpich/8.1.28
  9) craype/2.7.30
 10) gcc-native/12.3
 11) perftools-base/23.12.0
 12) cpe/23.12
 13) cudatoolkit/12.2
 14) craype-accel-nvidia80
 15) gpu/1.0
build
$ cat doConfigPerlKk.sh
bdir=$PWD/build-kokkos
cmake -S kokkos -B $bdir \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCRAYPE_LINK_TYPE=dynamic \
  -DCMAKE_CXX_COMPILER=$PWD/kokkos/bin/nvcc_wrapper \
  -DKokkos_ARCH_AMPERE80=ON \
  -DKokkos_ENABLE_SERIAL=ON \
  -DKokkos_ENABLE_OPENMP=off \
  -DKokkos_ENABLE_CUDA=on \
  -DKokkos_ENABLE_CUDA_LAMBDA=on \
  -DKokkos_ENABLE_DEBUG=off \
  -DCMAKE_INSTALL_PREFIX=$bdir/install
$ cat doConfigPerlOmegah.sh
#!/bin/bash -ex
usage="Usage: $0 <mpi=on|off> <cudaAware=on|off>"
[[ $# -ne 2 ]] && echo $usage && exit 1
mpi=$1
[[ $mpi != "on" && $mpi != "off" ]] && echo $usage && exit 1
cudaAware=$2
[[ $cudaAware != "on" && $cudaAware != "off" ]] && echo $usage && exit 1
bdir=$PWD/build-omegah-mpi${mpi}-cudaAware${cudaAware}
cmake -S omega_h -B $bdir \
  -DCMAKE_INSTALL_PREFIX=$bdir/install \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=on \
  -DOmega_h_USE_Kokkos=on \
  -DOmega_h_CUDA_ARCH=80 \
  -DOmega_h_USE_MPI=$mpi \
  -DOmega_h_USE_CUDA_AWARE_MPI=$cudaAware \
  -DBUILD_TESTING=on \
  -DCMAKE_CXX_COMPILER=CC
run
Download the Omega_h delta wing meshes: https://zenodo.org/records/10672130
$ cat submitP2.sh
sbatch --nodes 1 --qos regular --time 00:10:00 --constraint gpu --gpus 4 --account=PROJECT_NAME ./runP2.sh
$ cat runP2.sh
#!/bin/bash
bin_cudaAwareOff=/pscratch/sd/c/cwsmith/omegahDeltaWingAdapt/twoGpus/build-omegah-mpion-cudaAwareoff/src
bin_cudaAwareOn=/pscratch/sd/c/cwsmith/omegahDeltaWingAdapt/twoGpus/build-omegah-mpion-cudaAwareon/src
mesh=/pscratch/sd/c/cwsmith/omegahDeltaWingAdapt/twoGpus/deltaWing_500kMetric_p2.osh

cmd="$bin_cudaAwareOff/ugawg_hsc_oshmeshload --osh-pool $mesh"
export MPICH_GPU_SUPPORT_ENABLED=0
set -x
srun -n 2 $cmd &> log2p_cudaAwareOff
set +x

cmd="$bin_cudaAwareOn/ugawg_hsc_oshmeshload --osh-pool $mesh"
export MPICH_GPU_SUPPORT_ENABLED=1
set -x
srun -n 2 $cmd &> log2p_cudaAwareOn
set +x
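For anyone reproducing this, the pairing in runP2.sh is: the cudaAware=off build runs with MPICH_GPU_SUPPORT_ENABLED=0 and the cudaAware=on build runs with MPICH_GPU_SUPPORT_ENABLED=1. Below is a minimal sketch that makes this pairing explicit and prints the srun line for a variant without running it. The function names `gpu_support_for` and `print_run_cmd` are hypothetical and not part of the scripts above; the root directory and mesh path are passed in by the caller.

```shell
#!/bin/bash
# Hypothetical helper: map the cudaAware build flag to the
# MPICH_GPU_SUPPORT_ENABLED value paired with it in runP2.sh (off -> 0, on -> 1).
gpu_support_for() {
  case "$1" in
    on)  echo 1 ;;
    off) echo 0 ;;
    *)   echo "expected on|off, got '$1'" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print (do not execute) the srun line for one variant.
# Arguments: <root dir holding the build directories> <mesh path> <on|off>
print_run_cmd() {
  local root="$1" mesh="$2" aware="$3" enabled
  enabled=$(gpu_support_for "$aware") || return 1
  echo "MPICH_GPU_SUPPORT_ENABLED=$enabled srun -n 2" \
       "$root/build-omegah-mpion-cudaAware$aware/src/ugawg_hsc_oshmeshload --osh-pool $mesh"
}
```

For example, `print_run_cmd "$PWD" "$PWD/deltaWing_500kMetric_p2.osh" on` prints the command for the failing variant so it can be inspected before submitting.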
error
$ cat log2p_cudaAwareOn
(GTL DEBUG: 0) cuIpcGetMemHandle: invalid argument, CUDA_ERROR_INVALID_VALUE, line no 148
MPICH ERROR [Rank 0] [job id 22622708.1] [Wed Mar 6 07:48:56 2024] [nid002241] - Abort(606713346) (rank 0 in comm 0): Fatal error in PMPI_Isend: Invalid count, error stack:
PMPI_Isend(161)......................: MPI_Isend(buf=0x623196f88, count=2382, MPI_INT, dest=1, tag=42, comm=0xc4000000, request=0x23c3f34) failed
MPID_Isend(584)......................:
MPIDI_isend_unsafe(136)..............:
MPIDI_SHM_mpi_isend(323).............:
MPIDI_CRAY_Common_lmt_isend(84)......:
MPIDI_CRAY_Common_lmt_export_mem(103): (unknown)(): Invalid count

aborting job:
Fatal error in PMPI_Isend: Invalid count, error stack:
PMPI_Isend(161)......................: MPI_Isend(buf=0x623196f88, count=2382, MPI_INT, dest=1, tag=42, comm=0xc4000000, request=0x23c3f34) failed
MPID_Isend(584)......................:
MPIDI_isend_unsafe(136)..............:
MPIDI_SHM_mpi_isend(323).............:
MPIDI_CRAY_Common_lmt_isend(84)......:
MPIDI_CRAY_Common_lmt_export_mem(103): (unknown)(): Invalid count
Kokkos::Cuda ERROR: Failed to call Kokkos::Cuda::finalize()
srun: error: nid002241: task 0: Exited with exit code 255
srun: Terminating StepId=22622708.1
slurmstepd: error: *** STEP 22622708.1 ON nid002241 CANCELLED AT 2024-03-06T15:48:58 ***
srun: error: nid002241: task 1: Terminated
srun: Force Terminated StepId=22622708.1
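To compare the two runs quickly, each log can be checked for the failure signatures shown above. This is a rough sketch, assuming both logs from runP2.sh are in the current directory; `summarize_log` is a hypothetical helper name, and the two patterns are copied from log2p_cudaAwareOn, so a different failure mode would not be detected.

```shell
#!/bin/bash
# Hypothetical helper: report whether a run log contains either of the
# failure signatures seen in log2p_cudaAwareOn.
# Returns 0 if neither signature is found, 1 if one is, 2 if the log is unreadable.
summarize_log() {
  local log="$1"
  if [ ! -r "$log" ]; then
    echo "$log: missing"
    return 2
  fi
  if grep -q -e 'cuIpcGetMemHandle: invalid argument' \
             -e 'Fatal error in PMPI_Isend' "$log"; then
    echo "$log: FAILED"
    return 1
  fi
  echo "$log: no known failure signature"
  return 0
}

# Check both logs written by runP2.sh; keep going even if one is flagged.
for log in log2p_cudaAwareOff log2p_cudaAwareOn; do
  summarize_log "$log" || true
done
```

A log without a matching signature is not proof of a clean run; it only means neither of the two patterns above appeared.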