Solving the 4D Vlasov-Poisson system. Parallelized with OpenACC or Kokkos.

yasahi-hpc/vlp4d

About

The vlp4d code solves the Vlasov-Poisson equations in 4D (2D space, 2D velocity). Numerically, vlp4d is based on a semi-Lagrangian scheme. As is typical for such schemes, the Vlasov solver uses a directional Strang splitting. The Poisson equation is solved with 2D Fourier transforms. For simplicity, all directions are currently handled with periodic boundary conditions.
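
As a rough illustration of the Fourier treatment of the Poisson equation, the sketch below solves a 1D periodic problem with a naive O(n^2) discrete Fourier transform. The sign convention (-phi'' = rho, E = -phi'), the function name `electric_field_1d`, and the restriction to 1D are assumptions made for brevity; vlp4d works in 2D, and this is not its implementation.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Solve -phi'' = rho on a periodic domain of length L and return E = -phi'.
// In Fourier space: phi_k = rho_k / k^2, hence E_k = -i k phi_k = -i rho_k / k.
// Assumed convention; other normalizations flip signs.
std::vector<double> electric_field_1d(const std::vector<double>& rho, double L) {
    const int n = static_cast<int>(rho.size());
    assert(n > 0 && L > 0.0);
    const double pi = std::acos(-1.0);
    const std::complex<double> I(0.0, 1.0);

    std::vector<std::complex<double>> Ek(n);  // value-initialized to zero
    for (int m = 0; m < n; ++m) {
        const int ms = (m <= n / 2) ? m : m - n;  // signed mode number
        if (ms == 0 || 2 * m == n) continue;      // drop mean and Nyquist modes
        std::complex<double> rho_k(0.0, 0.0);
        for (int j = 0; j < n; ++j) {
            rho_k += rho[j] * std::exp(-I * (2.0 * pi * m * j / n));
        }
        const double k = 2.0 * pi * ms / L;
        Ek[m] = -I * rho_k / k;
    }

    // Inverse transform back to real space.
    std::vector<double> E(n);
    for (int j = 0; j < n; ++j) {
        std::complex<double> sum(0.0, 0.0);
        for (int m = 0; m < n; ++m) {
            sum += Ek[m] * std::exp(I * (2.0 * pi * m * j / n));
        }
        E[j] = sum.real() / n;
    }
    return E;
}
```

With rho sampled from cos(x) on a domain of length 2*pi, this convention gives E close to sin(x). A production code would replace the two nested loops with an FFT library call.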

The Vlasov solver applies the following advection operators in sequence at each time step:

  • 1D advection along x (Dt/2)
  • 1D advection along y (Dt/2)
  • Poisson solver -> compute electric fields Ex and Ey
  • 1D advection along vx (Dt)
  • 1D advection along vy (Dt)
  • 1D advection along x (Dt/2)
  • 1D advection along y (Dt/2)
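
The operator sequence above can be sketched as a single time-step routine. The names `StepLog`, `advect`, `solve_poisson`, and `strang_step` are hypothetical placeholders rather than vlp4d functions; the operators here only record the order in which they run.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the solver state: it only logs which operator ran.
struct StepLog {
    std::vector<std::string> ops;
};

// Placeholder 1D advection along one axis for a fraction of the time step.
void advect(StepLog& log, const std::string& axis, double dt_fraction) {
    assert(dt_fraction > 0.0);
    log.ops.push_back("advect_" + axis + (dt_fraction < 1.0 ? "(dt/2)" : "(dt)"));
}

// Placeholder Poisson solve that would update the electric fields Ex and Ey.
void solve_poisson(StepLog& log) { log.ops.push_back("poisson"); }

// One time step: half steps in space, field update, full steps in velocity,
// then half steps in space again.
void strang_step(StepLog& log) {
    advect(log, "x", 0.5);
    advect(log, "y", 0.5);
    solve_poisson(log);
    advect(log, "vx", 1.0);
    advect(log, "vy", 1.0);
    advect(log, "x", 0.5);
    advect(log, "y", 0.5);
}
```

Calling `strang_step` once on an empty `StepLog` records seven entries, with the Poisson solve in third position.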

The interpolation operator within each advection is a Lagrange polynomial of order 5 or 7, selected by a compilation flag (order 5 by default).
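
To make the interpolation step concrete, here is a minimal sketch of a 1D periodic semi-Lagrangian advection using Lagrange interpolation. It assumes that "order 5" means a degree-5 polynomial on a six-point stencil and that the displacement is uniform and given in grid cells. The function names `lagrange5_weights` and `advect_periodic` are illustrative and not taken from vlp4d.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Lagrange basis weights of a degree-5 polynomial on the six nodes -2..3,
// evaluated at offset s in [0, 1) measured from node 0.
std::array<double, 6> lagrange5_weights(double s) {
    std::array<double, 6> w{};
    for (int j = 0; j < 6; ++j) {
        const double xj = j - 2.0;
        double wj = 1.0;
        for (int m = 0; m < 6; ++m) {
            if (m == j) continue;
            const double xm = m - 2.0;
            wj *= (s - xm) / (xj - xm);
        }
        w[j] = wj;
    }
    return w;
}

// Advect a periodic 1D field by `shift` grid cells:
// f_new[i] = f_old(x_i - shift), interpolated at the foot of the characteristic.
std::vector<double> advect_periodic(const std::vector<double>& f, double shift) {
    const int n = static_cast<int>(f.size());
    assert(n >= 6);
    std::vector<double> out(n);
    for (int i = 0; i < n; ++i) {
        const double foot = i - shift;
        const double base = std::floor(foot);
        const std::array<double, 6> w = lagrange5_weights(foot - base);
        const int i0 = static_cast<int>(base);
        double acc = 0.0;
        for (int j = 0; j < 6; ++j) {
            const int idx = ((i0 + j - 2) % n + n) % n;  // periodic wrap
            acc += w[j] * f[idx];
        }
        out[i] = acc;
    }
    return out;
}
```

For an integer `shift` the weights collapse to a single 1, so the field is translated exactly; for fractional shifts the six weights still sum to one.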

Detailed descriptions of the test cases can be found in

For questions or comments, please contact us; our details are in the AUTHORS file.

HPC

From the viewpoint of high performance computing (HPC), the code is parallelized with OpenMP without MPI domain decomposition. To investigate the performance portability of this kind of kinetic plasma simulation code, we implement the mini-app with a mixed OpenACC/OpenMP version and a Kokkos version, in which we avoid unnecessary duplication of code lines. The detailed description and obtained performance are found in

Test environments

We have tested the code on the following environments.

  • NVIDIA Tesla P100 on Tsubame3.0 (Tokyo Tech, Japan)
    Compilers (cuda/8.0.61, pgi19.1)

  • NVIDIA Tesla V100 on Summit (OLCF, US)
    Compilers (cuda/10.1.168, pgi19.1)

  • Intel Skylake on JFRS-1 (IFERC-CSC, Japan)
    Compilers (intel19.0.0.117)

  • Marvell ThunderX2 on CEA Computing Complex (CEA, France)
    Compilers (armclang19.2.0)

Usage

Compile

Depending on your configuration, you may have to modify the Makefile. You can add your own configuration by following the pattern of an existing entry, for example:

ifneq (,$(findstring p100,$(DEVICES)))
CXXFLAGS=-O3 -I/apps/t3/sles12sp2/cuda/8.0.61/include -ta=nvidia:cc60 -Minfo -std=c++11 -DOWN_INDEX_SEQUENCE -DNO_ASSERT_IN_CONSTEXPR -DENABLE_OPENACC
CXX=pgc++
LDFLAGS = -Mcudalib=cufft -ta=nvidia:cc60 -acc
TARGET = vlp4d.p100_acc
endif

OpenACC version

export DEVICE=device_name # choose the device_name from "p100", "v100", "bdw", "skx", "tx2"
cd src_openacc
make

OpenMP4.5 version

export DEVICE=device_name # choose the device_name from "v100"
cd src_openmp4.5
make

Kokkos version

First, you need to install Kokkos in your environment. Instructions can be found at https://github.com/kokkos/kokkos. The following example assumes that Kokkos is located at "your_kokkos_path".

export KOKKOS_PATH=your_kokkos_path # set your_kokkos_path
export DEVICE=device_name # choose the device_name from "p100", "v100", "bdw", "skx", "tx2"
export RANGE_POLICY=3D # optional: use the 3D MDRangePolicy for better performance
cd src_kokkos
make

Test

Depending on your configuration, you may have to modify the job.sh in wk and sub_*.sh in wk/batch_scripts.

cd wk
./job.sh
gnuplot -e 'plot "nrj.out" u 2 w l, "nrj_SLD10" u 2; pause -1' 

To check that the results are OK, verify that the nrj curve is close to nrj_SLD10.
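
If you prefer a numerical check over a visual one, the comparison can be sketched as follows. This assumes that both files hold whitespace-separated columns and that the quantity of interest is the second column, as in the gnuplot command above. The helper names and the choice of a relative-difference metric are assumptions, and no acceptance threshold is prescribed by vlp4d.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read the second whitespace-separated column of every line that has one.
std::vector<double> read_second_column(std::istream& in) {
    std::vector<double> col;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        double first = 0.0, second = 0.0;
        if (fields >> first >> second) col.push_back(second);
    }
    return col;
}

// Largest relative difference over the common prefix of the two curves,
// measured against the reference values in `ref`.
double max_relative_difference(const std::vector<double>& run,
                               const std::vector<double>& ref) {
    const std::size_t n = std::min(run.size(), ref.size());
    assert(n > 0);
    double worst = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double scale = std::max(std::fabs(ref[i]), 1e-300);
        worst = std::max(worst, std::fabs(run[i] - ref[i]) / scale);
    }
    return worst;
}
```

In practice you would open "nrj.out" and "nrj_SLD10" with `std::ifstream`, pass both streams to `read_second_column`, and compare the result of `max_relative_difference` against a tolerance of your choosing.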
To reproduce the performance measurements in the SC paper, change the argument in the bash script from "SLD10.dat" to "SLD10_large.dat". For example, in wk/batch_scripts/sub_p100_kokkos.sh, the last line should be changed as follows.

Original (Before change)
./vlp4d.p100_kokkos SLD10.dat
SC19 (After change)
./vlp4d.p100_kokkos SLD10_large.dat

You can also try the two-beam instability by setting the argument to "TSI20.dat".
