REMOVE GLOBAL AND SHARED code for wavefunctions (issue #7).
Keep only the "local" w[5][6] and focus on that.

time ./gcheck.exe -p 65536 128 1
***************************************
NumIterations             = 1
NumThreadsPerBlock        = 128
NumBlocksPerGrid          = 65536
---------------------------------------
FP precision              = DOUBLE (nan=0)
Complex type              = THRUST::COMPLEX
RanNumb memory layout     = AOSOA[4]
Momenta memory layout     = AOSOA[4]
Wavefunction GPU memory   = LOCAL
Curand generation         = DEVICE (CUDA code)
---------------------------------------
NumberOfEntries           = 1
TotalTimeInWaveFuncs      = 1.147885e-02 sec
MeanTimeInWaveFuncs       = 1.147885e-02 sec
StdDevTimeInWaveFuncs     = 0.000000e+00 sec
MinTimeInWaveFuncs        = 1.147885e-02 sec
MaxTimeInWaveFuncs        = 1.147885e-02 sec
---------------------------------------
TotalEventsComputed       = 8388608
RamboEventsPerSec         = 8.288389e+07 sec^-1
MatrixElemEventsPerSec    = 7.307881e+08 sec^-1
***************************************
NumMatrixElements(notNan) = 8388608
MeanMatrixElemValue       = 1.371734e-02 GeV^0
StdErrMatrixElemValue     = 2.831148e-06 GeV^0
StdDevMatrixElemValue     = 8.199880e-03 GeV^0
MinMatrixElemValue        = 6.071582e-03 GeV^0
MaxMatrixElemValue        = 3.374925e-02 GeV^0
***************************************
00 CudaFree : 0.145482 sec
0a ProcInit : 0.000564 sec
0b MemAlloc : 0.650950 sec
0c GenCreat : 0.014307 sec
1a GenSeed  : 0.000006 sec
1b GenRnGen : 0.000689 sec
2a RamboIni : 0.000024 sec
2b RamboFin : 0.000006 sec
2c CpDTHwgt : 0.008415 sec
2d CpDTHmom : 0.092765 sec
3a SGoodHel : 0.024875 sec
3b SigmaKin : 0.000018 sec
3c CpDTHmes : 0.011461 sec
4a DumpLoop : 0.030057 sec
9a DumpAll  : 0.031446 sec
9b GenDestr : 0.000062 sec
9c MemFree  : 0.274781 sec
9d CudReset : 0.044059 sec
TOTAL       : 1.329966 sec
TOTAL(n-2)  : 1.140425 sec
***************************************
real    0m1.341s
user    0m0.220s
sys     0m1.113s

time ./check.exe -p 65536 128 1
***************************************
NumIterations             = 1
NumThreadsPerBlock        = 128
NumBlocksPerGrid          = 65536
---------------------------------------
FP precision              = DOUBLE (nan=0)
Complex type              = STD::COMPLEX
RanNumb memory layout     = AOSOA[4]
Momenta memory layout     = AOSOA[4]
Curand generation         = HOST (C++ code)
---------------------------------------
NumberOfEntries           = 1
TotalTimeInWaveFuncs      = 2.303040e+01 sec
MeanTimeInWaveFuncs       = 2.303040e+01 sec
StdDevTimeInWaveFuncs     = 0.000000e+00 sec
MinTimeInWaveFuncs        = 2.303040e+01 sec
MaxTimeInWaveFuncs        = 2.303040e+01 sec
---------------------------------------
TotalEventsComputed       = 8388608
RamboEventsPerSec         = 2.971090e+06 sec^-1
MatrixElemEventsPerSec    = 3.642407e+05 sec^-1
***************************************
NumMatrixElements(notNan) = 8388608
MeanMatrixElemValue       = 1.371734e-02 GeV^0
StdErrMatrixElemValue     = 2.831148e-06 GeV^0
StdDevMatrixElemValue     = 8.199880e-03 GeV^0
MinMatrixElemValue        = 6.071582e-03 GeV^0
MaxMatrixElemValue        = 3.374925e-02 GeV^0
***************************************
0a ProcInit : 0.000398 sec
0b MemAlloc : 1.254668 sec
0c GenCreat : 0.000983 sec
1a GenSeed  : 0.000003 sec
1b GenRnGen : 0.455943 sec
2a RamboIni : 0.128670 sec
2b RamboFin : 2.694741 sec
3b SigmaKin : 23.030397 sec
4a DumpLoop : 0.016648 sec
9a DumpAll  : 0.032200 sec
9b GenDestr : 0.000091 sec
9c MemFree  : 0.143718 sec
TOTAL       : 27.758461 sec
TOTAL(n-2)  : 27.614344 sec
***************************************
real    0m27.765s
user    0m26.679s
sys     0m1.067s
valassi committed Aug 18, 2020
1 parent 5d2c85c commit d1a5097
Showing 4 changed files with 40 additions and 348 deletions.