Does make_topog need MAXXGRID, nthreads? #35

Closed
sanAkel opened this issue Dec 22, 2020 · 11 comments

sanAkel commented Dec 22, 2020

Is your question related to a problem? Please describe.

I am trying to generate a topography:

make_topog --mosaic ocean_mosaic.nc \
           --topog_type realistic \
           --topog_file ${topo_file} \
           --topog_field ${topo_var} \
           --bottom_depth 7500.0 \
           --min_depth 10.0 \
           --scale_factor -1 \
           --verbose \
           --output ocean_topog.nc

And I get the following output:

NOTE from make_topog ==> the topog_type is: realistic
NOTE from make_topog ==> x_refine is 2, y_refine is 2


 ************************************************************

NOTE from make_topog ==> input arguments

NOTE from make_topog ==> min_depth is: 10.000000
NOTE from make_topog ==> topog_file is: /discover/nobackup/sakella/MOM6-GFDL/Grids/etopo1/etopo1_n.nc
NOTE from make_topog ==> topog_field is: z
NOTE from make_topog ==> scale_factor is: -1.000000
NOTE from make_topog ==> num_filter_pass is: 1
NOTE from make_topog ==> kmt_min is 2
NOTE from make_topog ==> fraction_full_cell is 0.200000
NOTE from make_topog ==> min_thickness is 0.100000
NOTE from make_topog ==> no vgrid_file is specified
NOTE from make_topog ==> not allow non-advective tracer cells
NOTE from make_topog ==> open this cell
NOTE from make_topog ==> adjust topography


 ************************************************************

**************************************************
Begin to generate topography

==>NOTE from get_boundary_type: x_boundary_type is cyclic

==>NOTE from get_boundary_type: y_boundary_type is fold_north_edge
FATAL Error: nxgrid is greater than MAXXGRID/nthreads, increase MAXXGRID, decrease nthreads, or increase number of MPI ranks
Abort(-1) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0
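
Read literally, the FATAL message describes a check of the form nxgrid > MAXXGRID / nthreads. The sketch below only illustrates that inequality. The helper name `exceeds_xgrid_limit` and the nxgrid values are hypothetical (the log does not print the real nxgrid), and the actual check lives in the tool's C source, not in shell.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the inequality quoted in the
# error message holds, i.e. nxgrid > MAXXGRID / nthreads.
# Shell arithmetic is integer-only, so 1e6 is written out as 1000000.
exceeds_xgrid_limit() {
    nxgrid=$1
    maxxgrid=$2
    nthreads=$3
    [ "$nxgrid" -gt $(( maxxgrid / nthreads )) ]
}

# Illustrative numbers only; the real nxgrid is not shown in the output above.
if exceeds_xgrid_limit 2500000 1000000 1; then
    echo "limit exceeded: raise MAXXGRID or lower nthreads"
fi
if ! exceeds_xgrid_limit 2500000 10000000000 1; then
    echo "within limit once MAXXGRID is raised to 1e10"
fi
```

With nthreads fixed at 1, the only term in that inequality a user can change at build time is MAXXGRID.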

Describe what you have tried

ocean_hgrid.nc was generated with:

make_hgrid --grid_type tripolar_grid \
           --nxbnd 2 --nybnd 2 \
           --xbnd -280,80 --ybnd -90,90 \
           --dlon 5.0,5.0 --dlat 5.0,5.0 \
           --center c_cell \
           --grid_name ocean_hgrid

I then used that to generate ocean_mosaic.nc:

make_solo_mosaic --num_tiles 1 \
                 --dir . \
                 --tile_file ocean_hgrid.nc \
                 --periodx 360. \
                 --mosaic_name ocean_mosaic

I get the exact same error with make_topog_parallel as well. I also tried mpirun -np 10 make_topog_parallel ... all args as above.

@mathomp4 would be able to fill in the build details on NCCS NASA. Thank you @mathomp4

Appreciate your help, Thanks!

@mathomp4

The build on NCCS was with Intel 19.1.3 and Intel MPI 19.1.3

mathomp4 commented Dec 22, 2020

Also, on our system OMP_NUM_THREADS defaults to 1.

ETA: I also tried his code with mpirun -np 120 and it threw the same error. Should we be using/needing thousands of cores?

Author

sanAkel commented Dec 23, 2020

Just to report further on what we tried: increasing MAXXGRID to, say, 1e10 does solve the issue, and it produces a topography file that looks okay!

Note: the other option, increasing nthreads by adding it to the make_topog arguments (e.g., --nthreads 4), threw an error: no such option! Anyway, as already said above (#35 (comment)), it is set to 1.

Contributor

ngs333 commented Dec 24, 2020

@sanAkel @mathomp4
Sorry I did not get to this earlier. When you compile, you can add the option -DMAXXGRID=1e10 to CPPFLAGS or CFLAGS. Also, I believe you can specify the number of MPI ranks (say 8) as in this example:
mpirun -n 8 make_topog --mosaic ocean_mosaic.nc the_rest_of_the_program_arguments
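
As a minimal sketch of the compile-time suggestion, the snippet below appends the define to CPPFLAGS and prints what would be handed to configure. Nothing is built here, and the install prefix is a placeholder rather than a path from this thread.

```shell
#!/bin/sh
# Value to compile in; 1e10 is the figure used in this thread.
MAXXGRID_VALUE=1e10

# Append the define to whatever CPPFLAGS the environment already carries.
CPPFLAGS="${CPPFLAGS:+$CPPFLAGS }-DMAXXGRID=${MAXXGRID_VALUE}"
export CPPFLAGS

# Print rather than run; /path/to/install is a placeholder prefix.
echo "CPPFLAGS=$CPPFLAGS"
echo "../configure --prefix=/path/to/install"
```

Exporting the variable before running configure lets the generated Makefiles pick it up, so they do not need hand editing.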

Author

sanAkel commented Dec 24, 2020

Thank you @ngs333. I tried the following:

I reverted to #define MAXXGRID 1e6 in tools/libfrencutils/create_xgrid.h
and added -DMAXXGRID=1e10 to the following Makefiles:

  • the one generated by running configure in the top-level dir (FRE-NCtools/), so that CFLAGS = -g -O2 -DMAXXGRID=1e10
  • tools/libfrencutils/Makefile, with the same change

Does it need to be set in both of those Makefiles?
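
One way to see which generated Makefiles actually carry the flag is to grep their CFLAGS/CPPFLAGS assignments. This is only a sketch: the helper name `has_maxxgrid_flag` is hypothetical, and the two paths are the ones named above, taken relative to the FRE-NCtools top level.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when FILE has a CFLAGS or CPPFLAGS assignment
# that mentions -DMAXXGRID.
has_maxxgrid_flag() {
    grep -Eq '^(CFLAGS|CPPFLAGS)[[:space:]]*=.*-DMAXXGRID' "$1"
}

# Report on each Makefile; a missing file is reported as "not found" too.
for mk in Makefile tools/libfrencutils/Makefile; do
    if [ -f "$mk" ] && has_maxxgrid_flag "$mk"; then
        echo "$mk: -DMAXXGRID present"
    else
        echo "$mk: -DMAXXGRID not found"
    fi
done
```

This does not say which file matters for the build. It only shows where the flag ended up.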

Moreover, it crashed! Totally puzzled! 😕 (For sanity's sake, I went back to #35 (comment) and it worked fine, as it did then, and produced an identical answer.)

NOTE from make_topog ==> the topog_type is: realistic
NOTE from make_topog ==> x_refine is 2, y_refine is 2


 ************************************************************

NOTE from make_topog ==> input arguments

NOTE from make_topog ==> min_depth is: 10.000000
NOTE from make_topog ==> topog_file is: /discover/nobackup/sakella/MOM6-GFDL/Grids/etopo2/ETOPO2v2g_f4_n.nc
NOTE from make_topog ==> topog_field is: z
NOTE from make_topog ==> scale_factor is: -1.000000
NOTE from make_topog ==> num_filter_pass is: 1
NOTE from make_topog ==> kmt_min is 2
NOTE from make_topog ==> fraction_full_cell is 0.200000
NOTE from make_topog ==> min_thickness is 0.100000
NOTE from make_topog ==> no vgrid_file is specified
NOTE from make_topog ==>Make cells less than minimum depth land.
NOTE from make_topog ==> not allow non-advective tracer cells
NOTE from make_topog ==> open this cell
NOTE from make_topog ==> adjust topography


 ************************************************************

**************************************************
Begin to generate topography 

==>NOTE from get_boundary_type: x_boundary_type is cyclic

==>NOTE from get_boundary_type: y_boundary_type is fold_north_edge

 --> MISSING = 9999999790214767953607394487959552.000000

 --> using great cicle: 0
*** Error in `/discover/nobackup/sakella/MOM6-GFDL/2020Dec24-ifort/bin/make_topog': free(): invalid size: 0x00002aab1d22e010 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x740ef)[0x2aaaac16b0ef]
/lib64/libc.so.6(+0x79646)[0x2aaaac170646]
/lib64/libc.so.6(+0x7a393)[0x2aaaac171393]
/discover/nobackup/sakella/MOM6-GFDL/2020Dec24-ifort/bin/make_topog[0x43b7bf]
/discover/nobackup/sakella/MOM6-GFDL/2020Dec24-ifort/bin/make_topog[0x41a46a]
/discover/nobackup/sakella/MOM6-GFDL/2020Dec24-ifort/bin/make_topog[0x411f81]
/discover/nobackup/sakella/MOM6-GFDL/2020Dec24-ifort/bin/make_topog[0x40b053]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2aaaac117725]
/discover/nobackup/sakella/MOM6-GFDL/2020Dec24-ifort/bin/make_topog[0x408f69]
======= Memory map: ========
00400000-00a7a000 r-xp 00000000 00:3e 121328252                          /gpfsm/dnb02/sakella/MOM6-GFDL/2020Dec24-ifort/bin/make_topog
00c79000-00c86000 r--p 00679000 00:3e 121328252                          /gpfsm/dnb02/sakella/MOM6-GFDL/2020Dec24-ifort/bin/make_topog
00c86000-00cab000 rw-p 00686000 00:3e 121328252                          /gpfsm/dnb02/sakella/MOM6-GFDL/2020Dec24-ifort/bin/make_topog
00cab000-0a7e9000 rw-p 00000000 00:00 0                                  [heap]
2aaaaaaab000-2aaaaaacc000 r-xp 00000000 fe:00 33575023                   /lib64/ld-2.22.so
2aaaaaacc000-2aaaaaad0000 r--p 00000000 00:00 0                          [vvar]
2aaaaaad0000-2aaaaaad2000 r-xp 00000000 00:00 0                          [vdso]
2aaaaaad2000-2aaaaaad5000 rw-p 00000000 00:00 0 
2aaaaaaf5000-2aaaaab9f000 rw-p 00000000 00:00 0 
2aaaaac20000-2aaaaacc4000 rw-p 00000000 00:00 0 
2aaaaaccc000-2aaaaaccd000 r--p 00021000 fe:00 33575023                   /lib64/ld-2.22.so
2aaaaaccd000-2aaaaacce000 rw-p 00022000 fe:00 33575023                   /lib64/ld-2.22.so
2aaaaacce000-2aaaaaccf000 rw-p 00000000 00:00 0 
2aaaaaccf000-2aaaaadca000 r-xp 00000000 fe:00 33575003                   /lib64/libm-2.22.so
2aaaaadca000-2aaaaafca000 ---p 000fb000 fe:00 33575003                   /lib64/libm-2.22.so
2aaaaafca000-2aaaaafcb000 r--p 000fb000 fe:00 33575003                   /lib64/libm-2.22.so
2aaaaafcb000-2aaaaafcc000 rw-p 000fc000 fe:00 33575003                   /lib64/libm-2.22.so
2aaaaafcc000-2aaaaafe0000 r-xp 00000000 fe:00 33886879                   /usr/lpp/mmfs/lib/libgpfs.so
2aaaaafe0000-2aaaab1df000 ---p 00014000 fe:00 33886879                   /usr/lpp/mmfs/lib/libgpfs.so
2aaaab1df000-2aaaab1e0000 r--p 00013000 fe:00 33886879                   /usr/lpp/mmfs/lib/libgpfs.so
2aaaab1e0000-2aaaab1e1000 rw-p 00014000 fe:00 33886879                   /usr/lpp/mmfs/lib/libgpfs.so
2aaaab1e1000-2aaaab242000 r-xp 00000000 fe:00 33575064                   /lib64/libssl.so.1.0.0
2aaaab242000-2aaaab441000 ---p 00061000 fe:00 33575064                   /lib64/libssl.so.1.0.0
2aaaab441000-2aaaab445000 r--p 00060000 fe:00 33575064                   /lib64/libssl.so.1.0.0
2aaaab445000-2aaaab44c000 rw-p 00064000 fe:00 33575064                   /lib64/libssl.so.1.0.0
2aaaab44c000-2aaaab67b000 r-xp 00000000 fe:00 33575063                   /lib64/libcrypto.so.1.0.0
2aaaab67b000-2aaaab87b000 ---p 0022f000 fe:00 33575063                   /lib64/libcrypto.so.1.0.0
2aaaab87b000-2aaaab896000 r--p 0022f000 fe:00 33575063                   /lib64/libcrypto.so.1.0.0
2aaaab896000-2aaaab8a4000 rw-p 0024a000 fe:00 33575063                   /lib64/libcrypto.so.1.0.0
2aaaab8a4000-2aaaab8a8000 rw-p 00000000 00:00 0 
2aaaab8a8000-2aaaab8bd000 r-xp 00000000 fe:00 33574984                   /lib64/libz.so.1.2.8
2aaaab8bd000-2aaaababc000 ---p 00015000 fe:00 33574984                   /lib64/libz.so.1.2.8
2aaaababc000-2aaaababd000 r--p 00014000 fe:00 33574984                   /lib64/libz.so.1.2.8
2aaaababd000-2aaaababe000 rw-p 00015000 fe:00 33574984                   /lib64/libz.so.1.2.8
2aaaababe000-2aaaabac0000 r-xp 00000000 fe:00 33575102                   /lib64/libdl-2.22.so
2aaaabac0000-2aaaabcc0000 ---p 00002000 fe:00 33575102                   /lib64/libdl-2.22.so
2aaaabcc0000-2aaaabcc1000 r--p 00002000 fe:00 33575102                   /lib64/libdl-2.22.so
2aaaabcc1000-2aaaabcc2000 rw-p 00003000 fe:00 33575102                   /lib64/libdl-2.22.so
2aaaabcc2000-2aaaabcd9000 r-xp 00000000 00:3d 14172588                   /gpfsm/dulocal/sles12/other/gcc/8.3.0/lib64/libgcc_s.so.1
2aaaabcd9000-2aaaabed8000 ---p 00017000 00:3d 14172588                   /gpfsm/dulocal/sles12/other/gcc/8.3.0/lib64/libgcc_s.so.1
2aaaabed8000-2aaaabed9000 r--p 00016000 00:3d 14172588                   /gpfsm/dulocal/sles12/other/gcc/8.3.0/lib64/libgcc_s.so.1
2aaaabed9000-2aaaabeda000 rw-p 00017000 00:3d 14172588                   /gpfsm/dulocal/sles12/other/gcc/8.3.0/lib64/libgcc_s.so.1
2aaaabeda000-2aaaabef2000 r-xp 00000000 fe:00 33574979                   /lib64/libpthread-2.22.so
2aaaabef2000-2aaaac0f1000 ---p 00018000 fe:00 33574979                   /lib64/libpthread-2.22.so
2aaaac0f1000-2aaaac0f2000 r--p 00017000 fe:00 33574979                   /lib64/libpthread-2.22.so
2aaaac0f2000-2aaaac0f3000 rw-p 00018000 fe:00 33574979                   /lib64/libpthread-2.22.so
2aaaac0f3000-2aaaac0f7000 rw-p 00000000 00:00 0 
2aaaac0f7000-2aaaac292000 r-xp 00000000 fe:00 33575026                   /lib64/libc-2.22.so
2aaaac292000-2aaaac492000 ---p 0019b000 fe:00 33575026                   /lib64/libc-2.22.so
2aaaac492000-2aaaac496000 r--p 0019b000 fe:00 33575026                   /lib64/libc-2.22.so
2aaaac496000-2aaaac498000 rw-p 0019f000 fe:00 33575026                   /lib64/libc-2.22.so
2aaaac498000-2aaaac4bf000 rw-p 00000000 00:00 0 
2aaaac6c0000-2aab38f40000 rw-p 00000000 00:00 0 
2aab3c000000-2aab3c021000 rw-p 00000000 00:00 0 
2aab3c021000-2aab40000000 ---p 00000000 00:00 0 
7ffffffdb000-7ffffffff000 rw-p 00000000 00:00 0                          [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Abort (core dumped)

Contributor

ngs333 commented Dec 24, 2020

@sanAkel
I'm glad you have something working. I thought I'd give you all the steps I use on a Linux system with gcc to make the tools in case you decide to try it again. Not everyone does it this way, though.

  1. Set up the environment. I run this bash script:
#!/bin/bash
export FC=mpifort
export F77=mpifort
export CC=mpicc
export CXX=mpiCC
export FCFLAGS="-fcray-pointer -Waliasing -ffree-line-length-none -fno-range-check -O2 -g -W -fbounds-check -ffpe-trap=invalid,zero,overflow `nf-config --fflags ` "
export CPPFLAGS="-O2 -g `nc-config --cflags ` "
export CFLAGS="-O2 -g `nc-config --cflags ` "
export LDFLAGS=" `nc-config --libs ` "

Note there is some redundancy in the FLAGS that can be removed.

  2. Run "autoreconf -i" as usual.

  3. Create a build directory and configure from it:
    3a. under dir FRE-NCtools, I make a directory (say build) and then:
    3b. cd build
    3c. ../configure --prefix=the_install_dir
    where "the_install_dir" is where you want to place the tools once they are made.

  4. In the build directory, type:
    4a. make
    4b. make install

  5. Finally, the user's PATH should be set so that the user can find the binaries:
    5a. export PATH=the_install_dir/bin/:$PATH
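
The steps above can be strung together as one script. The sketch below is a dry run by default: it prints each command instead of executing it. The `run` and `build_fre_nctools` helpers and the `/path/to/install` prefix are illustrative names, not part of FRE-NCtools.

```shell
#!/bin/sh
# Dry-run sketch of the out-of-tree build described above.
# With DRY_RUN=1 (the default here) each step is printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: install prefix (placeholder), $2: build directory name (default: build)
build_fre_nctools() {
    prefix=$1
    builddir=${2:-build}
    run autoreconf -i                     # step 2
    run mkdir -p "$builddir"              # step 3a
    run cd "$builddir"                    # step 3b
    run ../configure --prefix="$prefix"   # step 3c
    run make                              # step 4a
    run make install                      # step 4b
    echo "export PATH=$prefix/bin:\$PATH" # step 5, printed for the user to apply
}

build_fre_nctools /path/to/install
```

Setting DRY_RUN=0 would execute the commands for real, from the FRE-NCtools top-level directory and after the environment from step 1 has been exported.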

What I like about having a build directory is:
a) If I make a mistake, I can just "rm -rf" the whole build directory and I don't lose much time. Relatedly, I sometimes doubt that "make clean" or "make distclean" works, so once again I can "rm -rf" the whole directory.
b) I can have more than one build directory. Usually I have a build_debug directory and a build_fast directory, for the compiler flags "-g -O0" and "-O2" respectively. For every build directory, I rerun steps 1 and 3 within it.
Of course, step 1 is where the flags are defined, and you would have to add -DMAXXGRID=1e10, or whatever value you like, to the various FLAGS. And of course, I am assuming these three lines are left in create_xgrid.h:

#ifndef MAXXGRID
#define MAXXGRID 1e6
#endif
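
To see how that guard interacts with a -D flag, one can run the three lines through the C preprocessor alone. The sketch assumes a compiler is reachable as `cc` and accepts `-E -P -x c` (true of gcc and clang). The helper name `effective_maxxgrid` is hypothetical.

```shell
#!/bin/sh
# Print what MAXXGRID expands to when the guarded default is preprocessed
# with the given extra flags (for example -DMAXXGRID=1e10).
effective_maxxgrid() {
    printf '%s\n' '#ifndef MAXXGRID' '#define MAXXGRID 1e6' '#endif' 'MAXXGRID' |
        cc -E -P -x c "$@" - | grep -v '^[[:space:]]*$' | tail -n 1
}

if command -v cc >/dev/null 2>&1; then
    echo "no flag:         $(effective_maxxgrid)"
    echo "-DMAXXGRID=1e10: $(effective_maxxgrid -DMAXXGRID=1e10)"
else
    echo "no C compiler found as 'cc'; skipping"
fi
```

Because of the #ifndef, a value supplied on the command line takes precedence and the 1e6 default applies only when no -DMAXXGRID is given.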

Finally, when I proceed as above, I never have to manually change the makefiles - the configure program does it all.

Good luck

Author

sanAkel commented Dec 25, 2020

@ngs333 Many thanks for sharing those instructions.
True, the build steps that @mathomp4 passed are a bit different, except for the build dir, which was just like you said.
@mathomp4 is on leave so I can't reconcile those differences (right now), meanwhile I will try your way. 🤞

Contributor

ngs333 commented Mar 15, 2021

@sanAkel Can this issue be closed?

Author

sanAkel commented Mar 16, 2021

> @sanAkel Can this issue be closed?

@ngs333 Sorry, I need a few days to get reoriented on this. Is that okay?

Contributor

ngs333 commented Mar 16, 2021

@sanAkel sure. Take your time

Author

sanAkel commented Jun 14, 2021

Hi @ngs333,
Thanks for your patience. I tried a recent build from @mathomp4 (from around May 2021, which already had MAXXGRID 1e6, so I guess that's the default now; if you need it, I am sure he can get you the exact version/sha):

  1. With make_topog it takes forever!
  2. So I tried mpirun -np 24 make_topog_parallel ..., which was much faster and did work! Yay! It produced topography files from the 1 and 2 arc-minute topographies, ETOPO1 and ETOPO2, respectively. It died with GEBCO2020; probably MAXXGRID needs to be increased further. Not sure; I didn't test that.

With the above progress, I guess we can close this issue, since going forward, following a suggestion from @raphaeldussin, I plan to try https://github.com/nikizadehgfdl/ocean_model_topog_generator from @nikizadehgfdl.

Thanks for all the help!

sanAkel closed this as completed Jun 14, 2021