[PULL REQUEST] Run directory updates enabling automated run directory creation for GCHP and GCClassic #459
Oops, I didn't realize that |
Is this needed for 13.0.0? I thought we had decided it would go into 13.1, with updates for both GCClassic and GCHP at the same time.
It's okay if this goes in after 13.0.0. There isn't a A mechanism like |
Implemented init_rd.sh which enables automated run directory creation. The run directory is initialized based on RDI values loaded from the input files. See #459.
59e85eb to c3d47be
I cleaned up the commit log.
Resolved conflicts in: run/GCHP/ExtData.rc.templates/ExtData.rc.fullchem run/GCHP/createRunDir.sh run/GCHP/HEMCO_Config.rc.templates/HEMCO_Config.rc.fullchem Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
I just updated this branch to 13.1.0-alpha.2 and resolved conflicts. We still need to incorporate these changes into run/GCClassic for consistency between GCClassic and GCHP. @lizziel Please review the latest changes before I bring into @liam Do you have any objections to us removing the |
The files containing environment variables for various run directory settings will ideally be used for both GCHP and GCClassic. Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
…r.sh Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
run/GCHP/init_rd.sh
Outdated
THIS_SCRIPTS_DIRECTORY=$(realpath $(dirname "$0"))

if [[ ( $* == --rdi-vars ) ]]; then
    grep -roh 'RDI_[A-Z_][A-Z_]*' $THIS_SCRIPTS_DIRECTORY | grep -v 'RDI_VARS' | sort | uniq
This is the line that prints all the run directory initialization variables
@msulprizio I think a unique prefix is useful for tracking what templating variables exist in the template directories (and for differentiating them from things like shell variables in This way we don't need to keep a list of all the variables that are valid; instead we know it's any variable that starts with Would an alternative unique prefix be preferred? What do you think?
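The prefix-based discovery discussed here can be sketched roughly as follows. This is only an illustration, not the actual `init_rd.sh`: the temporary directory, the template file names, and the variable names `RDI_DATA_ROOT` and `RDI_MET` are made up for the example. Only the `grep` pipeline mirrors the reviewed line.

```shell
#!/usr/bin/env sh
# Hypothetical template directory; file names and contents are illustrative only.
template_dir=$(mktemp -d)
cat > "$template_dir/input.geos.template" <<'EOF'
Root data directory : ${RDI_DATA_ROOT}
Met field           : ${RDI_MET}
EOF
cat > "$template_dir/runConfig.sh.template" <<'EOF'
# ${HOME} is an ordinary shell variable, not a templating variable
MET=${RDI_MET}
EOF

# List every unique token that starts with the prefix.
# -r: recurse, -o: print only the matching text, -h: omit file names.
list_template_vars() {
    grep -roh 'RDI_[A-Z_][A-Z_]*' "$1" | grep -v 'RDI_VARS' | sort | uniq
}

list_template_vars "$template_dir"
```

Because only names carrying the prefix are matched, `${HOME}` in the second template is ignored and each templating variable is listed once, even though `RDI_MET` appears in both files.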
Resolved conflicts in: run/GCClassic/HEMCO_Config.rc.templates/HEMCO_Config.rc.fullchem run/GCClassic/HISTORY.rc.templates/HISTORY.rc.fullchem run/GCClassic/createRunDir.sh run/GCClassic/input.geos.templates/input.geos.fullchem run/GCHP/ExtData.rc.templates/ExtData.rc.TransportTracers run/GCHP/ExtData.rc.templates/ExtData.rc.fullchem run/GCHP/HEMCO_Config.rc.templates/HEMCO_Config.rc.fullchem run/GCHP/createRunDir.sh run/GCHP/input.geos.templates/input.geos.TransportTracers run/GCHP/input.geos.templates/input.geos.fullchem run/GCHP/runConfig.sh.template Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
Made the following changes: - Replaced tokens in configuration files with RDI variables - Removed settings files for several simulation types because they contained RDI variables that are set in createRunDir.sh already - Added settings file for GCAP2 (run/shared/settings/modele2.1.txt) - Updated GCHP adjoint configuration file and files for GCHP CO2 simulation so they are consistent with recent updates Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
…se RDI variables instead Recent modifications for GCAP 2.0 support added several more uses of "sed" in createRunDir.sh. These have been removed where possible and replaced with RDI variables that will be automatically replaced. For HEMCO settings specific to simulations using GMAO or GCAP2 meteorology, the RDI settings have been set in gmao_hemco.txt and gcap2_hemco.txt which are accessed in createRunDir.sh. Similar updates were also made to setupConfigFiles.sh to remove uses of sed from function set_common_settings. That function is still used to manually add species and modify diagnostic output for specific simulation types. Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
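The difference between a dedicated `sed` call per token and variables that are replaced automatically can be illustrated with a small sketch. It assumes `envsubst` from GNU gettext is installed; the token `{MET_DIR}`, the variable `RDI_MET_DIR`, and the paths are hypothetical and do not come from the actual templates.

```shell
#!/usr/bin/env sh
# Illustrative only: token, variable name, and paths below are hypothetical.
workdir=$(mktemp -d)

# Old style: a bare token that needs its own sed command.
printf 'Met field dir: {MET_DIR}\n' > "$workdir/old.template"
sed -e 's|{MET_DIR}|/data/met/GEOS_FP|' "$workdir/old.template" > "$workdir/old.out"

# New style: the template names an environment variable, and one envsubst
# call fills every listed variable at once.
printf 'Met field dir: ${RDI_MET_DIR}\nHome stays literal: ${HOME}\n' > "$workdir/new.template"
export RDI_MET_DIR=/data/met/GEOS_FP
if command -v envsubst >/dev/null 2>&1; then
    # Restricting envsubst to the listed names leaves ${HOME} untouched.
    envsubst '${RDI_MET_DIR}' < "$workdir/new.template" > "$workdir/new.out"
    cat "$workdir/new.out"
fi
```

Passing an explicit variable list to `envsubst` is a design choice worth noting: without it, every `$NAME` in a template would be expanded, including shell variables that should survive into the generated run scripts.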
Run directory variables were previously prefixed with RDI (Run Directory Initialization). It was not immediately obvious what RDI stood for, so we have replaced RDI with the more descriptive RUNDIR. This is a simple swap where all uses of RDI, rdi, or RDI_ have been replaced with RUNDIR, rundir, or RUNDIR_. The run directory variables are now saved to a file named rundir_vars.txt. Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
To remove differences in run directory files made with the automated run directory creation updates, the following fixes have been made: 1. Make sure all tokens in HEMCO_Config.rc files are replaced with $RUNDIR_ variables. 2. Remove unused HEMCO settings specific to GCAP 2.0 simulations from the top of HEMCO_Config.rc for several specialty simulations. 3. Make sure to use $GCAP2SCENARIO instead of $SCENARIO for CMIP6 fields consistently throughout HEMCO_Config.rc files. 4. Add dummy HEMCO_Diagn.rc file for tagO3 simulation. 5. In createRunDir.sh set default DustDead tuning factor to -999.0 for resolutions without a recommended scaling. Also make sure to set TOMAS extensions to off when not using that simulation. 6. Fixed a typo for the complexSOA_SVPOA simulation in createRunDir.sh. 7. Make sure to use ${RUNDIR_POP_SPC} instead of {POPs_SPC} in POPs template files. Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
…rectory creation updates 1. Fix typo in createRunDir.sh for determining DustDead tuning factor used in 4x5 GEOS-FP benchmark and TOMAS simulations (both of which use online dust emissions). 2. Update commonFunctionsForTests.sh used in Integration Tests to add "NA" to the entries for met fields in the new HEMCO_Config.rc.gmao_metfields file as well as the original HEMCO_Config.rc file. Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
…un directory creation updates 1. Fix typos in createRunDir.sh for GCHP. 2. Add RUNDIR_MET_DIR_NATIVE for native-resolution meteorology fields used by GCHP. GCClassic typically uses meteorology fields at the same resolution as the simulation. 3. In commonFunctionsForTests.sh only modify HEMCO_Config.rc.gmao_metfields if the file exists (i.e. only for GCClassic run directories). For GCHP runs, meteorology fields are specified in ExtData.dat and don't need to be modified for nested-grid simulations. Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
These updates are now ready for further review and for merging into the standard code. I made extensive changes since @LiamBindle's initial PR, primarily to expand this capability to GCClassic as well as GCHP and to include more run directory variables for fields that differ based on meteorology (GEOS-FP, MERRA-2, GCAP 2.0), grid resolution, and/or simulation type. I also renamed This PR is no longer a zero-diff for all simulation types since it does resolve some issues that existed in the reference version. For example, for marinePOA, acid uptake, and TOMAS simulations, previous simulations used offline emissions by default when online emissions should have been used to properly simulate the emissions for species in those simulations. The PR name has been updated to reflect these changes. These updates pass all GCClassic integration tests:
and GCHP integration tests:
NOTE: There is still room for future work here. Some manipulation of run directory files is still being done in |
@LiamBindle I wasn't able to add you as a reviewer since you originally opened this PR, but please do have a look at the changes I've made on top of your original commits and let me know if you have any questions or concerns.
@msulprizio Sure thing. I'll review the changes tomorrow.
@msulprizio Looks good to me. Thanks for spearheading all this!
My only concern is the `RUNDIR_VARS` that are generated dynamically in `createRunDir.sh`. What do you think of moving those variables to static `.txt` files that are included based on the user's input?
@@ -23,6 +23,10 @@ cd ${srcrundir}
# Load file with utility functions to setup configuration files
. ${gcdir}/run/shared/setupConfigFiles.sh

# Initialize run directory variables
RUNDIR_VARS=""
RUNDIR_VARS+="RUNDIR_GC_MODE='GCClassic'\n"
Does `createRunDir.sh` create `RUNDIR_VARS` that aren't present in static `.txt` files? If so, is it possible to put these in static `.txt` files that are then included dynamically?
I just want to clarify because I think it's important that every run directory can be generated from static `.txt` files, so that there is the option to totally bypass `createRunDir.sh`.
    RUNDIR_VARS+="RUNDIR_USE_ONLINE_O3='F'\n"
else
    RUNDIR_VARS+="RUNDIR_USE_NLPBL='T'\n"
    RUNDIR_VARS+="RUNDIR_USE_ONLINE_O3='T'\n"
Could these be put into static `.txt` files that are dynamically included based on `$sim_extra_option`?
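One way the suggestion could look in practice is sketched below. It is not taken from the PR: the settings directory, the file names `option_a.txt` and `default.txt`, and the helper `select_settings_file` are all hypothetical. Only the variable names and values echo the diff under review.

```shell
#!/usr/bin/env sh
# Hypothetical layout: one static settings file per extra option.
settings_dir=$(mktemp -d)
printf "RUNDIR_USE_ONLINE_O3='F'\n" > "$settings_dir/option_a.txt"
printf "RUNDIR_USE_NLPBL='T'\nRUNDIR_USE_ONLINE_O3='T'\n" > "$settings_dir/default.txt"

# Pick the static file for the requested option, falling back to the default.
select_settings_file() {
    if [ -f "$settings_dir/$1.txt" ]; then
        echo "$settings_dir/$1.txt"
    else
        echo "$settings_dir/default.txt"
    fi
}

sim_extra_option="option_a"
settings_file=$(select_settings_file "$sim_extra_option")
# The files hold plain shell assignments, so sourcing one sets the variables.
. "$settings_file"
echo "RUNDIR_USE_ONLINE_O3=${RUNDIR_USE_ONLINE_O3}"
```

With this arrangement the branching logic only chooses a file name, and the values themselves live in files that an automated system could pass along directly without running the interactive script.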
elif [[ ${scen_num} = "9" ]]; then
    scenario="SSP119"
    runid="E213SSP119aF40oQ40"
    met_avail="# 2040-2049; 2090-2099"
    RUNDIR_VARS+="RUNDIR_MET_AVAIL='# 2040-2049; 2090-2099'\n"
Same with these
lon_range="-130.0 -60.0"
lat_range=" 9.75 60.0"
RUNDIR_VARS+="RUNDIR_GRID_LON_RANGE='-130.0 -60.0'\n"
RUNDIR_VARS+="RUNDIR_GRID_LAT_RANGE=' 9.75 60.0'\n"
Same here, above and below.
RUNDIR_VARS+="RUNDIR_SIM_EXTRA_OPTION=$sim_extra_option\n"

# Determine settings based on simulation type
if [[ ${sim_extra_option} == "benchmark" ]] || \
Similarly to the GC-Classic `createRunDir.sh`, could we move these to static `.txt` files that are included based on the user input?
Yes, I had originally designed it that way, but found there were many overlapping settings between simulation types which led to duplicate variables in the output |
I was not able to address this point before going on maternity leave. I suggest merging as-is in 13.4.0 to avoid further delay on this PR and incurring additional conflicts. We can create a new feature request for this point. As noted above there is still room for other future work here. Some manipulation of run directory files is still being done in |
This is a straightforward PR; it just involves text replacement in the run-directory scripts and configuration files. OK to merge.
One minor update: instead of using rm -f to remove symbolic links, I would use unlink, which is perhaps safer. I can make that change.
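A small sketch of why `unlink` might be considered the safer choice for symbolic links. The directory and file names are illustrative and do not come from the PR.

```shell
#!/usr/bin/env sh
# Illustrative paths only; none of these come from the PR.
workdir=$(mktemp -d)
mkdir "$workdir/data"
touch "$workdir/data/met.nc"
ln -s "$workdir/data" "$workdir/MetDir"

# unlink removes exactly one directory entry and has no recursive or force
# options, so a slip of the fingers cannot escalate into deleting the data
# that the link points to.
unlink "$workdir/MetDir"

# The link is gone; the data it pointed to is untouched.
ls "$workdir/data"
```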
The |
I have merged the f
Summary of test results:
------------------------------------------------------------------------------
Execution tests passed: 80
Execution tests failed: 0
Execution tests not yet completed: 0
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% All execution tests passed! %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
but the GCHP tests all failed. The gchp.log file was not created in any of the run directories.
==============================================================================
GCHP: Execution Test Results
Number of execution tests: 3
==============================================================================
Execution tests:
------------------------------------------------------------------------------
gchp_fullchem_benchmark_merra2_c48...............Execute Simulation.....FAIL
gchp_fullchem_standard_merra2_c24................Execute Simulation.....FAIL
gchp_TransportTracers_geosfp_c24.................Execute Simulation.....FAIL
Summary of execution test results:
------------------------------------------------------------------------------
Execution tests passed: 0
Execution tests failed: 3
Execution tests not completed: 0
Still investigating ...
Here is the
|
I applied the fix in commit 43f90e7, which fixed the issue. The GCHP integration tests were failing because symbolic links to the various data directories had been removed inadvertently. This is now fixed and the GCHP integration tests all pass:
Summary of execution test results:
------------------------------------------------------------------------------
Execution tests passed: 3
Execution tests failed: 0
Execution tests not completed: 0
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% All execution tests passed! %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Hi everyone,
This PR contains updates to run directory creation to facilitate automated run directory creation (and in general, run directory creation from a list of files with variables describing the state to initialize the run directory to). This PR is zero-diff on run directories created with `./createRunDir.sh`, but it does change the underlying way in which `./createRunDir.sh` works.
In essence, `./createRunDir.sh` used to run a bunch of `sed` commands to replace tokens in the templates. This PR replaces this `sed`-based approach with the `envsubst`-based approach, and moves the copying/templating of required files to a new script called `init_rd.sh`. This new script has the following usage:
In summary, `init_rd.sh` takes a list of files, loads the variables in those files, and then fills the templates (initializing the cwd as the run directory).
This PR updates `./createRunDir.sh` to use `init_rd.sh`. Note that `./createRunDir.sh` is still used to create all the supplemental files (e.g., runSampleScripts, etc.).
Additional Notes
The variables used to initialize run directory settings used to be quite vague (understandably, since they were only used internally in `./createRunDir.sh`). In this PR I've updated the variables to be more descriptive and prefixed with `RDI_` (standing for Run Directory Initialization). This has the added benefit of allowing the `--rdi-vars` option, which lists all the RDI variables found in the templates.
The main feature of this PR is `init_rd.sh`. I didn't intend for this script to be used by the public, but I do think it's useful to advanced users. It allows run directories to be created rapidly, precisely, and without the interactive wizard. This facilitates automated run directory creation.
You can add the directory containing `init_rd.sh` to your `$PATH`. In doing this, you can run `init_rd.sh` from anywhere, and it will resolve the template locations correctly.
Example
This example demonstrates creating a benchmark run directory (in a way that could be done by an automated system). First, we need a file with all the RDI variables:
For this example, pretend this file is called `benchmark_rdi_vars.txt`. Then, I can create a benchmark run directory like so:
This initializes the cwd as the run directory. Note that the RDI variables can be split across multiple files. Also note that RDI variables can be specified multiple times, in which case the rightmost file has the highest precedence.
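The "rightmost file wins" rule can be sketched as follows. This is one plausible way to get that behavior, not necessarily how `init_rd.sh` implements it; the helper `load_var_files`, the file names, and the variables `RDI_SIM_NAME` and `RDI_MET` are hypothetical.

```shell
#!/usr/bin/env sh
# Hypothetical variable files; names and values are for illustration only.
workdir=$(mktemp -d)
printf "RDI_SIM_NAME='fullchem'\nRDI_MET='MERRA2'\n" > "$workdir/base.txt"
printf "RDI_MET='GEOS-FP'\n" > "$workdir/override.txt"

# Source each file left to right, exporting what it defines. A later file
# simply reassigns a variable, so the rightmost definition wins.
load_var_files() {
    set -a                      # export every variable assigned from here on
    for var_file in "$@"; do
        . "$var_file"
    done
    set +a
}

load_var_files "$workdir/base.txt" "$workdir/override.txt"
echo "sim=${RDI_SIM_NAME} met=${RDI_MET}"
```

Splitting variables this way lets a shared base file carry the common settings while a small override file changes only what differs, such as the meteorology source.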
The resulting run directory looks like so
Co-benefits
This also makes run directory creation more modular. For example, RDI variables for simulation type {fullchem, TransportTracers, benchmark} are split into their own `.txt` files in the new `settings/` subdirectory, and the RDI variables for {MERRA2, GEOS-FP} are also in their own `.txt` files. This should ease run directory creation maintenance.
Zero-diff tests
Check marks indicate tests that passed.
Simulation types:
Simulation options:
Meteorology source:
Since each of the three categories above controls settings that are mutually exclusive, a single test can fulfill multiple categories (i.e., each one needs to be tested at least once, but combinations don't need to be checked).
Additional tests
Now, when you run `./createRunDir.sh`, it will create a file named `rdi_vars.sh` in the run directory. This file contains the Run Directory Initialization variables that were used to create the run directory. Therefore, the essential run directory files must be reproducible with the following command:
This test passed (i.e., all the essential run directory files were reproduced identically; checked with `diff`).
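A reproducibility check of this kind might look like the sketch below. The two directories stand in for a run directory made by `./createRunDir.sh` and one regenerated from its saved variables; their names and contents are made up, and the actual regeneration command is not reproduced here.

```shell
#!/usr/bin/env sh
# Stand-ins for the original and the regenerated run directory.
workdir=$(mktemp -d)
mkdir "$workdir/original" "$workdir/regenerated"
printf 'Met field: MERRA2\n' > "$workdir/original/input.geos"
printf 'Met field: MERRA2\n' > "$workdir/regenerated/input.geos"

# diff -r walks both trees; -q reports only whether files differ.
# Exit status 0 means the trees are identical.
if diff -rq "$workdir/original" "$workdir/regenerated"; then
    repro_status="identical"
else
    repro_status="different"
fi
echo "$repro_status"
```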