Running the system

Matthew Gidden edited this page Oct 5, 2020 · 15 revisions

emissions_downscaling is written entirely in R and requires an up-to-date version of R to run. The system can be run from the command line using Rscript or directly from an IDE such as RStudio.

Prerequisites

The system requires several additional R packages and a set of IAM emissions data files.

These can be installed using the helper script at the top level of the repo (as below). For more details, see subsequent sections.

$ Rscript install_deps.R

R Package Dependencies

The following R packages are needed for running the system:

libs <- c( "ggplot2", "lubridate", "plyr", "dplyr", "stringr", "readxl", "zoo", "tidyr", "ncdf4", "sp", "geosphere" )

Run this code to install any missing required packages:

new.packages <- libs[!(libs %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
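After installation, it can help to confirm that every package can actually be loaded. The following is a minimal sketch that reuses the `libs` vector from this section; `missing_pkgs` is a hypothetical helper name and is not part of the repository.

```r
# Packages listed in this section
libs <- c("ggplot2", "lubridate", "plyr", "dplyr", "stringr", "readxl",
          "zoo", "tidyr", "ncdf4", "sp", "geosphere")

# Hypothetical helper: return the packages whose namespace cannot be loaded
missing_pkgs <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

still_missing <- missing_pkgs(libs)
if (length(still_missing) > 0) {
  message("Still missing: ", paste(still_missing, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```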

Data requirements

Make sure to place the IAM emissions data in the input/IAM_emissions/ directory, and the CEDS historical emissions in input/reference_emissions/CEDS_CMIP6_to_2015. For the gridding to work, you will also need to add mask and proxy files to the appropriate folders.
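A quick check that the input locations exist can catch a misplaced download before a long run. This is only a sketch: it covers the two paths named above, assumes it is run from the repository root, and `missing_paths` is a hypothetical helper. The mask and proxy folders are not listed here, so they are not checked.

```r
# Paths named in this section, relative to the repository root
required_paths <- c(
  "input/IAM_emissions",
  "input/reference_emissions/CEDS_CMIP6_to_2015"
)

# Hypothetical helper: report which paths are absent under `root`
missing_paths <- function(paths, root = ".") {
  paths[!file.exists(file.path(root, paths))]
}

absent <- missing_paths(required_paths)
if (length(absent) > 0) {
  message("Missing input paths: ", paste(absent, collapse = ", "))
} else {
  message("All listed input paths are present.")
}
```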

The gridding masks and the data for the 2015 starting point that are needed to run the system, as well as a sample model input emissions data set, can be found on Zenodo.

Memory requirements

Producing spatial grids for multiple years and sectors requires a significant amount of memory. It is recommended to have at least 32 GB of RAM available before running the system, and preferably 64 GB.

Running from the command line

The downscaling and gridding module can be run using the launchpad script launch_downscaling_gridding.R, stored in the emissions_downscaling/exe/launchpad folder. The launchpad script contains the environment settings for each downscaling/gridding run and sources the downscaling and gridding scripts in the correct order to produce the desired outputs.

You can invoke this script through the command line, provided you supply several required options. (Note that the script requires that your current working directory is either the root directory or the input/ directory.) In general, your command should follow this format:

Rscript --no-save --no-restore exe/launchpad/launch_downscaling_gridding.R <model_name> <harmonization_type> <path/to/input_file_name.xlsx> <module-B_output> <module-C_output> <gridding_flag> <run_species>

All path arguments should be relative to the input/ directory. For example:

Rscript --no-save --no-restore exe/launchpad/launch_downscaling_gridding.R GCAM4 Harmonized-DB IAM_emissions/GCAM4_SSP4-34/output_harmonized.xlsx ../final-output/module-B/ ../final-output/module-C/ gridding BC
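The same invocation can also be assembled from within R, which avoids retyping the positional arguments. This is a minimal sketch, assuming it is run from the repository root; the names `run_args`, `launch_script` and `cmd_args` are illustrative and not part of the project.

```r
# Positional arguments in the order given by the command format above
run_args <- c(
  model_name         = "GCAM4",
  harmonization_type = "Harmonized-DB",
  input_file         = "IAM_emissions/GCAM4_SSP4-34/output_harmonized.xlsx",
  module_B_output    = "../final-output/module-B/",
  module_C_output    = "../final-output/module-C/",
  gridding_flag      = "gridding",
  run_species        = "BC"
)

launch_script <- "exe/launchpad/launch_downscaling_gridding.R"
cmd_args <- c("--no-save", "--no-restore", launch_script, unname(run_args))

# Print the full command for inspection
cat("Rscript", cmd_args, "\n")

# Launch only if the script is found, i.e. when run from the repository root
if (file.exists(launch_script)) {
  system2("Rscript", args = cmd_args)
}
```

None of the arguments here contain spaces; paths that do would need quoting, for example with `shQuote()`.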

Script parameters

| Parameter | Options | Description |
|---|---|---|
| model_name | GCAM4, REMIND-MAGPIE, MESSAGE-GLOBIOM, IMAGE, AIM | Any new model names must be added to input/mappings/master_config.csv. |
| harmonization_type | Harmonized, Harmonized-DB, Unharmonized | The user is responsible for making sure the input data is of the type specified here. You will encounter an error in the first processing script of module-B if you indicate the wrong harmonization type. |
| input_file | Path to a valid .xlsx file | The input xlsx file should be held in input/IAM_emissions/. See Input File Format. |
| module-B_output | Desired path for module-B output | module-B/ contains downscaling output. Typically this parameter will be final-output/module-B. |
| module-C_output | Desired path for module-C output | module-C/ contains gridding output. Typically this parameter will be final-output/module-C. |
| gridding_flag | gridding or FALSE | Controls whether the gridding module will be run. Note that producing gridded output requires more memory and time than downscaling. Use any value besides gridding if you do not wish to produce gridded output. |
| run_species | Optional: an emissions species such as BC, CH4, CO2, CO, NH3, NOx, OC, SO2, VOC | Additionally accepts any NMVOC ID (for anthropogenic emissions) or any NMVOC name (for open burning emissions). Only the selected emissions species will be run. |
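To illustrate how these positional parameters fit together, the sketch below names the arguments and checks them against the options documented above. It is not the launchpad script's actual parsing logic: `parse_launch_args` is a hypothetical helper, and treating run_species as the only optional argument is an assumption based on the table.

```r
# Hypothetical helper: name the positional arguments and check documented options
parse_launch_args <- function(args) {
  arg_names <- c("model_name", "harmonization_type", "input_file",
                 "module_B_output", "module_C_output", "gridding_flag",
                 "run_species")
  # Assumption: run_species is the only optional argument
  if (length(args) < 6 || length(args) > 7) {
    stop("Expected 6 or 7 arguments, got ", length(args))
  }
  parsed <- as.list(args)
  names(parsed) <- arg_names[seq_along(args)]

  valid_types <- c("Harmonized", "Harmonized-DB", "Unharmonized")
  if (!parsed$harmonization_type %in% valid_types) {
    stop("Unknown harmonization_type: ", parsed$harmonization_type)
  }
  if (!grepl("\\.xlsx$", parsed$input_file)) {
    stop("input_file must be an .xlsx file: ", parsed$input_file)
  }
  # Any value other than "gridding" skips the gridding module
  parsed$run_gridding <- identical(parsed$gridding_flag, "gridding")
  parsed
}

example <- parse_launch_args(c(
  "GCAM4", "Harmonized-DB",
  "IAM_emissions/GCAM4_SSP4-34/output_harmonized.xlsx",
  "../final-output/module-B/", "../final-output/module-C/",
  "gridding", "BC"
))
str(example)
```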

Note: The parameter twoGrowthRates is no longer supported as a command line argument.

  • twoGrowthRates is a logical (TRUE / FALSE) that controls whether the routine applies two growth rates to capture mid-century increases in regional emissions. Methodology detailed here.

Further system settings, for example handling NMVOC speciation, can be set in the global_settings.R file.

Running from RStudio

You can also run the launchpad script line by line in RStudio for debugging purposes. Make sure to set the em_gridding_env$debug variable to TRUE (found in code/parameters/global_settings.R) and store the arguments usually provided via the command line in the variable args_from_makefile.
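A debugging session might be set up along the following lines. This is a hedged sketch: it assumes that args_from_makefile takes the same positional values, in the same order, as the command line, which should be confirmed against the launchpad script. In the project, the debug flag is set in code/parameters/global_settings.R; the stand-in environment below exists only so the sketch runs on its own.

```r
# Assumption: same positional values and order as the command-line call
args_from_makefile <- c(
  "GCAM4",
  "Harmonized-DB",
  "IAM_emissions/GCAM4_SSP4-34/output_harmonized.xlsx",
  "../final-output/module-B/",
  "../final-output/module-C/",
  "gridding",
  "BC"
)

# Stand-in for the project's environment, created only if it does not exist yet
if (!exists("em_gridding_env")) em_gridding_env <- new.env()
em_gridding_env$debug <- TRUE
```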