Skip to contents

Introduction

The aim of this tutorial is to provide a short introduction to the hotrstuff R package. It is also intended to help users familiarize themselves with how Earth System Models (ESM) outputs from the Coupled Model Intercomparison Project (CMIP) are structured. hotrstuff helps with downloading, wrangling, and processing ESM outputs, to make more conscious decisions when dealing with ESM outputs. Throughout this tutorial, we will also talk about some best practices when handling ESM outputs (Schoeman et al. (2023)).

Example data

To illustrate the functionality of the different functions, we are primarily going to use surface temperature data (tos is surface temperature’s official code) from three different ESMs (…………) across three different emission scenarios (SSP1-2.6, SSP2-4.5, SSP3-7.0). ESM outputs can be downloaded through the Earth System Grid Federation. To learn more about the different CMIP6 variables and vocabulary, read documentation on CMIP.

# load packages
library(hotrstuff)
library(terra)

# define variable
var <- "tos"

Downloading ESM outputs

To download ESMs, you’d still have to download the wget scripts from your chosen CMIP6 repository. I would suggest using: https://aims2.llnl.gov/search.

If you want to follow along, use the wget scripts that are in the data/raw/wget folder. But, you can use other wget scripts; just make sure to change the arguments.

Then, run the download_ESM function to run all the wget scripts found in that folder.

# base_dir <- "." # Just start relative to this location
# 
# htr_download_ESM(
#   indir = file.path(base_dir, "data", "raw", "wget"), # input directory
#   outdir = file.path(base_dir, "data", "raw", variable) # output directory
# )

Shift years

Most CMIP models have standardized naming systems. However, some OMIP model outputs have differences in reporting the temporal aspect of their simulations. OMIP runs are 6 cycles of simulations using the 61-year JRA55-do forcing data set (1958-2018), resulting in 366-year model outputs. So, we need to shift the years 1652 years forward for some of the models (e.g., starting at 0001) to standardize across models.

# htr_shift_years(indir = file.path(base_dir,"data", "raw", "omip", variable),
#                 outdir = file.path(base_dir, "data", "proc", "shifted", "omip", variable),
#                 adjust_value = 1652)

Merging files

We now merge files by model-variable-scenario-frequency combination.

# htr_merge_files(
#   indir = file.path(base_dir, "data", "raw", variable), # input directory
#   outdir = file.path(base_dir, "data", "proc", "merged", variable), # output directory
#   year_start = 1956, # earliest year across all the scenarios considered (e.g., historical, ssp126, ssp245, ssp585)
#   year_end = 1981 # latest year across all the scenarios considered
# )

So this would result in just 1 .nc file for each model-variable-scenario-frequency combination.

Adjust and reframe time periods

We now want to only keep the years we’re interested in. For example, we want to make sure that for future projections, we only want to look at 2020-2100.

# htr_slice_period(
#   indir = file.path(base_dir, "data", "proc", "merged", variable), # input directory
#   outdir = file.path(base_dir, "data", "proc", "sliced", variable), # output directory
#   freq = "Omon", # ocean, daily
#   scenario = "ssp",
#   year_start = 2020,
#   year_end = 2100,
#   overwrite = FALSE
# )

Fix calendar periods (if needed)

This function fixes the calendar days, for days with leap years. This is to standardize the calendar across the different models.

# htr_fix_calendar(indir = file.path(base_dir, "data", "proc", "sliced", variable)) # will be rewritten

Changing frequency of climate data

I created two functions: i) changes frequency to monthly (monthly_frequency()), and ii) changes frequency to yearly (yearly_frequency()), but because I’ve started with monthly data here, I didn’t run monthly_frequency() here (but it works).

# htr_change_freq(
#   freq = "monthly",
#   indir = file.path(base_dir, "data", "proc", "sliced", variable), # input directory
#   outdir = file.path(base_dir, "data", "proc", "yearly", variable)
# )

Uncomment the code above if you’re using daily data and changing it to a monthly frequency.

# htr_change_freq(
#   freq = "yearly",
#   indir = file.path(base_dir, "data", "proc", "sliced", variable), # input directory
#   outdir = file.path(base_dir, "data", "proc", "yearly", variable)
# )

This should create a new file that has a yearly frequency. For the rest of the analyses, I’ll only continue with the yearly data.

Regridding

The climate models have different grids, so now we need to regrid and standardize the grid. This requires a base, empty raster that will be made automatically in the regrid_esm().

# htr_regrid_esm(
#   indir = file.path(base_dir, "data", "proc", "yearly", variable),
#   outdir = file.path(base_dir, "data", "proc", "regridded", "yearly", variable),
#   cell_res = 0.25,
#   layer = "annual"
# )

Let’s try to plot this. Just plotting for the first time point and for the two models.

# getwd()
# models <- list.files(file.path(base_dir, "data", "proc", "regridded", "yearly", variable), full.names = TRUE)
# model1 <- terra::rast(models[1])
# plot(model1$tos_1)
# 
# model2 <- rast(models[2])
# plot(model2$tos_1)

Create ensemble

For now, we only have 2 models, but if you have more models, you just need to input a list of the model names as a vector in the model_list argument.

# htr_create_ensemble(
#   indir = file.path(base_dir, "data", "proc", "regridded", "yearly", variable), # input directory
#   outdir = file.path(base_dir, "data", "proc", "ensemble", "mean", variable), # output directory
#   model_list = c("FGOALS-f3-L", "CanESM5"), # list of models for ensemble
#   variable = variable, # variable name
#   freq = "Omon", # original frequency of data
#   scenario = "omip2", # scenario
#   mean = TRUE # if false, takes the median
# )

Then, we plot to see how it looks.

# ensemble_model <- list.files(file.path(base_dir, "data", "proc", "ensemble", "mean", variable), full.names = TRUE)
# ensemble <- rast(ensemble_model)
# plot(ensemble$tos_1)

Create baseline anomalies

References

Schoeman, D. S., A. S. Gupta, C. S. Harrison, J. D. Everett, I. Brito-Morales, L. Hannah, L. Bopp, P. R. Roehrdanz, and A. J. Richardson. 2023. “Demystifying Global Climate Models for Use in the Life Sciences.” Trends in Ecology and Evolution 38 (9): P843–858. https://doi.org/10.1016/j.tree.2023.04.005.