This function merges multiple NetCDF files from the same climate model, variable, frequency, scenario, and variant into single continuous time series files using CDO (Climate Data Operators). This is essential for creating uninterrupted time series from climate model outputs that are often split across multiple files.

Usage

htr_merge_files(hpc = NA, indir, outdir, year_start, year_end)

Arguments

hpc

Character string or NA. Indicates High Performance Computing mode:

  • NA: Standard processing mode

  • "array": HPC array job mode (requires file parameter)

  • "parallel": HPC parallel mode

indir

Character string. Directory containing the NetCDF files to be merged. Files should follow CMIP6 naming conventions so that their metadata can be extracted.

outdir

Character string. Directory where the merged files will be saved.

year_start

Numeric. Earliest year to include in the merged files. Files ending before this year (for historical data) will be excluded.

year_end

Numeric. Latest year to include in the merged files. Files starting after this year (for projection data) will be excluded.

Value

No return value. The function creates merged time series files in the specified output directory with filenames following the pattern: variable_frequency_model_scenario_variant_merged_YYYYMMDD-YYYYMMDD.nc

Details

Climate model data is typically provided as multiple files covering different time periods. This function combines these files into continuous time series using the CDO mergetime operator, which concatenates files along the time dimension.

The function:

  1. Extracts metadata (variable, frequency, scenario, model, variant) from all files

  2. Groups files by their metadata combinations

  3. Filters files based on the specified year range to avoid out-of-scope data

  4. Merges files for each group using cdo -L -selname,'variable' -mergetime

  5. Creates output filenames with "merged" and the full time range
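The extraction, grouping, and filtering steps above can be sketched in base R. This is only an illustration, not the package's internal code: the file names are hypothetical, the field positions assume the CMIP6 pattern variable_table_model_experiment_variant_grid_period.nc, and the year filter is simplified to a plain overlap test applied to every file.

```r
# Hypothetical CMIP6-style file names (not shipped with the package)
files <- c(
  "tos_Omon_ACCESS-ESM1-5_historical_r1i1p1f1_gn_185001-194912.nc",
  "tos_Omon_ACCESS-ESM1-5_historical_r1i1p1f1_gn_195001-201412.nc",
  "tos_Omon_ACCESS-ESM1-5_ssp585_r1i1p1f1_gn_201501-210012.nc"
)
year_start <- 1990
year_end <- 2050

# Step 1: split each name on "_" to recover the metadata fields
parts <- strsplit(sub("\\.nc$", "", files), "_", fixed = TRUE)
field <- function(i) vapply(parts, `[`, character(1), i)
meta <- data.frame(
  file = files,
  variable = field(1), frequency = field(2), model = field(3),
  scenario = field(4), variant = field(5), period = field(7),
  stringsAsFactors = FALSE
)
meta$first_year <- as.numeric(substr(meta$period, 1, 4))
meta$last_year <- as.numeric(substr(sub("^.*-", "", meta$period), 1, 4))

# Step 2: one grouping key per metadata combination
meta$key <- paste(meta$variable, meta$frequency, meta$model,
                  meta$scenario, meta$variant, sep = "_")

# Step 3: keep only files whose period overlaps the requested years
# (assumption: a simple overlap test stands in for the package's filter)
kept <- meta[meta$last_year >= year_start & meta$first_year <= year_end, ]

# Files to merge, listed per group
groups <- split(kept$file, kept$key)
print(groups)
```

Each element of `groups` then holds the input files for one merged output file.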

The CDO command used is: cdo -L -selname,'variable' -mergetime input_files output_file

Where:

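  • -L locks I/O so that files are accessed sequentially, which keeps chained operators safe when reading netCDF4/HDF5 files (compression is controlled separately in CDO, via the -z option)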

  • selname ensures only the specified variable is retained

  • mergetime concatenates files along the time dimension
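As a rough sketch of how such a command might be assembled from R for a single group, the snippet below builds the argument vector and only calls CDO if it is available and the inputs exist. The file names, the dates in the output name, and the use of `system2()` are assumptions for illustration; the package may construct and run the call differently.

```r
# Hypothetical inputs for one metadata group
variable <- "tos"
input_files <- c(
  "tos_Omon_ACCESS-ESM1-5_historical_r1i1p1f1_gn_185001-194912.nc",
  "tos_Omon_ACCESS-ESM1-5_historical_r1i1p1f1_gn_195001-201412.nc"
)
# Hypothetical output name following the documented "merged" pattern
output_file <-
  "tos_Omon_ACCESS-ESM1-5_historical_r1i1p1f1_merged_18500101-20141231.nc"

# Arguments in the same order as the documented command
cdo_args <- c(
  "-L",
  paste0("-selname,", variable),
  "-mergetime",
  input_files,
  output_file
)

# Show the command line that would be run
cdo_call <- paste("cdo", paste(cdo_args, collapse = " "))
print(cdo_call)

# Only attempt the merge when CDO is on the PATH and the inputs exist
if (nzchar(Sys.which("cdo")) && all(file.exists(input_files))) {
  system2("cdo", args = cdo_args)
}
```

Printing the command first makes it easy to check the operator order before any files are written.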

Note

  • Requires CDO (Climate Data Operators) to be installed and accessible from the system PATH

  • Input files must follow CMIP6 naming conventions for proper metadata extraction

  • Files are only merged if they don't already exist in the output directory

  • Uses parallel processing with (number of CPU cores - 2) workers

  • The -L flag locks I/O (sequential file access), which makes chained CDO operators safe to use with netCDF4 files; it does not compress the output

  • Automatically handles different time ranges for historical vs. projection scenarios
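Two of the notes above, the worker count and the skipping of existing outputs, can be illustrated with a short base R sketch. The directory and file names are hypothetical, and the package may implement both behaviours differently.

```r
# Worker count as described in the notes: all cores minus two,
# with a floor of one in case few (or an unknown number of) cores are found
n_cores <- parallel::detectCores()
n_workers <- if (is.na(n_cores)) 1L else max(1L, n_cores - 2L)

# Hypothetical output directory and target files
outdir <- file.path(tempdir(), "merged_sketch")
dir.create(outdir, showWarnings = FALSE, recursive = TRUE)
targets <- file.path(outdir, c("a_merged.nc", "b_merged.nc"))

# Pretend the first merge already ran in an earlier session
invisible(file.create(targets[1]))

# Only targets that do not exist yet still need merging
to_do <- targets[!file.exists(targets)]
print(basename(to_do))
```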

References

CDO User Guide: https://code.mpimet.mpg.de/projects/cdo/embedded/cdo.pdf

CDO mergetime operator: https://code.mpimet.mpg.de/projects/cdo/embedded/cdo.pdf#page=102

CDO selname operator: https://code.mpimet.mpg.de/projects/cdo/embedded/cdo.pdf#page=126

Author

Dave Schoeman and Tin Buenafe

Examples

if (FALSE) { # \dontrun{
# Get a path to a temporary directory
temp_dir <- tempdir()

htr_merge_files(
  hpc = NA,
  indir = system.file("extdata", package = "hotrstuff"), # input directory
  outdir = file.path(temp_dir, "merged"), # output directory
  year_start = 1990, # earliest year across all the scenarios considered
  year_end = 2014 # latest year across all the scenarios considered
)
} # }