Having Fun with CMOR#

pyku provides functions to CMORize data, automatically handling a significant number of attributes and unit conversions. However, global attributes must be manually set, as they cannot be inferred. The first step, as usual, is to load the xarray library and the pyku extension.

[1]:
import xarray as xr
import pyku

DRS definition#

pyku includes DRS definition files. You can display the full list of implemented CMOR and CMOR-like DRS definitions as follows:

[2]:
pyku.list_drs_standards()
[2]:
['cmip6',
 'cmip5',
 'cordex',
 'cordex_adjust',
 'cordex_interp',
 'cordex_adjust_interp',
 'cordex_cmip6',
 'hyras',
 'sk_raw',
 'sk_cmor',
 'reanalysis',
 'cosmo-rea6',
 'eobs',
 'obs4mips',
 'seamless']

In this tutorial, we will CMORize raw ICON model output using the CORDEX CMOR standard. Behind the scenes, pyku uses YAML metadata files, which are structured like this:

cordex:

    # References
    # ----------

    # http://is-enes-data.github.io/
    # http://is-enes-data.github.io/cordex_archive_specifications.pdf
    # http://is-enes-data.github.io/CORDEX_variables_requirement_table.pdf

    # The stem/parent wording stems from the pathlib library
    # https://docs.python.org/3/library/pathlib.html

    metadata:
        - CORDEX_domain
        - driving_model_id
        - driving_experiment_name
        - driving_model_ensemble_member
        - model_id
        - rcm_version_id
        - frequency
        - product
        - institute_id
        - experiment_id

    stem_pattern:
        "{variable_name}_{CORDEX_domain}_{driving_model_id}_\
         {driving_experiment_name}_{driving_model_ensemble_member}_\
         {model_id}_{rcm_version_id}_{frequency}_{start_time}-{end_time}"

    parent_pattern:
        "{product}/{CORDEX_domain}/{institute_id}/{driving_model_id}/\
         {experiment_id}/{driving_model_ensemble_member}/{model_id}/\
         {rcm_version_id}/{frequency}"
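
To illustrate how such patterns turn into names, the following minimal sketch fills the stem and parent patterns with Python's str.format. It is not pyku's implementation: the pattern strings are copied from the YAML above (with the line continuations joined), the helper fill_patterns is hypothetical, and the facet values are simply the ones used later in this tutorial.

```python
from pathlib import PurePosixPath

# Patterns copied from the 'cordex' entry above, with the YAML line
# continuations joined into single strings.
STEM_PATTERN = (
    "{variable_name}_{CORDEX_domain}_{driving_model_id}_"
    "{driving_experiment_name}_{driving_model_ensemble_member}_"
    "{model_id}_{rcm_version_id}_{frequency}_{start_time}-{end_time}"
)
PARENT_PATTERN = (
    "{product}/{CORDEX_domain}/{institute_id}/{driving_model_id}/"
    "{experiment_id}/{driving_model_ensemble_member}/{model_id}/"
    "{rcm_version_id}/{frequency}"
)

# Illustrative facet values (typed by hand, not read from a dataset).
facets = {
    "variable_name": "tas",
    "CORDEX_domain": "GER-0275",
    "driving_model_id": "ECMWF-ERA5T",
    "driving_experiment_name": "evaluation",
    "driving_model_ensemble_member": "r1i1p1",
    "model_id": "CLMcom-DWD-CCLM5-0-16",
    "rcm_version_id": "x0n1-v1",
    "frequency": "1hr",
    "start_time": "2025010100",
    "end_time": "2025013123",
    "product": "output",
    "institute_id": "CLMcom-DWD",
    "experiment_id": "evaluation",
}


def fill_patterns(facets):
    """Return (parent, stem) after substituting the facets into the patterns."""
    parent = PurePosixPath(PARENT_PATTERN.format(**facets))
    stem = STEM_PATTERN.format(**facets)
    return parent, stem


parent, stem = fill_patterns(facets)
print(parent / f"{stem}.nc")
```

A missing facet raises a KeyError, which is one simple way to notice incomplete metadata. pyku's own output may differ in detail from this sketch; for instance, the paths shown later in this tutorial contain an additional variable directory.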

Get exemplary ICON data#

We use sample ICON data, which consists of raw model output without any post-processing.

[3]:
# Fetch the sample GRIB files and the matching grid file
# ------------------------------------------------------

grib_files = pyku.resources.get_test_data('icon_grib_files')
grid_file = pyku.resources.get_test_data('icon_grid_file')

Exploring the dataset#

We open the dataset using standard xarray options. It is important to specify how xarray should handle variables when working with multiple files. By default, georeferencing metadata (like the CRS attribute) is tied to the time dimension. Here, we instruct xarray to treat the CRS as a single variable across all files, independent of dataset dimensions, if it remains the same. Additionally, for this demonstration, we only use a subset of the data to ensure faster execution. In the background, both xarray and pyku are optimized for handling large datasets, allowing terabytes of data to be lazily loaded and processed on demand.

[4]:
ds = xr.open_mfdataset(
    grib_files,
    engine='cfgrib',
    combine='nested',
    compat='no_conflicts',
    coords='different',
    time_dims=["valid_time"]
)

ds
ecCodes provides no latitudes/longitudes for gridType='unstructured_grid'
[4]:
<xarray.Dataset> Size: 2GB
Dimensions:            (valid_time: 744, values: 659156)
Coordinates:
  * valid_time         (valid_time) datetime64[ns] 6kB 2025-01-01 ... 2025-01...
    heightAboveGround  float64 8B ...
Dimensions without coordinates: values
Data variables:
    t2m                (valid_time, values) float32 2GB dask.array<chunksize=(744, 659156), meta=np.ndarray>
Attributes:
    GRIB_edition:            2
    GRIB_centre:             edzw
    GRIB_centreDescription:  Offenbach
    GRIB_subCentre:          255
    Conventions:             CF-1.7
    institution:             Offenbach
    history:                 2026-03-20T07:20 GRIB to CDM+CF via cfgrib-0.9.1...

Modify the data#

This preprocessing step standardizes the raw GRIB data for compatibility with pyku. Since the dataset is strictly 2D, we drop the heightAboveGround coordinate to simplify the data structure. We then rename the valid_time dimension to time. Finally, we rename the values dimension to cell, which is a prerequisite for correctly mapping the geographic coordinate system in subsequent steps.

[5]:
ds = ds.drop_vars(['heightAboveGround'])
ds = ds.rename({'valid_time': 'time'})
ds = ds.rename_dims({'values': 'cell'})

Attach the georeferencing#

ICON model output is typically stored on an unstructured grid, where meteorological variables are separated from their spatial coordinates. To perform georeferencing, we must retrieve a standalone grid file (external coordinates) containing the latitude and longitude for each cell center. This grid information is essential for mapping the raw data to a geographic coordinate system.

[6]:
grid = xr.open_dataset(
    pyku.resources.get_test_data('icon_grid_file')
)

We can now merge the latitudes and longitudes of the grid cell centers into the data:

[7]:
ds = xr.merge([
    ds,
    grid[['clon', 'clat']]
])

Regridding the data#

With the coordinates attached, we can now apply one of pyku’s pre-defined geographic projections. Note that pyku does not currently support spatial projections for dask-chunked data on unstructured grids, so the dataset is first rechunked into a single chunk with chunk(chunks=-1) before the transformation.

[8]:
ds = ds.chunk(chunks=-1).pyku.project('GER-0275')
2026-03-20 07:20:55,768 - pyku - geo - 822- WARNING - longitude unit conversion from radians to degrees
2026-03-20 07:20:55,770 - pyku - geo - 826- WARNING - latitude unit conversion from radians to degrees
[9]:
ds.isel(time=0).ana.one_map(var='t2m', crs='GER-0275')
[Figure: map of t2m at the first time step, drawn in the GER-0275 projection]
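
The two warnings logged during the projection indicate that the cell-center longitudes and latitudes were converted from radians to degrees. As a minimal standalone illustration of that conversion (not pyku's code; the sample values and the helper to_degrees are made up), using only the standard library:

```python
import math

# Hypothetical cell-center coordinates in radians, standing in for the
# clon/clat variables of a grid file.
clon_rad = [0.10, 0.15, 0.20]
clat_rad = [0.85, 0.90, 0.95]


def to_degrees(values):
    """Convert an iterable of angles from radians to degrees."""
    return [math.degrees(v) for v in values]


clon_deg = to_degrees(clon_rad)
clat_deg = to_degrees(clat_rad)

for lon, lat in zip(clon_deg, clat_deg):
    print(f"lon={lon:.3f} deg, lat={lat:.3f} deg")
```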

Selecting variables for CMORization#

pyku can select a single variable as a dataset, with all global attributes kept, and perform the CMORization process. It is important to note that CMORization should be done one variable at a time. To list all available georeferenced data that can be CMORized (excluding non-data variables like the CRS or time bounds), you can use the following pyku functionality.

[10]:
ds.pyku.get_geodata_varnames()
[10]:
['t2m']

From this list, you can select a specific variable with get_geodataset while preserving all its attributes. Special variables, such as the CRS and time bounds, are handled automatically.

Setting CMOR attributes and names#

The dataset is now ready for CMORization. Note that all variable attributes and conventions are applied, except for the global metadata, which still need to be set.

[11]:
cmorized = ds.pyku.get_geodataset('t2m').pyku.cmorize()
cmorized
[11]:
<xarray.Dataset> Size: 525MB
Dimensions:       (time: 744, rlat: 423, rlon: 415)
Coordinates:
  * time          (time) datetime64[ns] 6kB 2025-01-01 ... 2025-01-31T23:00:00
  * rlat          (rlat) float64 3kB 7.439 7.411 7.384 ... -4.111 -4.139 -4.166
  * rlon          (rlon) float64 3kB -10.29 -10.27 -10.24 ... 1.036 1.064 1.091
    lat           (rlat, rlon) float64 1MB 56.87 56.88 56.88 ... 46.57 46.57
    lon           (rlat, rlon) float64 1MB -0.9172 -0.869 ... 19.54 19.58
Data variables:
    tas           (time, rlat, rlon) float32 522MB dask.array<chunksize=(744, 423, 415), meta=np.ndarray>
    rotated_pole  int32 4B 1
Attributes:
    GRIB_edition:            2
    GRIB_centre:             edzw
    GRIB_centreDescription:  Offenbach
    GRIB_subCentre:          255
    Conventions:             CF-1.7
    institution:             Offenbach
    history:                 2026-03-20T07:20 GRIB to CDM+CF via cfgrib-0.9.1...
    CORDEX_domain:           undefined
    frequency:               1hr

For verification, run a DRS check using the following command:

[12]:
cmorized.pyku.check_drs(standard='cordex')
[12]:
   key                            value      issue
0  CORDEX_domain                  undefined  None
1  driving_model_id               None       missing value
2  driving_experiment_name        None       missing value
3  driving_model_ensemble_member  None       missing value
4  model_id                       None       missing value
5  rcm_version_id                 None       missing value
6  frequency                      1hr        None
7  product                        None       missing value
8  institute_id                   None       missing value
9  experiment_id                  None       missing value
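
Conceptually, a check of this kind compares the global attributes of the dataset against the metadata list of the DRS definition. The sketch below mimics the table above in plain Python. It is an illustration under that assumption, not pyku's implementation, and the function name check_required_attrs is hypothetical.

```python
# Required keys, copied from the 'metadata' list of the cordex definition.
REQUIRED = [
    "CORDEX_domain",
    "driving_model_id",
    "driving_experiment_name",
    "driving_model_ensemble_member",
    "model_id",
    "rcm_version_id",
    "frequency",
    "product",
    "institute_id",
    "experiment_id",
]


def check_required_attrs(attrs, required=REQUIRED):
    """Return one (key, value, issue) row per required key."""
    rows = []
    for key in required:
        value = attrs.get(key)
        issue = "missing value" if value is None else None
        rows.append((key, value, issue))
    return rows


# The two DRS-relevant attributes present after the first cmorize() call.
attrs = {"CORDEX_domain": "undefined", "frequency": "1hr"}

rows = check_required_attrs(attrs)
for row in rows:
    print(row)

# Keys that still have to be provided by hand.
missing = [key for key, _, issue in rows if issue is not None]
```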

Setting global metadata#

The check above confirms that the global metadata have not been set, as they cannot be inferred automatically. They must be provided manually to the cmorize function in the form of a dictionary. For operational use, it is recommended to define this metadata in a YAML file, which can then be loaded into a dictionary.

[13]:
global_metadata = {
    'title': (
        "COSMO_CLM_5.00_clm16 simulation (0.0275 Deg) with ERA5T (direct nest, since "
        "2022) forcing, domain Germany, FPS-C tuned setup"
    ),
    'CORDEX_domain': 'GER-0275',
    'driving_model_id': 'ECMWF-ERA5T',
    'driving_experiment_name': 'evaluation',
    'experiment_id': 'evaluation',
    'driving_experiment': 'ECMWF-ERA5T, evaluation, r1i1p1',
    'driving_model_ensemble_member': 'r1i1p1',
    'experiment': 'evaluation run with ECMWF-ERA5T reanalysis forcing',
    'model_id': 'CLMcom-DWD-CCLM5-0-16',
    'institute_id': 'CLMcom-DWD',
    'institution': (
        "DWD (Deutscher Wetterdienst, Offenbach, Germany) in collaboration with the "
        "CLM-Community"
    ),
    'rcm_version_id': 'x0n1-v1',
    'contact': 'klima.offenbach@dwd.de',
    'nesting_levels': '1',
    'comment_nesting': 'these are results of a direct downscaling (one nest) approach',
    'comment_1stNest': (
        "convection permitting simulation driven by ECMWF-ERA5T (direct downscaling) "
        "comment_2ndNest: not used "
    ),
    'comment': (
        "Convection-permitting evaluation run for Germany (HoKliSim-DeT) performed by "
        "DWD Offenbach. Configuration adapted from the CORDEX FPS Convection setup, "
        "which was developed and tuned by the CRCS working group of the CLM Community. "
        "Computing system: Aurora NEC at DWD. The Deutscher Wetterdienst (DWD) is the "
        "producer of the data. The General Terms and Conditions of Business and "
        "Delivery apply for services provided by DWD "
        "(http://www.dwd.de/EN/service/terms/terms.html)."
    ),
    "rcm_config_cclm": "GER-0275_CLMcom-DWD-CCLM5-0-16_config",
    "rcm_config_int2lm": "GER-0275_CLMcom-DWD-INT2LM2-0-6-modif_config",
    "source": "Climate Limited-area Modelling Community (CLM-Community)",
    "references": "http://www.clm-community.eu, http://www.dwd.de",
    "product": "output",
    "project_id": "DWD-CPS",
}

Now, we CMORize the data again, this time setting the global attributes for CMOR compliance. Afterward, we perform another sanity check and observe that the global attributes have been correctly set.

[14]:
cmorized = ds.pyku.get_geodataset('t2m')
cmorized = cmorized.pyku.cmorize(global_metadata=global_metadata)
cmorized.pyku.check_drs(standard='cordex')
[14]:
   key                            value                  issue
0  CORDEX_domain                  GER-0275               None
1  driving_model_id               ECMWF-ERA5T            None
2  driving_experiment_name        evaluation             None
3  driving_model_ensemble_member  r1i1p1                 None
4  model_id                       CLMcom-DWD-CCLM5-0-16  None
5  rcm_version_id                 x0n1-v1                None
6  frequency                      1hr                    None
7  product                        output                 None
8  institute_id                   CLMcom-DWD             None
9  experiment_id                  evaluation             None

An additional sanity check can be performed with the check_metadata function to identify any potential issues:

[15]:
cmorized.pyku.check_metadata(standard='cordex')
[15]:
    key                                              value                  issue  description
0   y_projection_coordinate_exist                    True                   None   Checking if y projection coordinate available
1   x_projection_coordinate_exist                    True                   None   Checking if x projection coordinate available
2   lat_geographic_coordinate_exist                  True                   None   Checking if lat geographic coordinate available
3   lon_geographic_coordinate_exist                  True                   None   Checking if lon geographic coordinate available
4   y_projection_coordinate_unit_correct             True                   None   Checking if y projection coordinates units
5   x_projection_coordinate_unit_correct             True                   None   Checking if x projection coordinates units
6   y_projection_coordinate_standard_name_correct    True                   None   Checking y projection coordinates standard_name
7   x_projection_coordinate_standard_name_correct    True                   None   Checking x projection coordinates standard_name
8   lat_geographic_coordinate_unit_correct           True                   None   Checking lat geographic coordinate units
9   lon_geographic_coordinate_unit_correct           True                   None   Checking if lat geographic coordinate units
10  lat_geographic_coordinate_standard_name_correct  True                   None   Checking lat geographic coordinate standard_name
11  lon_geographic_coordinate_standard_name_correct  True                   None   Checking lon geographic coordinate standard_name
12  cf_area_def_readable                             True                   None   Check if CF projection metadata are readable
13  area_extent_is_readable                          True                   None   Check if the area extent can be determined fro...
14  longitudes_within_180W_and_180E                  True                   None   Check that the longitudes are within 180 degre...
15  tas_units_can_be_read                            True                   None   Check if units can be read automatically
16  tas                                              tas                    None   NaN
17  is_cmor_standard_name                            True                   None   If possible, check if standard name is CMOR co...
18  is_cmor_long_name                                True                   None   Check if long_name is CMOR conform
19  is_cmor_units                                    True                   None   Check if units is CMOR conform
20  frequency_can_be_inferred_from_data              True                   None   Tried to infer frequency from the time labels
21  frequency_can_be_determined                      True                   None   Check if frequency can be determined from the ...
22  geodata_vars                                     [tas]                  None   NaN
23  geographic_latlon                                (lat, lon)             None   NaN
24  projection_yx                                    (rlat, rlon)           None   NaN
25  time_dependent_vars                              [tas]                  None   NaN
26  time_bounds_var                                  None                   None   NaN
27  spatial_bounds_vars                              []                     None   NaN
28  spatial_vertices_vars                            []                     None   NaN
29  crs_var                                          rotated_pole           None   NaN
30  unidentified_vars                                []                     None   NaN
31  has_time_dimension                               True                   None   Check if data have a time dimension
32  time_is_numpy_datetime64_or_cftime               True                   None   Check the data type of the time stamps
33  CORDEX_domain                                    GER-0275               None   NaN
34  driving_model_id                                 ECMWF-ERA5T            None   NaN
35  driving_experiment_name                          evaluation             None   NaN
36  driving_model_ensemble_member                    r1i1p1                 None   NaN
37  model_id                                         CLMcom-DWD-CCLM5-0-16  None   NaN
38  rcm_version_id                                   x0n1-v1                None   NaN
39  frequency                                        1hr                    None   NaN
40  product                                          output                 None   NaN
41  institute_id                                     CLMcom-DWD             None   NaN
42  experiment_id                                    evaluation             None   NaN

CMORized data filename#

From here, we can verify the name that the final data file should have.

[16]:
cmorized.pyku.drs_filename(standard='cordex')
[16]:
'output/GER-0275/CLMcom-DWD/ECMWF-ERA5T/evaluation/r1i1p1/CLMcom-DWD-CCLM5-0-16/x0n1-v1/1hr/tas/tas_GER-0275_ECMWF-ERA5T_evaluation_r1i1p1_CLMcom-DWD-CCLM5-0-16_x0n1-v1_1hr_2025010100-2025013123.nc'
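
The time range at the end of the file name depends on the frequency: the hourly file above ends in 2025010100-2025013123, while the daily file written at the end of this tutorial ends in 20250101-20250131. A minimal sketch of such formatting with strftime follows. The format table is an assumption made for illustration (the monthly entry in particular is not shown anywhere in this tutorial), and time_range is a hypothetical helper, not a pyku function.

```python
from datetime import datetime

# Assumed mapping from frequency label to time stamp format.
TIME_FORMATS = {
    "1hr": "%Y%m%d%H",
    "day": "%Y%m%d",
    "mon": "%Y%m",  # assumption, not confirmed by this tutorial
}


def time_range(start, end, frequency):
    """Format the 'start-end' part of a file name for a given frequency."""
    fmt = TIME_FORMATS[frequency]
    return f"{start.strftime(fmt)}-{end.strftime(fmt)}"


hourly = time_range(datetime(2025, 1, 1, 0), datetime(2025, 1, 31, 23), "1hr")
daily = time_range(datetime(2025, 1, 1, 12), datetime(2025, 1, 31, 12), "day")

print(hourly)
print(daily)
```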

Geographic projections#

The data can be reprojected as needed, and the CMOR metadata are updated automatically. Note how the grid information changes in the metadata: after projecting to HYR-LAEA-5, CORDEX_domain is reported as undefined.

[17]:
cmorized = ds.pyku.get_geodataset('t2m')
cmorized = cmorized.pyku.cmorize(global_metadata=global_metadata)
cmorized = cmorized.pyku.project('HYR-LAEA-5')
cmorized.pyku.check_drs(standard='cordex')
[17]:
   key                            value                  issue
0  CORDEX_domain                  undefined              None
1  driving_model_id               ECMWF-ERA5T            None
2  driving_experiment_name        evaluation             None
3  driving_model_ensemble_member  r1i1p1                 None
4  model_id                       CLMcom-DWD-CCLM5-0-16  None
5  rcm_version_id                 x0n1-v1                None
6  frequency                      1hr                    None
7  product                        output                 None
8  institute_id                   CLMcom-DWD             None
9  experiment_id                  evaluation             None

The filename should now be as follows:

[18]:
cmorized.pyku.drs_filename(standard='cordex')
[18]:
'output/undefined/CLMcom-DWD/ECMWF-ERA5T/evaluation/r1i1p1/CLMcom-DWD-CCLM5-0-16/x0n1-v1/1hr/tas/tas_undefined_ECMWF-ERA5T_evaluation_r1i1p1_CLMcom-DWD-CCLM5-0-16_x0n1-v1_1hr_2025010100-2025013123.nc'

Resampling data#

Similarly, we can transform the data through resampling.

[19]:
cmorized = ds.pyku.get_geodataset('t2m')
cmorized = cmorized.pyku.project('HYR-LAEA-5')
cmorized = cmorized.pyku.resample_datetimes(frequency='1D', how='mean')
cmorized = cmorized.pyku.cmorize(global_metadata=global_metadata)
cmorized
2026-03-20 07:21:09,201 - pyku - timekit - 314- WARNING - t2m: 'cell_methods' was not set automatically
[19]:
<xarray.Dataset> Size: 8MB
Dimensions:    (time: 31, y: 220, x: 258, bnds: 2)
Coordinates:
  * time       (time) datetime64[ns] 248B 2025-01-01T12:00:00 ... 2025-01-31T...
  * y          (y) float64 2kB 3.586e+06 3.582e+06 ... 2.496e+06 2.492e+06
  * x          (x) float64 2kB 3.808e+06 3.814e+06 ... 5.088e+06 5.094e+06
    lat        (y, x) float64 454kB 55.13 55.13 55.14 ... 45.09 45.08 45.08
    lon        (y, x) float64 454kB 1.95 2.028 2.106 2.184 ... 19.7 19.76 19.83
Dimensions without coordinates: bnds
Data variables:
    tas        (time, y, x) float32 7MB dask.array<chunksize=(31, 220, 258), meta=np.ndarray>
    crs        int32 4B 1
    time_bnds  (time, bnds) datetime64[ns] 496B 2025-01-01 ... 2025-02-01
Attributes: (12/30)
    GRIB_edition:                   2
    GRIB_centre:                    edzw
    GRIB_centreDescription:         Offenbach
    GRIB_subCentre:                 255
    Conventions:                    CF-1.7
    institution:                    DWD (Deutscher Wetterdienst, Offenbach, G...
    ...                             ...
    rcm_config_cclm:                GER-0275_CLMcom-DWD-CCLM5-0-16_config
    rcm_config_int2lm:              GER-0275_CLMcom-DWD-INT2LM2-0-6-modif_config
    source:                         Climate Limited-area Modelling Community ...
    references:                     http://www.clm-community.eu, http://www.d...
    product:                        output
    project_id:                     DWD-CPS
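
To make explicit what a daily mean does to hourly values, here is a small standard-library sketch that groups time stamps by calendar day and averages each group. It only illustrates the idea: pyku operates on xarray objects, the helper daily_mean is hypothetical, and the temperature values are synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import fmean


def daily_mean(times, values):
    """Average the values that fall on the same calendar day."""
    groups = defaultdict(list)
    for t, v in zip(times, values):
        groups[t.date()].append(v)
    return {day: fmean(vs) for day, vs in sorted(groups.items())}


# Two days of synthetic hourly temperatures in kelvin.
start = datetime(2025, 1, 1)
times = [start + timedelta(hours=h) for h in range(48)]
values = [270.0 + (h % 24) for h in range(48)]

means = daily_mean(times, values)
for day, mean in means.items():
    print(day, mean)
```
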
[20]:
cmorized.time_bnds
[20]:
<xarray.DataArray 'time_bnds' (time: 31, bnds: 2)> Size: 496B
array([['2025-01-01T00:00:00.000000000', '2025-01-02T00:00:00.000000000'],
       ['2025-01-02T00:00:00.000000000', '2025-01-03T00:00:00.000000000'],
       ['2025-01-03T00:00:00.000000000', '2025-01-04T00:00:00.000000000'],
       ['2025-01-04T00:00:00.000000000', '2025-01-05T00:00:00.000000000'],
       ['2025-01-05T00:00:00.000000000', '2025-01-06T00:00:00.000000000'],
       ['2025-01-06T00:00:00.000000000', '2025-01-07T00:00:00.000000000'],
       ['2025-01-07T00:00:00.000000000', '2025-01-08T00:00:00.000000000'],
       ['2025-01-08T00:00:00.000000000', '2025-01-09T00:00:00.000000000'],
       ['2025-01-09T00:00:00.000000000', '2025-01-10T00:00:00.000000000'],
       ['2025-01-10T00:00:00.000000000', '2025-01-11T00:00:00.000000000'],
       ['2025-01-11T00:00:00.000000000', '2025-01-12T00:00:00.000000000'],
       ['2025-01-12T00:00:00.000000000', '2025-01-13T00:00:00.000000000'],
       ['2025-01-13T00:00:00.000000000', '2025-01-14T00:00:00.000000000'],
       ['2025-01-14T00:00:00.000000000', '2025-01-15T00:00:00.000000000'],
       ['2025-01-15T00:00:00.000000000', '2025-01-16T00:00:00.000000000'],
       ['2025-01-16T00:00:00.000000000', '2025-01-17T00:00:00.000000000'],
       ['2025-01-17T00:00:00.000000000', '2025-01-18T00:00:00.000000000'],
       ['2025-01-18T00:00:00.000000000', '2025-01-19T00:00:00.000000000'],
       ['2025-01-19T00:00:00.000000000', '2025-01-20T00:00:00.000000000'],
       ['2025-01-20T00:00:00.000000000', '2025-01-21T00:00:00.000000000'],
       ['2025-01-21T00:00:00.000000000', '2025-01-22T00:00:00.000000000'],
       ['2025-01-22T00:00:00.000000000', '2025-01-23T00:00:00.000000000'],
       ['2025-01-23T00:00:00.000000000', '2025-01-24T00:00:00.000000000'],
       ['2025-01-24T00:00:00.000000000', '2025-01-25T00:00:00.000000000'],
       ['2025-01-25T00:00:00.000000000', '2025-01-26T00:00:00.000000000'],
       ['2025-01-26T00:00:00.000000000', '2025-01-27T00:00:00.000000000'],
       ['2025-01-27T00:00:00.000000000', '2025-01-28T00:00:00.000000000'],
       ['2025-01-28T00:00:00.000000000', '2025-01-29T00:00:00.000000000'],
       ['2025-01-29T00:00:00.000000000', '2025-01-30T00:00:00.000000000'],
       ['2025-01-30T00:00:00.000000000', '2025-01-31T00:00:00.000000000'],
       ['2025-01-31T00:00:00.000000000', '2025-02-01T00:00:00.000000000']],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 248B 2025-01-01T12:00:00 ... 2025-01-31T12...
Dimensions without coordinates: bnds
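
The relationship between the time labels and time_bnds shown above can be reproduced with the standard library: each daily interval runs from midnight to midnight, and the label sits at its midpoint. This is a sketch of the arithmetic only, with a hypothetical helper daily_bounds, not a description of how pyku computes the bounds.

```python
from datetime import datetime, timedelta


def daily_bounds(first_day, n_days):
    """Return a list of (label, lower, upper) for consecutive days."""
    one_day = timedelta(days=1)
    out = []
    for i in range(n_days):
        lower = first_day + i * one_day
        upper = lower + one_day
        label = lower + (upper - lower) / 2  # midpoint of the interval
        out.append((label, lower, upper))
    return out


bounds = daily_bounds(datetime(2025, 1, 1), 31)
print(bounds[0])
print(bounds[-1])
```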

Again, note that the metadata have been set in accordance with the CMOR standard during this operation.

[21]:
cmorized.pyku.check_drs(standard='cordex')
[21]:
   key                            value                  issue
0  CORDEX_domain                  GER-0275               None
1  driving_model_id               ECMWF-ERA5T            None
2  driving_experiment_name        evaluation             None
3  driving_model_ensemble_member  r1i1p1                 None
4  model_id                       CLMcom-DWD-CCLM5-0-16  None
5  rcm_version_id                 x0n1-v1                None
6  frequency                      day                    None
7  product                        output                 None
8  institute_id                   CLMcom-DWD             None
9  experiment_id                  evaluation             None

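Pay particular attention to the variable attributes, and to cell_methods in particular. The resampling step logged a warning that cell_methods was not set automatically, and the attribute below still reads time: point. For a daily mean, the CF conventions describe the operation as time: mean, so this value may need to be set by hand: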

[22]:
cmorized.tas.attrs
[22]:
{'GRIB_paramId': 167,
 'GRIB_dataType': 'fc',
 'GRIB_numberOfPoints': 659156,
 'GRIB_typeOfLevel': 'heightAboveGround',
 'GRIB_stepUnits': 1,
 'GRIB_stepType': 'instant',
 'GRIB_gridType': 'unstructured_grid',
 'GRIB_NV': 0,
 'GRIB_cfName': 'air_temperature',
 'GRIB_cfVarName': 't2m',
 'GRIB_gridDefinitionDescription': 'General unstructured grid',
 'GRIB_missingValue': 3.4028234663852886e+38,
 'GRIB_name': '2 metre temperature',
 'GRIB_shortName': '2t',
 'GRIB_units': 'K',
 'long_name': 'Near-Surface Air Temperature',
 'standard_name': 'air_temperature',
 'grid_mapping': 'crs',
 'units': 'K',
 'cell_methods': 'time: point'}

Writing NetCDF data#

The final step is to write the data, which can be quite large, into packets that conform to the CMOR standard. Specifically, this means:

  • Hourly data will be split into packets of one year.

  • Three-hourly data will be split into packets of one year.

  • Six-hourly data will be split into packets of one year.

  • Twelve-hourly data will be split into packets of five years.

  • Daily data will be split into packets of five years.

  • Monthly data will be split into packets of ten years.

  • Seasonal data will be split into packets of ten years.

  • Yearly data and above will be split into packets of one hundred years.
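
The rule above amounts to a lookup from frequency to packet length in years. The following sketch expresses it in plain Python. The frequency labels used as keys, the helper year_packets, and the choice to start the first packet at the first year of the data are all assumptions made for illustration; pyku's actual alignment of packet boundaries may differ.

```python
# Packet length in years per frequency, following the list above.
# The key spellings are assumed CMOR-style labels.
PACKET_YEARS = {
    "1hr": 1,
    "3hr": 1,
    "6hr": 1,
    "12hr": 5,
    "day": 5,
    "mon": 10,
    "sem": 10,
    "yr": 100,
}


def year_packets(first_year, last_year, frequency):
    """Split [first_year, last_year] into consecutive (start, end) blocks."""
    size = PACKET_YEARS[frequency]
    packets = []
    start = first_year
    while start <= last_year:
        end = min(start + size - 1, last_year)
        packets.append((start, end))
        start = end + 1
    return packets


print(year_packets(2001, 2012, "day"))
print(year_packets(2025, 2025, "1hr"))
```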

For this demonstration, we are using only a subset of data to ensure the tutorial runs in a timely manner, rather than showcasing the full capabilities. However, this functionality is built-in and will operate automatically for large datasets without requiring user input. First, we create a temporary directory to store the data. This directory will be deleted at the end of this session.

[23]:
import tempfile

temp_dir = tempfile.TemporaryDirectory()
print("Temporary directory:", temp_dir.name)
Temporary directory: /tmp/tmp1ru5xm7b
[24]:
pyku.drs.to_drs_netcdfs(cmorized, base_dir=temp_dir.name, standard='cordex')
[25]:
!find {temp_dir.name} -type f
/tmp/tmp1ru5xm7b/output/GER-0275/CLMcom-DWD/ECMWF-ERA5T/evaluation/r1i1p1/CLMcom-DWD-CCLM5-0-16/x0n1-v1/day/tas/tas_GER-0275_ECMWF-ERA5T_evaluation_r1i1p1_CLMcom-DWD-CCLM5-0-16_x0n1-v1_day_20250101-20250131.nc
[26]:
# Inspect the header of the first file (requires the ncdump utility from the netCDF tools)
file = !find {temp_dir.name} -type f
!ncdump -h {file[0]}

Finally, clean up the temporary directory:

[27]:
temp_dir.cleanup()
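
As an alternative to the shell-based cells above, the temporary directory can be managed with a context manager and its contents listed with pathlib. This avoids the explicit cleanup() call and does not rely on external tools such as find. In this sketch a placeholder file stands in for the output of to_drs_netcdfs; the directory layout and file name are made up.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base_dir = Path(tmp)

    # Placeholder standing in for a file written by the DRS writer.
    target = base_dir / "output" / "GER-0275" / "day" / "tas" / "example.nc"
    target.parent.mkdir(parents=True)
    target.touch()

    # Pure-Python equivalent of `find <dir> -type f`.
    files = sorted(p for p in base_dir.rglob("*") if p.is_file())
    relative = [p.relative_to(base_dir).as_posix() for p in files]
    print(relative)

# The directory is removed automatically when the block exits.
removed = not base_dir.exists()
print("removed:", removed)
```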