Running your experiments and analyzing your results

Overview

Teaching: 0 min
Exercises: 0 min

Questions

How to submit your short run?

How to continue your run?

Objectives

Be able to do a quick run to test an experiment

If successful, resubmit the same experiment for a longer period

Analyze the outputs from your experiment

Prepare graphs using python

Running your experiment

Now you are ready to submit your simulation on Saga.

On Saga:

cd $HOME/cases/F2000climo-f19_g17.$EXPNAME

./case.submit

To monitor your job:

squeue -u $USER

         JOBID       PARTITION  NAME      USER      ST     TIME       NODES NODELIST(REASON)
         26243157    normal     F2000cl  bjorngli  R      18:25      7     c14-[3,6,10,13-14],c16-[8,22]

If you realize after submitting your job that you forgot something, there is no point wasting CPU time: you can cancel the job using the JOBID reported by the previous squeue -u $USER command (26243157 in this example).

scancel 26243157

If your simulation is unsuccessful you have to understand what happened!

The log files in the run directory (/cluster/work/users/$USER/cesm/F2000climo-f19_g17.$EXPNAME/run/) can provide some clues, although the error messages are not always explicit…

Open the latest log file with your favorite text editor (vi, emacs, etc.) and search for keywords like “ERROR”, “Error” or “error” (remember that the search is case sensitive).

Then correct any identified bug.
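
If you prefer to scan the logs from Python rather than from an editor, the search described above can be sketched as follows. This is only a minimal illustration: the function names and the "*.log.*" file pattern are assumptions made for this example and are not part of CESM.

```python
import gzip
import re
from pathlib import Path

# Case-insensitive, so "ERROR", "Error" and "error" are all found
ERROR_PATTERN = re.compile(r"error", re.IGNORECASE)


def find_error_lines(log_path):
    """Return (line_number, text) pairs for lines mentioning 'error' in any case."""
    log_path = Path(log_path)
    # Archived logs are gzip-compressed (.gz); other logs are read as plain text
    opener = gzip.open if log_path.suffix == ".gz" else open
    hits = []
    with opener(log_path, "rt", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if ERROR_PATTERN.search(line):
                hits.append((number, line.rstrip()))
    return hits


def latest_log(run_dir, pattern="*.log.*"):
    """Return the most recently modified log file in run_dir, or None if there is none."""
    # The file pattern is an assumption; adjust it to the log names in your run directory
    candidates = sorted(Path(run_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None
```

Calling find_error_lines(latest_log(run_dir)) with run_dir set to your own run directory lists the suspicious lines together with their line numbers, which you can then inspect in your editor.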

If your short simulation has finished without crashing, check the outputs: were your changes taken into account? Do you get significant results?

Model timing data

A summary timing output file is produced after every CESM run. On Saga, and in our case, this file is placed in /cluster/work/users/$USER/archive/F2000climo-f19_g17.$EXPNAME/logs and is named cpl.log.$date.gz (where $date is a datestamp set by CESM at runtime).

This file contains information which is useful for load balancing a case (i.e., to optimize the processor layout for a given model configuration, compset, grid, etc. such that the cost and throughput will be optimal).

For this lesson we will concentrate on the last few lines in the file and in particular the number of simulated years per computational day, which will help us evaluate the wallclock time required for long runs.

On Saga:

vi cpl.log.190205-144355.gz

.......................
(seq_mct_drv): ===============       SUCCESSFUL TERMINATION OF CPL7-CCSM ===============
(seq_mct_drv): ===============        at YMD,TOD =    90201       0      ===============
(seq_mct_drv): ===============  # simulated days (this run) =    31.000  ===============
(seq_mct_drv): ===============  compute time (hrs)          =     0.347  ===============
(seq_mct_drv): ===============  # simulated years / cmp-day =     5.873  ===============
(seq_mct_drv): ===============  pes min memory highwater  (MB)   50.429  ===============
(seq_mct_drv): ===============  pes max memory highwater  (MB)  517.162  ===============
(seq_mct_drv): ===============  pes min memory last usage (MB)   -0.001  ===============
(seq_mct_drv): ===============  pes max memory last usage (MB)   -0.001  ===============

Here the throughput was 5.873 simulated years / cmp-day and it took 0.347 * 60 ~ 21 minutes to run the first month. Assuming that the other months will take approximately the same time, that represents about 3 months per hour and a bit more than 4 hours for 12 months.
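
The same arithmetic can be done in Python. The sketch below reads the three timing values from a copy of the log lines shown above and scales the measured compute time to 12 months. The function names are made up for this example, and the 365-day model year is an assumption.

```python
import re

# One pattern per value of interest in the timing summary of cpl.log
FIELDS = {
    "simulated_days": r"# simulated days \(this run\)\s*=\s*([\d.]+)",
    "compute_hours": r"compute time \(hrs\)\s*=\s*([\d.]+)",
    "years_per_day": r"# simulated years / cmp-day\s*=\s*([\d.]+)",
}


def parse_timing(log_text):
    """Extract the timing summary values from the text of a cpl.log file."""
    values = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, log_text)
        if match:
            values[name] = float(match.group(1))
    return values


def estimate_hours(compute_hours, simulated_days, target_days):
    """Scale the measured compute time linearly to a longer simulation."""
    return compute_hours * target_days / simulated_days


sample = """
(seq_mct_drv): ===============  # simulated days (this run) =    31.000  ===============
(seq_mct_drv): ===============  compute time (hrs)          =     0.347  ===============
(seq_mct_drv): ===============  # simulated years / cmp-day =     5.873  ===============
"""
timing = parse_timing(sample)
# Throughput recomputed from the two other values, assuming a 365-day model year
throughput = (timing["simulated_days"] / 365) / (timing["compute_hours"] / 24)
one_year = estimate_hours(timing["compute_hours"], timing["simulated_days"], 365)
print(f"{throughput:.2f} simulated years per compute day")
print(f"about {one_year:.1f} hours for 12 months")
```

The recomputed throughput agrees with the value reported in the log to within rounding, and the linear estimate only holds if the following months cost about as much as the first one.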

Long experiment (14 months)

As in the previous exercise, you will work in pairs for this practical, both to run the experiment and to analyze the model outputs.
You will be using your previous experiment $HOME/cases/F2000climo-f19_g17.$EXPNAME (EXPNAME should be set depending on your experiment!) and run it for 14 months.

Set a new duration for your experiment

Make sure you set the duration of your experiment properly. Here we wish to run 14 months from the control restart experiment, but as it is a long run we would rather continue to split it into chunks of 1 month.

Note that splitting an experiment into small chunks is good practice: this way if something happens and the experiment crashes (disk quota exceeded, hardware issue, etc.) everything will not be lost and it will be possible to resume the run from the latest set of restart files.

On Saga:

# Set EXPNAME properly

cd $HOME/cases/F2000climo-f19_g17.$EXPNAME

Since the first month is already done, we are going to continue the experiment instead of starting from scratch.

On Saga:

./xmlchange CONTINUE_RUN=TRUE

To perform a 14-month experiment, we need to repeat this one-month run 13 more times.

For this purpose there is a CESM option called RESUBMIT.

On Saga:

./xmlchange RESUBMIT=12

With this option set, CESM runs one month of simulation once the case is submitted and then automatically resubmits the job for each of the following months, decrementing RESUBMIT each time until it reaches 0.

On Saga:

cd $HOME/cases/F2000climo-f19_g17.$EXPNAME

./case.submit

Updating the wallclock time

Remember that you can update the job wallclock time:
./xmlchange --subgroup case.run JOB_WALLCLOCK_TIME=01:00:00
Make sure you set the job wallclock time before submitting your case (./case.submit)
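
To choose a sensible value, you can start from the compute time measured for the first month and add a safety margin. The helper below is a hypothetical sketch: the function name and the 1.5 margin are assumptions for illustration, not recommendations from CESM or from Saga.

```python
import math


def wallclock_string(compute_hours, margin=1.5):
    """Format compute_hours * margin as HH:MM:SS, rounded up to the next minute."""
    total_minutes = math.ceil(compute_hours * margin * 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}:00"


# 0.347 h is the compute time reported for the first one-month segment
print(wallclock_string(0.347))
```

The resulting string has the HH:MM:SS form expected by JOB_WALLCLOCK_TIME. A value that is too small makes the job stop before the month is finished, so it is safer to err on the generous side.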

Regularly check your experiment (and any generated output files) and once it is fully done, store your model outputs on NIRD.

Store model outputs on NIRD

First make sure that your run was successful and check that all the necessary output files were generated.

To post-process and visualize your model outputs, it is VERY IMPORTANT that you move them from Saga to NIRD. Remember that all model outputs are generated in a semi-temporary directory and all your files will be removed after a few weeks!

If you haven’t set up your SSH keys, the next commands (ssh and rsync) will require you to enter your Unix password.

Make sure you define EXPNAME properly (it depends on your experiment).

On Saga:

# If you are running the CO2 experiment (otherwise adjust: sea_ice, SST, rocky)
export EXPNAME=CO2

Then copy the archived files from Saga to the NIRD project area.

It is sometimes sensible to also copy the run files and even the case directory, but that should not be necessary for this lesson.

On Saga:

ssh login.nird.sigma2.no 'mkdir -p /projects/NS1000K/climate/GEO4962/outputs/$USER/archive'

rsync -avz /cluster/work/users/$USER/archive/F2000climo-f19_g17.$EXPNAME $USER@login.nird.sigma2.no:/projects/NS1000K/climate/GEO4962/outputs/$USER/archive/.

Once the previous commands have completed successfully, you are ready to post-process and visualize your data on http://climate.uiogeo-apps.sigma2.no/.

In addition, as your simulation is now stored on the NIRD project area, you can archive your experiment on the NIRD archive (long-term archive, i.e. several years).

Post processing and visualization

You can compare the results of your experiments to the control run at any time (i.e., this applies to both the short and long runs).

An easy way to do this is to calculate the difference between a field from your new experiment, for example the surface temperature, and the same field from the control run.

Visualization with xarray

On jupyterhub:

Start a new pangeo notebook on your JupyterHub (in this example we assume we have the first month of data from the 4xCO2 experiment).

import os
import xarray as xr
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
%matplotlib inline
# load_cmap (used below for the 'vik' colormap) is not provided by the imports above:
# make sure it is defined or imported in your notebook before running the plotting cells

experiment = 'F2000climo.f19_g17'
month = '0010-01'
username = 'your_username'  # replace with your username on NIRD

path = 'shared-ns1000k/GEO4962/outputs/runs/F2000climo.f19_g17.control/atm/hist/'
filename = path + 'F2000climo.f19_g17.control.cam.h0.' + month + '.nc'
dsc = xr.open_dataset(filename, decode_cf=False)
TSc = dsc['TS'][0] # the [0] is necessary because the two datasets have different time indices

path = 'shared-ns1000k/GEO4962/outputs/' + username + '/archive/F2000climo.f19_g17.CO2/atm/hist/'
filename = path + 'F2000climo.f19_g17.CO2.cam.h0.' + month + '.nc'
dsco2 = xr.open_dataset(filename, decode_cf=False)
TSco2 = dsco2['TS'][0]

diff = TSco2 - TSc

fig = plt.figure()
ax = plt.axes(projection=ccrs.Miller())

diff.plot(ax=ax, 
           transform=ccrs.PlateCarree(),
           cmap=load_cmap('vik') 
           )

ax.coastlines()
plt.title('Surface temperature [K]\n' + experiment + '-' + month + '\n4xCO2 - Control')

Making bespoke graphs with python

Let’s make a basic contour plot with Python.

With xarray, we can make a contour plot with a single command.

On jupyter:

TSco2.plot.contourf()

The resulting figure is not very useful: we do not know which projection was used, there are no coastlines, there is no proper title, etc.

To improve it, we need to add a bit more information.

On jupyter:

import matplotlib.pyplot as plt
import cartopy.crs as ccrs
ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=180))
TSco2.plot.contourf(ax=ax,
                   transform=ccrs.PlateCarree())
ax.set_title(experiment + '-' + month + '\n' + TSco2.long_name)
ax.coastlines()
ax.gridlines()

This is a slightly better plot …

Change the default projection

It is often convenient to visualize the data using a projection different from the one in which they are stored:

On jupyter:

TSmin = 220
TSmax = 320

fig = plt.figure(figsize=[8, 8])
ax = fig.add_subplot(1, 1, 1, 
                      projection=ccrs.Orthographic(central_longitude=20, central_latitude=40))  

TSco2.plot.contourf(ax=ax,
                    transform=ccrs.PlateCarree(), 
                    extend='max',
                    cmap=load_cmap('vik'), 
                    levels=15,
                    vmin=TSmin, vmax = TSmax)

ax.set_title(experiment + '-' + month + '\n')
ax.coastlines()
ax.gridlines()

Wrap around longitudes

On jupyter:

# what is longitude min and max?
print(TSco2.lon.min(), TSco2.lon.max())

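The longitudes run from 0 to 357.5, which leaves a gap in the plot between the last longitude and 360. To fill it, we can wrap around the longitudes, i.e. add a new longitude band at 360 that is a copy of the band at 0.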
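
To see what wrapping around means, here is a dependency-free sketch on a toy field. The helper add_cyclic_column is a hypothetical name written for this illustration. It mimics the idea only and is not cartopy's implementation.

```python
def add_cyclic_column(rows, lons):
    """Append a copy of the first longitude column at lons[0] + 360."""
    wrapped_rows = [list(row) + [row[0]] for row in rows]
    wrapped_lons = list(lons) + [lons[0] + 360.0]
    return wrapped_rows, wrapped_lons


# Toy field: 2 latitudes x 4 longitudes (the model grid is much finer)
lons = [0.0, 90.0, 180.0, 270.0]
field = [[280.0, 285.0, 290.0, 284.0],
         [250.0, 255.0, 260.0, 254.0]]

field_cy, lons_cy = add_cyclic_column(field, lons)
print(lons_cy)      # the longitudes now end at 360
print(field_cy[0])  # the last value of each row repeats the first one
```

With the model outputs, cartopy's add_cyclic_point does the same job on the real arrays: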

from cartopy.util import add_cyclic_point

TSmin = 220
TSmax = 320

# max longitude is 357.5 so we add another longitude 360. (= 0.)  
TS_cyclic_co2, lon_cyclic = add_cyclic_point(TSco2.values, coord=TSco2.lon)
# Create a new xarray with the new arrays
TSco2_cy = xr.DataArray(TS_cyclic_co2, coords={'lat':TSco2.lat, 'lon':lon_cyclic}, dims=('lat','lon'), 
                       attrs = TSco2.attrs )

fig = plt.figure(figsize=[8, 8])
ax = fig.add_subplot(1, 1, 1, 
                     projection=ccrs.Orthographic(central_longitude=20, central_latitude=40))

TSco2_cy.plot.contourf(ax=ax,
                      transform=ccrs.PlateCarree(), 
                      extend='max',
                      cmap=load_cmap('vik'), 
                      levels=15,
                      vmin=TSmin, vmax = TSmax)

ax.set_title(experiment + '-' + month + '\n')
ax.coastlines()
ax.gridlines()

You can now use the command savefig to save the current figure into a file.

On jupyter:

fig.savefig(experiment + '-' + month + '.png')

contourf versus pcolormesh

So far, we used contourf to visualize our data but we can also use pcolormesh.

Replace contourf with pcolormesh

Replace contourf with pcolormesh in the previous plot.

What do you observe?

Solution

import os
import xarray as xr
import numpy as np
import cartopy.crs as ccrs
from cartopy.util import add_cyclic_point
import matplotlib.pyplot as plt

%matplotlib inline


experiment = 'F2000climo.f19_g17'
month = '0010-01'
username = 'herfugl' # your NIRD username here

path = 'shared-ns1000k/GEO4962/outputs/' + username + '/archive/F2000climo.f19_g17.CO2/atm/hist/'
filename = path + 'F2000climo.f19_g17.CO2.cam.h0.' + month + '.nc'

dsco2 = xr.open_dataset(filename, decode_cf=False)
TSco2 = dsco2['TS'][0]

TSmin = 220
TSmax = 320

# max longitude is 357.5 so we add another longitude 360. (= 0.)  
TS_cyclic_co2, lon_cyclic = add_cyclic_point(TSco2.values, coord=TSco2.lon)
# Create a new xarray with the new arrays
TSco2_cy = xr.DataArray(TS_cyclic_co2, coords={'lat':TSco2.lat, 'lon':lon_cyclic}, dims=('lat','lon'), 
                       attrs = TSco2.attrs )

fig = plt.figure(figsize=[8, 8])
ax = fig.add_subplot(1, 1, 1, 
                     projection=ccrs.Orthographic(central_longitude=20, central_latitude=40))

TSco2_cy.plot.pcolormesh(ax=ax,
                      transform=ccrs.PlateCarree(), 
                      extend='max',
                      cmap=load_cmap('vik'), 
                      vmin=TSmin, vmax = TSmax)

ax.set_title(experiment + '-' + month + '\n')
ax.coastlines()
ax.gridlines()

Create multiple plots in the same figure

See here.

On jupyter:

import os
import xarray as xr
import numpy as np
import cartopy.crs as ccrs
from cartopy.util import add_cyclic_point
import matplotlib.pyplot as plt

%matplotlib inline

experiment = 'F2000climo.f19_g17'
username = 'your_username'  # replace with your NIRD username

path = 'shared-ns1000k/GEO4962/outputs/' + username + '/archive/F2000climo.f19_g17.CO2/atm/hist/'


fig = plt.figure(figsize=[20, 8])
TSmin = 220
TSmax = 320

for month in range(1,4):
    filename = path + 'F2000climo.f19_g17.CO2.cam.h0.0010-0' + str(month) + '.nc'
    dset = xr.open_dataset(filename, decode_cf=False)
    TSco2 = dset['TS'][0]
    lat = dset['lat']
    lon = dset['lon']
    dset.close()

    TS_cyclic_si, lon_cyclic = add_cyclic_point(TSco2.values, coord=TSco2.lon)
    TSco2_cy = xr.DataArray(TS_cyclic_si, coords={'lat':TSco2.lat, 'lon':lon_cyclic}, dims=('lat','lon'), 
                            attrs = TSco2.attrs )

    ax = fig.add_subplot(1, 3, month, projection=ccrs.Mollweide())  # specify (nrows, ncols, axnum)

    cs = TSco2_cy.plot.contourf(ax=ax,
                                transform=ccrs.PlateCarree(), 
                                extend='max',
                                cmap=load_cmap('vik'), 
                                vmin=TSmin, vmax = TSmax, 
                                levels=15,
                                add_colorbar=False)

    ax.set_title( 'month ' + str(month) + '\n')
    ax.coastlines()
    ax.gridlines()
    

fig.suptitle(experiment + '-0010'+'\n' + TSco2.long_name, fontsize=24)
    
# shrink the subplot area so that its right edge sits at 80% of the figure width
fig.subplots_adjust(right=0.8)
# Axes for the shared horizontal colorbar: [left, bottom, width, height] in figure coordinates
cbar_ax = fig.add_axes([0.12, 0.28, 0.72, 0.03])

# Add a unique colorbar to the figure
fig.colorbar(cs, cax=cbar_ax, label=TSco2.units, 
             orientation='horizontal')
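
The loop above builds each file name with '0010-0' + str(month), which only works for months 1 to 9. For the full 14-month run, the names need zero-padded months and a year that rolls over. The following sketch shows one way to generate them. The helper name is hypothetical, and it assumes the run starts in 0010-01, as in the examples of this lesson.

```python
def history_filenames(case, first_year, first_month, n_months, stream="cam.h0"):
    """Build monthly history file names such as <case>.cam.h0.0010-01.nc."""
    names = []
    for offset in range(n_months):
        # Count months from zero so that divmod handles the year rollover
        year_offset, month_index = divmod(first_month - 1 + offset, 12)
        year = first_year + year_offset
        names.append(f"{case}.{stream}.{year:04d}-{month_index + 1:02d}.nc")
    return names


files = history_filenames("F2000climo.f19_g17.CO2", first_year=10, first_month=1, n_months=14)
print(files[0])
print(files[-1])
```

Each name can then be appended to path and opened with xr.open_dataset, exactly as in the loop above.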

Key Points

Start an experiment from a spinup

Continue an existing experiment

Evaluate the CPU time required for a long run

Make custom plots

