This lesson is still being designed and assembled (Pre-Alpha version)

Analyze and visualize test model outputs with python using Jupyter notebooks

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • How to start/stop Jupyter notebooks in the Jupyterhub?

  • How to analyze and visualize CESM outputs with python?

  • What is the vertical coordinate in CESM?

Objectives
  • Learn about the Jupyter ecosystem

  • Learn to analyze CESM CAM outputs with the test run

  • Learn about python packages for Earth System Modelling

Post-processing and Visualization

Introduction to Jupyterhub and JupyterLab

You will be using jupyterhub to post-process and visualize your data. All attendees have received a username and corresponding password; please let us know otherwise.

Login to the JupyterHub

Tips

Before getting to the username/password login page, your browser may warn you about the security of the website and get a message such as “This site is not secure”. In order to bypass this security message, you need to accept to proceed. Let us know if you still have problems to login.

Start and stop your server

When you manage to successfully login to the Jupyterhub, you should have the following panels:

To start/stop your server, clock on Control Panel:

When your server is not running, the button “Stop My Server” does not appear and you only see the button “My Server”.

Important note

Make sure you stop your server once you have finished your session so it can release resources for next time.

JupyterLab

By default, you will get the web-based user interface for Project Jupyter that is called JupyterLab.

Remark

Do not worry if you do not have pangeo notebooks. We will see later how to change kernels within JupyterLab.

The menu bar at the top of JupyterLab has top-level menus that expose actions available in JupyterLab with their keyboard shortcuts. The default menus are:

The left sidebar contains a number of commonly-used tabs, such as a file browser, a list of running kernels and terminals, the command palette, and a list of tabs in the main work area:

If you move your mouse on the other icon of this left sidebar, a short information is given on its functionality.

If you click on the “running man” icon, you can see what is currently running on your server and you can click on “SHUTDOWN” to stop a running Python notebook or Terminal.

Start a new Terminal

Similarly, you can start a new Terminal by clicking on “Terminal” in the Launcher.

Tips

If the Launcher tab does not exist anymore in your JupyterLab, you can start a new one in “File –> New Launcher”.

In your terminal (from jupyterhub):

source activate pangeo
python -m ipykernel install --user --name=pangeo

Create a new python 3 notebook

Go back to the File Browser left sidebar tab and in the launcher select Python 3 under the Notebook section:

By default, your new notebook is named as “Untitled.ipynb”:

Changing kernel

All the python packages we need to analyzing and visualizing climate data are not available in the default python 3 kernel.

We would need to switch to pangeo kernel as shown on the figure below.

Getting help in a jupyter notebook

Python (general and available outside jupyter notebook)

In addition to online help, you can use help function:

import matplotlib
help(matplotlib)

The Jupyter Notebook has two ways to get help:

Copy your output files from Saga to JupyterHub

Start a new Terminal on your JupyterHub (this will be referred to hereafter as your “JupyterHub terminal”) and type the following commands.

On the JupyterHub terminal:
# Create a nwe folder for storing mod outputs
mkdir -p $HOME/outputs/runs
# Change directory
cd $HOME/outputs/runs
# Copy model outputs (remote synchronization)
rsync -avzu --progress YOUR_USER_NAME@saga.sigma2.no:/cluster/work/users/YOUR_USER_NAME/archive/F2000climo-f19_g17 .
# Go back to last directory
cd -

Map visualization with xarray

Start a new python3 notebook on your JupyterHub and type the following commands.

On jupyter:
# Python package that makes working with labelled multi-dimensional arrays simple and efficient
import xarray as xr

This set of commands initialize the python 3 notebook with python package (xarray) that we will use for plotting our netCDF model outputs.

Now we can create a map. We plot TS (Surface temperature) by specifying the filename, opening the dataset, and using the xarray.DataArray.plot() function:

%matplotlib inline

# specify the path where your test simulation is stored
path = 'shared-ns1000k/GEO4962/outputs/runs/F2000climo.f19_g17.control/atm/hist/'
filename = path + 'F2000climo.f19_g17.control.cam.h0.0009-01.nc'
print(filename)

# load netcdf file into an xarray dataset
ds = xr.open_dataset(filename, decode_times=False)

# and plot surface temperature
ds.TS.plot()

Remark:

To make a plot from a jupyter notebook, you may need to add:

%matplotlib inline

This line is only valid in a jupyter notebook and cannot be used otherwise.

Customize your maps

Set figure size

import matplotlib as mpl
mpl.rcParams['figure.figsize'] = [10., 8.]
ds.TS.plot()

Use a scientific color map

Using unscientific color maps like the rainbow (a.k.a. jet) color map distorts and hides the underlying data, while often making the figure unreadable to color-blind readers or when printed in black and white. If you want to use a scientific color map (created by Fabio Crameri here at UiO), you can load the function load_cmap.py using the following statment in your Jupyter Notebook:

%run shared-ns1000k/GEO4962/scripts/load_cmap.ipynb

More info about scientific color maps, as well as a list of included color maps here

The function can now be used as the color map argument when you plot:

ds.TS.plot(cmap=load_cmap('vik'))

Plot 4D-fields such as Temperature

In the same way add another cell below the plot and display the variable T instead of the surface temperature (TS). To select which vertical level to plot use the isel() function.

ds.T.isel(lev=20).plot(cmap=load_cmap('vik'))

Contrary to TS which depends only on two horizontal dimensions (namely latitude and longitude) plus time, for T there is an additional vertical dimension (lev).

What did we plot?

  • What is the difference between T and TS?
  • Which time did you plot?
  • Which level did you plot?
  • How to display the lowest model level?

Now, add another cell below the plot and try to display the zonal wind (U) instead of the surface temperature (TS). As for T, U has an additional dimension (along the vertical), hence we also have to specify a vertical level (between 0 and 29) to make our plot.

On jupyter:
ds.U.isel(lev=-1).plot(cmap=load_cmap('broc'))

Change map projection

We can use the Python package Cartopy to produce maps and do other geospatial data analyses. We will also use pyplot, a collection of functions that make plotting simpler.

import cartopy.crs as ccrs
import matplotlib.pyplot as plt

fig = plt.figure()
ax = plt.axes(projection=ccrs.Miller())

ds.TS.plot(ax=ax, 
           transform=ccrs.PlateCarree(),
           cmap=load_cmap('vik') 
          )

ax.coastlines()

Adding lat/lon labels when using projections

As you can see on the figure above, axes labels disappear when you change the map projection. You can add gridlines:

ax.gridlines()

On recent version of cartopy (>= 0.18.0b1), support for lat/lon labelling has been added for all projections.

On older versions of cartopy, PlateCarree and Mercator projections support lat/lon labelling:

import cartopy.crs as ccrs
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(15,10))
ax = plt.axes(projection=ccrs.Mercator())

ds.TS.plot(ax=ax, 
           transform=ccrs.PlateCarree(),
	   cmap=load_cmap('vik') 
	   )

ax.coastlines()
ax.gridlines(color='lightgrey', linestyle='-', draw_labels=True)

# to make sure the title does not oveelap longitude labelling
plt.title("Surface temperature", y=1.08)

The list of available cartopy projections is available here.

Plotting your model outputs

  • Use data from your own experiment F2000climo-f19_g17 to generate maps for various variables such as T, U and any other variables that may be of interest for your analysis. Tips: To read your model outputs, use:
path = 'outputs/runs/F2000climo-f19_g17/atm/hist/' 
filename = path + 'F2000climo-f19_g17.cam.h0.0014-01.nc' 

Georeferenced Latitude-Vertical plot

2D plot for one longitude point

We use xarray’s sel to select the data along longitude 0.


ds.T.sel(lon=0).plot(cmap=load_cmap('vik'))

2D plot over averaged longitudes

Now instead of selecting one longitude, we average over all the longitudes, using the mean function:

ds.T.mean(dim='lon').plot(cmap=load_cmap('vik'))

CESM vertical coordinate system

The vertical coordinate is a hybrid sigma-pressure system.

The hybrid coordinate was developed by Simmons and Strüfing, 1981 in order to provide a general framework for a vertical coordinate which is terrain following at the Earth’s surface, but reduces to a pressure coordinate at some point above the surface.

In this system, the upper regions of the atmosphere are discretized by pressure only. Lower vertical levels use the sigma (i.e. p/ps) vertical coordinate smoothly merged in, with the lowest levels being pure sigma. A schematic representation of the hybrid vertical coordinate and vertical indexing is presented below.

The CESM system is defined such that closer to the earth’s surface, the levels more closely resemble a pure sigma level, while the higher up you go, the more the levels are like pressure levels.

The formula to use to determine the pressure at the edge of the layer K is :

where Ps is the surface pressure and A and B are coefficients defined at each model level and stored in the netCDF model outputs.

Let’s have a look at the values of A and B:

print(ds.hyam)
<xarray.DataArray 'hyam' (lev: 32)>
array([0.003643, 0.007595, 0.014357, 0.024612, 0.035923, 0.043194, 0.051677,
       0.06152 , 0.073751, 0.087821, 0.103317, 0.121547, 0.142994, 0.168225,
       0.178231, 0.170324, 0.161023, 0.15008 , 0.137207, 0.122062, 0.104245,
       0.084979, 0.066502, 0.050197, 0.037189, 0.028432, 0.022209, 0.016407,
       0.011075, 0.006255, 0.001989, 0.      ])
Coordinates:
  * lev      (lev) float64 3.643 7.595 14.36 24.61 ... 936.2 957.5 976.3 992.6
Attributes:
    long_name:  hybrid A coefficient at layer midpoints


print(ds.hybm)
<xarray.DataArray 'hybm' (lev: 32)>
array([0.      , 0.      , 0.      , 0.      , 0.      , 0.      , 0.      ,
       0.      , 0.      , 0.      , 0.      , 0.      , 0.      , 0.      ,
       0.019677, 0.062504, 0.112888, 0.172162, 0.241894, 0.323931, 0.420442,
       0.5248  , 0.624888, 0.713208, 0.78367 , 0.831103, 0.864811, 0.896237,
       0.925124, 0.951231, 0.974336, 0.992556])
Coordinates:
  * lev      (lev) float64 3.643 7.595 14.36 24.61 ... 936.2 957.5 976.3 992.6
Attributes:
    long_name:  hybrid B coefficient at layer midpoints

If we print these coefficients A and B and the level, we can clearly understand why at the top of the atmosphere, we have pure pressure levels (B=0) and pure sigma levels at the bottom (A=0):

print("lev     A          B")
for lev, a, b in zip (range(1,ds.dims['lev']+1), ds.hyam.values, ds.hybm.values):
    print(lev, 1000*a, 1000*b)
lev     A          B
1 3.64346569404006 0.0
2 7.594819646328688 0.0
3 14.356632251292467 0.0
4 24.612220004200935 0.0
5 35.92325001955032 0.0
6 43.1937500834465 0.0
7 51.67749896645546 0.0
8 61.52049824595451 0.0
9 73.75095784664154 0.0
10 87.82123029232025 0.0
11 103.31712663173676 0.0
12 121.54724076390266 0.0
13 142.99403876066208 0.0
14 168.22507977485657 0.0
15 178.2306730747223 19.677413627505302
16 170.32432556152344 62.50429339706898
17 161.02290898561478 112.8879077732563
18 150.08028596639633 172.16161638498306
19 137.20685988664627 241.89404398202896
20 122.06193804740906 323.93063604831696
21 104.24471274018288 420.44246196746826
22 84.97915416955948 524.7995406389236
23 66.50169566273689 624.8877346515656
24 50.19678920507431 713.2076919078827
25 37.188658490777016 783.6697101593018
26 28.431948274374008 831.1028182506561
27 22.20897749066353 864.8112714290619
28 16.40738220885396 896.2371647357941
29 11.074557900428772 925.1238405704498
30 6.2549535650759935 951.230525970459
31 1.9894090946763754 974.3359982967377
32 0.0 992.556095123291

We multiplied A and B by 1000 so that values are in hPa.

So if we compute the pressure at the lowest model level in Oslo:

# Find the surface pressure of the grid cell near Oslo and convert from Pascal to hPa
PS_oslo = ds['PS'].isel(time=0).sel(lat=60., lon=10.75, method='nearest')/100. 

# Convert the lowest model level from sigma to pressure
p_bottom_oslo = ds.hyam[-1] + ds.hybm[-1]*PS_oslo
print("Oslo (hPa)")
print("PS = ", PS_oslo.values, " Lowest model level = ", p_bottom_oslo.values)

Oslo (hPa)
PS =  948.606640625  Lowest model level =  941.545303026773

So the pressure at the lowest model level is not that far from the surface pressure in Oslo.

And what about the Mount Everest?

Compute the pressure value at the lowest model level (close to the surface) for lat=28 and lon=86.5 (Everest top)

Solution

# First find the value of the surface pressure at Everest and convert to hPa
PS_everest = ds['PS'].isel(time=0).sel(lat=28., lon=86.5, method='nearest')/100. 
# Then proceed as for Oslo
print(PS_everest)
<xarray.DataArray 'PS' ()>
array(756.7215625)
Coordinates:
    lat      float64 27.47
    lon      float64 87.5
    time     float64 2.951e+03

And now we compute the pressure at the lowest model level:

p_bottom_everest = ds.hyam[-1] + ds.hybm[-1]*PS_everest
print(p_bottom_everest)
<xarray.DataArray ()>
array(751.08859917)
Coordinates:
    lev      float64 992.6
    lat      float64 27.47
    lon      float64 87.5
    time     float64 2.951e+03

This clearly shows that the lowest level pressure value we have at the top of Mount Everest is 751 hPa, quite different from the corresponding sigma level (992.6).

More information can be found in Description of the NCAR Community Atmosphere Model (CAM 3.0).

Having pressure values (hPa) as the vertical coordinate, it is clear that we need to revert the vertical axis to get the lower values at the top and the highest values at the bottom:

ds.T.mean(dim='lon').plot(cmap=load_cmap('vik'))
plt.ylim(plt.ylim()[::-1])

The vertical axis is labelled as “hybrid level at midpoints”. Again, not pressure levels but still we usually use the log to plot it as it is more intuitive to analyze. For this, go to the tab “Grid” and change the units of the vertical axis from “scalar” to “Log10”.

ds.T.mean(dim='lon').plot(cmap=load_cmap('vik'))
plt.ylim(plt.ylim()[::-1])
plt.yscale('log')

We can also adjust the top of the figure:

ds.T.mean(dim='lon').plot(cmap=load_cmap('vik'))
plt.ylim(plt.ylim()[::-1])
plt.yscale('log')
plt.ylim(top=10)

Georeferenced Latitude-Vertical plot with your model outputs

  • Use data from your own experiment F2000climo-f19_g17 to generate georeferenced Latitude-Vertical plot for U and T.

Tips: To read your model outputs, use:

path = 'outputs/runs/F2000climo-f19_g17/atm/hist/' 
filename = path + 'F2000climo-f19_g17.cam.h0.0014-01.nc' 

Key Points

  • jupyterlab

  • hybrid sigma levels

  • xarray