Manipulation of CMIP6 model data using Pangeo catalog
Contents
Manipulation of CMIP6 model data using Pangeo catalog¶
Import python packages¶
import xarray as xr
xr.set_options(display_style='html')
import intake
import cftime
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
%matplotlib inline
Open CMIP6 online catalog¶
cat_url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(cat_url)
col
pangeo-cmip6 catalog with 7693 dataset(s) from 518834 asset(s):
unique | |
---|---|
activity_id | 18 |
institution_id | 36 |
source_id | 88 |
experiment_id | 170 |
member_id | 657 |
table_id | 37 |
variable_id | 709 |
grid_label | 10 |
zstore | 518834 |
dcpp_init_year | 60 |
version | 718 |
Search corresponding data¶
cat = col.search(source_id=['CESM2'], experiment_id=['historical'], table_id=['Amon'], variable_id=['tas'], member_id=['r1i1p1f1'])
cat.df
activity_id | institution_id | source_id | experiment_id | member_id | table_id | variable_id | grid_label | zstore | dcpp_init_year | version | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | CMIP | NCAR | CESM2 | historical | r1i1p1f1 | Amon | tas | gn | gs://cmip6/CMIP6/CMIP/NCAR/CESM2/historical/r1... | NaN | 20190308 |
Create dictionary from the list of datasets we found¶
This step may take several minutes so be patient!
dset_dict = cat.to_dataset_dict(zarr_kwargs={'use_cftime':True})
--> The keys in the returned dictionary of datasets are constructed as follows:
'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
100.00% [1/1 00:00<00:00]
list(dset_dict.keys())
['CMIP.NCAR.CESM2.historical.Amon.gn']
Open dataset¶
Use
xarray
python package to analyze netCDF datasetopen_dataset
allows to get all the metadata without loading data into memory.with
xarray
, we only load into memory what is needed.
dset = dset_dict['CMIP.NCAR.CESM2.historical.Amon.gn']
dset
<xarray.Dataset> Dimensions: (lat: 192, lon: 288, member_id: 1, nbnd: 2, time: 1980) Coordinates: * lat (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0 lat_bnds (lat, nbnd) float32 dask.array<chunksize=(192, 2), meta=np.ndarray> * lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8 lon_bnds (lon, nbnd) float32 dask.array<chunksize=(288, 2), meta=np.ndarray> * time (time) object 1850-01-15 12:00:00 ... 2014-12-15 12:00:00 time_bnds (time, nbnd) object dask.array<chunksize=(1980, 2), meta=np.ndarray> * member_id (member_id) <U8 'r1i1p1f1' Dimensions without coordinates: nbnd Data variables: tas (member_id, time, lat, lon) float32 dask.array<chunksize=(1, 600, 192, 288), meta=np.ndarray> Attributes: (12/48) Conventions: CF-1.7 CMIP-6.2 activity_id: CMIP branch_method: standard branch_time_in_child: 674885.0 branch_time_in_parent: 219000.0 case_id: 15 ... ... variable_id: tas variant_info: CMIP6 20th century experiments (1850-2014) with ... variant_label: r1i1p1f1 status: 2019-10-25;created;by nhn2@columbia.edu intake_esm_varname: ['tas'] intake_esm_dataset_key: CMIP.NCAR.CESM2.historical.Amon.gn
xarray.Dataset
- lat: 192
- lon: 288
- member_id: 1
- nbnd: 2
- time: 1980
- lat(lat)float64-90.0 -89.06 -88.12 ... 89.06 90.0
- axis :
- Y
- bounds :
- lat_bnds
- standard_name :
- latitude
- title :
- Latitude
- type :
- double
- units :
- degrees_north
- valid_max :
- 90.0
- valid_min :
- -90.0
array([-90. , -89.057592, -88.115183, -87.172775, -86.230366, -85.287958, -84.34555 , -83.403141, -82.460733, -81.518325, -80.575916, -79.633508, -78.691099, -77.748691, -76.806283, -75.863874, -74.921466, -73.979058, -73.036649, -72.094241, -71.151832, -70.209424, -69.267016, -68.324607, -67.382199, -66.439791, -65.497382, -64.554974, -63.612565, -62.670157, -61.727749, -60.78534 , -59.842932, -58.900524, -57.958115, -57.015707, -56.073298, -55.13089 , -54.188482, -53.246073, -52.303665, -51.361257, -50.418848, -49.47644 , -48.534031, -47.591623, -46.649215, -45.706806, -44.764398, -43.82199 , -42.879581, -41.937173, -40.994764, -40.052356, -39.109948, -38.167539, -37.225131, -36.282723, -35.340314, -34.397906, -33.455497, -32.513089, -31.570681, -30.628272, -29.685864, -28.743455, -27.801047, -26.858639, -25.91623 , -24.973822, -24.031414, -23.089005, -22.146597, -21.204188, -20.26178 , -19.319372, -18.376963, -17.434555, -16.492147, -15.549738, -14.60733 , -13.664921, -12.722513, -11.780105, -10.837696, -9.895288, -8.95288 , -8.010471, -7.068063, -6.125654, -5.183246, -4.240838, -3.298429, -2.356021, -1.413613, -0.471204, 0.471204, 1.413613, 2.356021, 3.298429, 4.240838, 5.183246, 6.125654, 7.068063, 8.010471, 8.95288 , 9.895288, 10.837696, 11.780105, 12.722513, 13.664921, 14.60733 , 15.549738, 16.492147, 17.434555, 18.376963, 19.319372, 20.26178 , 21.204188, 22.146597, 23.089005, 24.031414, 24.973822, 25.91623 , 26.858639, 27.801047, 28.743455, 29.685864, 30.628272, 31.570681, 32.513089, 33.455497, 34.397906, 35.340314, 36.282723, 37.225131, 38.167539, 39.109948, 40.052356, 40.994764, 41.937173, 42.879581, 43.82199 , 44.764398, 45.706806, 46.649215, 47.591623, 48.534031, 49.47644 , 50.418848, 51.361257, 52.303665, 53.246073, 54.188482, 55.13089 , 56.073298, 57.015707, 57.958115, 58.900524, 59.842932, 60.78534 , 61.727749, 62.670157, 63.612565, 64.554974, 65.497382, 66.439791, 67.382199, 68.324607, 69.267016, 70.209424, 71.151832, 72.094241, 73.036649, 73.979058, 74.921466, 75.863874, 76.806283, 77.748691, 78.691099, 79.633508, 80.575916, 81.518325, 82.460733, 83.403141, 84.34555 , 85.287958, 86.230366, 87.172775, 88.115183, 89.057592, 90. ])
- lat_bnds(lat, nbnd)float32dask.array<chunksize=(192, 2), meta=np.ndarray>
- units :
- degrees_north
Array Chunk Bytes 1.54 kB 1.54 kB Shape (192, 2) (192, 2) Count 2 Tasks 1 Chunks Type float32 numpy.ndarray - lon(lon)float640.0 1.25 2.5 ... 356.2 357.5 358.8
- axis :
- X
- bounds :
- lon_bnds
- standard_name :
- longitude
- title :
- Longitude
- type :
- double
- units :
- degrees_east
- valid_max :
- 360.0
- valid_min :
- 0.0
array([ 0. , 1.25, 2.5 , ..., 356.25, 357.5 , 358.75])
- lon_bnds(lon, nbnd)float32dask.array<chunksize=(288, 2), meta=np.ndarray>
- units :
- degrees_east
Array Chunk Bytes 2.30 kB 2.30 kB Shape (288, 2) (288, 2) Count 2 Tasks 1 Chunks Type float32 numpy.ndarray - time(time)object1850-01-15 12:00:00 ... 2014-12-...
- axis :
- T
- bounds :
- time_bnds
- standard_name :
- time
- title :
- time
- type :
- double
array([cftime.DatetimeNoLeap(1850, 1, 15, 12, 0, 0, 0), cftime.DatetimeNoLeap(1850, 2, 14, 0, 0, 0, 0), cftime.DatetimeNoLeap(1850, 3, 15, 12, 0, 0, 0), ..., cftime.DatetimeNoLeap(2014, 10, 15, 12, 0, 0, 0), cftime.DatetimeNoLeap(2014, 11, 15, 0, 0, 0, 0), cftime.DatetimeNoLeap(2014, 12, 15, 12, 0, 0, 0)], dtype=object)
- time_bnds(time, nbnd)objectdask.array<chunksize=(1980, 2), meta=np.ndarray>
Array Chunk Bytes 31.68 kB 31.68 kB Shape (1980, 2) (1980, 2) Count 2 Tasks 1 Chunks Type object numpy.ndarray - member_id(member_id)<U8'r1i1p1f1'
array(['r1i1p1f1'], dtype='<U8')
- tas(member_id, time, lat, lon)float32dask.array<chunksize=(1, 600, 192, 288), meta=np.ndarray>
- cell_measures :
- area: areacella
- cell_methods :
- area: time: mean
- comment :
- near-surface (usually, 2 meter) air temperature
- description :
- near-surface (usually, 2 meter) air temperature
- frequency :
- mon
- id :
- tas
- long_name :
- Near-Surface Air Temperature
- mipTable :
- Amon
- out_name :
- tas
- prov :
- Amon ((isd.003))
- realm :
- atmos
- standard_name :
- air_temperature
- time :
- time
- time_label :
- time-mean
- time_title :
- Temporal mean
- title :
- Near-Surface Air Temperature
- type :
- real
- units :
- K
- variable_id :
- tas
Array Chunk Bytes 437.94 MB 132.71 MB Shape (1, 1980, 192, 288) (1, 600, 192, 288) Count 9 Tasks 4 Chunks Type float32 numpy.ndarray
- Conventions :
- CF-1.7 CMIP-6.2
- activity_id :
- CMIP
- branch_method :
- standard
- branch_time_in_child :
- 674885.0
- branch_time_in_parent :
- 219000.0
- case_id :
- 15
- cesm_casename :
- b.e21.BHIST.f09_g17.CMIP6-historical.001
- contact :
- cesm_cmip6@ucar.edu
- creation_date :
- 2019-01-16T23:34:05Z
- data_specs_version :
- 01.00.29
- experiment :
- all-forcing simulation of the recent past
- experiment_id :
- historical
- external_variables :
- areacella
- forcing_index :
- 1
- frequency :
- mon
- further_info_url :
- https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2.historical.none.r1i1p1f1
- grid :
- native 0.9x1.25 finite volume grid (192x288 latxlon)
- grid_label :
- gn
- initialization_index :
- 1
- institution :
- National Center for Atmospheric Research, Climate and Global Dynamics Laboratory, 1850 Table Mesa Drive, Boulder, CO 80305, USA
- institution_id :
- NCAR
- license :
- CMIP6 model data produced by <The National Center for Atmospheric Research> is licensed under a Creative Commons Attribution-[]ShareAlike 4.0 International License (https://creativecommons.org/licenses/). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file)[]. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law.
- mip_era :
- CMIP6
- model_doi_url :
- https://doi.org/10.5065/D67H1H0V
- nominal_resolution :
- 100 km
- parent_activity_id :
- CMIP
- parent_experiment_id :
- piControl
- parent_mip_era :
- CMIP6
- parent_source_id :
- CESM2
- parent_time_units :
- days since 0001-01-01 00:00:00
- parent_variant_label :
- r1i1p1f1
- physics_index :
- 1
- product :
- model-output
- realization_index :
- 1
- realm :
- atmos
- source :
- CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite volume grid; 288 x 192 longitude/latitude; 32 levels; top level 2.25 mb); ocean: POP2 (320x384 longitude/latitude; 60 levels; top grid cell 0-10 m); sea_ice: CICE5.1 (same grid as ocean); land: CLM5 0.9x1.25 finite volume grid; 288 x 192 longitude/latitude; 32 levels; top level 2.25 mb); aerosol: MAM4 (0.9x1.25 finite volume grid; 288 x 192 longitude/latitude; 32 levels; top level 2.25 mb); atmoschem: MAM4 (0.9x1.25 finite volume grid; 288 x 192 longitude/latitude; 32 levels; top level 2.25 mb); landIce: CISM2.1; ocnBgchem: MARBL (320x384 longitude/latitude; 60 levels; top grid cell 0-10 m)
- source_id :
- CESM2
- source_type :
- AOGCM BGC
- sub_experiment :
- none
- sub_experiment_id :
- none
- table_id :
- Amon
- tracking_id :
- hdl:21.14100/d9a7225a-49c3-4470-b7ab-a8180926f839
- variable_id :
- tas
- variant_info :
- CMIP6 20th century experiments (1850-2014) with CAM6, interactive land (CLM5), coupled ocean (POP2) with biogeochemistry (MARBL), interactive sea ice (CICE5.1), and non-evolving land ice (CISM2.1)
- variant_label :
- r1i1p1f1
- status :
- 2019-10-25;created;by nhn2@columbia.edu
- intake_esm_varname :
- ['tas']
- intake_esm_dataset_key :
- CMIP.NCAR.CESM2.historical.Amon.gn
Get metadata corresponding to near-surface air temperature (tas)¶
print(dset['tas'])
<xarray.DataArray 'tas' (member_id: 1, time: 1980, lat: 192, lon: 288)>
dask.array<broadcast_to, shape=(1, 1980, 192, 288), dtype=float32, chunksize=(1, 600, 192, 288), chunktype=numpy.ndarray>
Coordinates:
* lat (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
* lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
* time (time) object 1850-01-15 12:00:00 ... 2014-12-15 12:00:00
* member_id (member_id) <U8 'r1i1p1f1'
Attributes: (12/19)
cell_measures: area: areacella
cell_methods: area: time: mean
comment: near-surface (usually, 2 meter) air temperature
description: near-surface (usually, 2 meter) air temperature
frequency: mon
id: tas
... ...
time_label: time-mean
time_title: Temporal mean
title: Near-Surface Air Temperature
type: real
units: K
variable_id: tas
dset.time.values
array([cftime.DatetimeNoLeap(1850, 1, 15, 12, 0, 0, 0),
cftime.DatetimeNoLeap(1850, 2, 14, 0, 0, 0, 0),
cftime.DatetimeNoLeap(1850, 3, 15, 12, 0, 0, 0), ...,
cftime.DatetimeNoLeap(2014, 10, 15, 12, 0, 0, 0),
cftime.DatetimeNoLeap(2014, 11, 15, 0, 0, 0, 0),
cftime.DatetimeNoLeap(2014, 12, 15, 12, 0, 0, 0)], dtype=object)
Select time¶
Select a specific time
dset['tas'].sel(time=cftime.DatetimeNoLeap(1850, 1, 15, 12, 0, 0, 0, 2, 15)).plot(cmap = 'coolwarm')
<matplotlib.collections.QuadMesh at 0x7f6dab2fcd30>
select the nearest time. Here from 1st April 1950
dset['tas'].sel(time=cftime.DatetimeNoLeap(1850, 4, 1), method='nearest').plot(cmap='coolwarm')
<matplotlib.collections.QuadMesh at 0x7f6dabc460a0>
Customize plot¶
Set the size of the figure and add coastlines¶
fig = plt.figure(1, figsize=[30,13])
# Set the projection to use for plotting
ax = plt.subplot(1, 1, 1, projection=ccrs.PlateCarree())
ax.coastlines()
# Pass ax as an argument when plotting. Here we assume data is in the same coordinate reference system than the projection chosen for plotting
# isel allows to select by indices instead of the time values
dset['tas'].isel(time=0).squeeze().plot.pcolormesh(ax=ax, cmap='coolwarm')
<matplotlib.collections.QuadMesh at 0x7f6dab555fd0>
Change plotting projection¶
fig = plt.figure(1, figsize=[10,10])
# We're using cartopy and are plotting in Orthographic projection
# (see documentation on cartopy)
ax = plt.subplot(1, 1, 1, projection=ccrs.Orthographic(0, 90))
ax.coastlines()
# We need to project our data to the new Orthographic projection and for this we use `transform`.
# we set the original data projection in transform (here PlateCarree)
dset['tas'].isel(time=0).squeeze().plot(ax=ax, transform=ccrs.PlateCarree(), cmap='coolwarm')
# One way to customize your title
plt.title(dset.time.values[0].strftime("%B %Y"), fontsize=18)
Text(0.5, 1.0, 'January 1850')
Choose the extent of values¶
Fix your minimum and maximum values in your plot and
Use extend so values below the minimum and max
fig = plt.figure(1, figsize=[10,10])
ax = plt.subplot(1, 1, 1, projection=ccrs.Orthographic(0, 90))
ax.coastlines()
# Fix extent
minval = 240
maxval = 300
# pass extent with vmin and vmax parameters
dset['tas'].isel(time=0).squeeze().plot(ax=ax, vmin=minval, vmax=maxval, transform=ccrs.PlateCarree(), cmap='coolwarm')
# One way to customize your title
plt.title(dset.time.values[0].strftime("%B %Y"), fontsize=18)
Text(0.5, 1.0, 'January 1850')
Multiplots¶
Faceting¶
proj_plot = ccrs.Orthographic(0, 90)
p = dset['tas'].sel(time = dset.time.dt.year.isin([1850, 2014])).squeeze().plot(x='lon', y='lat',
transform=ccrs.PlateCarree(),
aspect=dset.dims["lon"] / dset.dims["lat"], # for a sensible figsize
subplot_kws={"projection": proj_plot},
col='time', col_wrap=6, robust=True, cmap='PiYG')
# We have to set the map's options on all four axes
for ax,i in zip(p.axes.flat, dset.time.sel(time = dset.time.dt.year.isin([1850, 2014])).values):
ax.coastlines()
ax.set_title(i.strftime("%B %Y"), fontsize=18)
/opt/conda/lib/python3.8/site-packages/xarray/plot/facetgrid.py:390: UserWarning: Tight layout not applied. The left and right margins cannot be made large enough to accommodate all axes decorations.
self.fig.tight_layout()
Combine plots with different projections¶
fig = plt.figure(1, figsize=[20,10])
# Fix extent
minval = 240
maxval = 300
# Plot 1 for Northern Hemisphere subplot argument (nrows, ncols, nplot)
# here 1 row, 2 columns and 1st plot
ax1 = plt.subplot(1, 2, 1, projection=ccrs.Orthographic(0, 90))
# Plot 2 for Southern Hemisphere
# 2nd plot
ax2 = plt.subplot(1, 2, 2, projection=ccrs.Orthographic(180, -90))
tsel = 0
for ax,t in zip([ax1, ax2], ["Northern", "Southern"]):
map = dset['tas'].isel(time=tsel).squeeze().plot(ax=ax, vmin=minval, vmax=maxval,
transform=ccrs.PlateCarree(),
cmap='coolwarm',
add_colorbar=False)
ax.set_title(t + " Hemisphere \n" , fontsize=15)
ax.coastlines()
ax.gridlines()
# Title for both plots
fig.suptitle('Near Surface Temperature\n' + dset.time.values[tsel].strftime("%B %Y"), fontsize=20)
cb_ax = fig.add_axes([0.325, 0.05, 0.4, 0.04])
cbar = plt.colorbar(map, cax=cb_ax, extend='both', orientation='horizontal', fraction=0.046, pad=0.04)
cbar.ax.tick_params(labelsize=25)
cbar.ax.set_ylabel('K', fontsize=25)
Text(0, 0.5, 'K')