{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Load CMIP6 Data with Intake ESM\n",
"\n",
"[Intake ESM](https://intake-esm.readthedocs.io/en/latest/) is an experimental new package that aims to provide a higher-level interface for searching and loading Earth System Model data archives, such as CMIP6. The package is under very active development, and features may be unstable. Please report any issues or suggestions [on GitHub](https://github.com/NCAR/intake-esm/issues)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import xarray as xr\n",
"xr.set_options(display_style='html')\n",
"import intake\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Intake ESM works by parsing an [ESM Collection Spec](https://github.com/NCAR/esm-collection-spec/) and converting it to an [intake catalog](https://intake.readthedocs.io/en/latest). The collection spec is stored in a JSON file. Here we open it with `intake.open_esm_datastore`."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pangeo-cmip6-ESM Collection with 235624 entries:\n",
"\t> 15 activity_id(s)\n",
"\n",
"\t> 32 institution_id(s)\n",
"\n",
"\t> 69 source_id(s)\n",
"\n",
"\t> 101 experiment_id(s)\n",
"\n",
"\t> 135 member_id(s)\n",
"\n",
"\t> 29 table_id(s)\n",
"\n",
"\t> 313 variable_id(s)\n",
"\n",
"\t> 10 grid_label(s)\n",
"\n",
"\t> 235624 zstore(s)\n",
"\n",
"\t> 60 dcpp_init_year(s)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cat_url = \"https://storage.googleapis.com/cmip6/pangeo-cmip6.json\"\n",
"col = intake.open_esm_datastore(cat_url)\n",
"col"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now use intake methods to search the collection and, if desired, view the matching entries as a pandas DataFrame through the `.df` attribute."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" activity_id institution_id source_id experiment_id member_id \\\n",
"0 CMIP CCCma CanESM5-CanOE historical r1i1p2f1 \n",
"1 CMIP CCCma CanESM5-CanOE historical r2i1p2f1 \n",
"2 CMIP CCCma CanESM5-CanOE historical r3i1p2f1 \n",
"3 CMIP CCCma CanESM5 historical r10i1p1f1 \n",
"4 CMIP CCCma CanESM5 historical r10i1p2f1 \n",
".. ... ... ... ... ... \n",
"132 ScenarioMIP IPSL IPSL-CM6A-LR ssp585 r4i1p1f1 \n",
"133 ScenarioMIP IPSL IPSL-CM6A-LR ssp585 r6i1p1f1 \n",
"134 ScenarioMIP MIROC MIROC-ES2L ssp585 r1i1p1f2 \n",
"135 ScenarioMIP MPI-M MPI-ESM1-2-LR ssp585 r10i1p1f1 \n",
"136 ScenarioMIP MPI-M MPI-ESM1-2-LR ssp585 r1i1p1f1 \n",
"\n",
" table_id variable_id grid_label \\\n",
"0 Oyr o2 gn \n",
"1 Oyr o2 gn \n",
"2 Oyr o2 gn \n",
"3 Oyr o2 gn \n",
"4 Oyr o2 gn \n",
".. ... ... ... \n",
"132 Oyr o2 gn \n",
"133 Oyr o2 gn \n",
"134 Oyr o2 gn \n",
"135 Oyr o2 gn \n",
"136 Oyr o2 gn \n",
"\n",
" zstore dcpp_init_year \n",
"0 gs://cmip6/CMIP/CCCma/CanESM5-CanOE/historical... NaN \n",
"1 gs://cmip6/CMIP/CCCma/CanESM5-CanOE/historical... NaN \n",
"2 gs://cmip6/CMIP/CCCma/CanESM5-CanOE/historical... NaN \n",
"3 gs://cmip6/CMIP/CCCma/CanESM5/historical/r10i1... NaN \n",
"4 gs://cmip6/CMIP/CCCma/CanESM5/historical/r10i1... NaN \n",
".. ... ... \n",
"132 gs://cmip6/ScenarioMIP/IPSL/IPSL-CM6A-LR/ssp58... NaN \n",
"133 gs://cmip6/ScenarioMIP/IPSL/IPSL-CM6A-LR/ssp58... NaN \n",
"134 gs://cmip6/ScenarioMIP/MIROC/MIROC-ES2L/ssp585... NaN \n",
"135 gs://cmip6/ScenarioMIP/MPI-M/MPI-ESM1-2-LR/ssp... NaN \n",
"136 gs://cmip6/ScenarioMIP/MPI-M/MPI-ESM1-2-LR/ssp... NaN \n",
"\n",
"[137 rows x 10 columns]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cat = col.search(experiment_id=['historical', 'ssp585'], table_id='Oyr', variable_id='o2',\n",
" grid_label='gn')\n",
"cat.df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Intake knows how to open the datasets automatically using xarray. Furthermore, Intake ESM contains special logic to concatenate and merge the individual results of our query into larger, higher-level aggregated xarray datasets."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"--> The keys in the returned dictionary of datasets are constructed as follows:\n",
"\t'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'\n",
" \n",
"--> There is/are 17 group(s)\n",
"[########################################] | 100% Completed | 1min 27.1s\n"
]
},
{
"data": {
"text/plain": [
"['CMIP.CCCma.CanESM5.historical.Oyr.gn',\n",
" 'CMIP.CCCma.CanESM5-CanOE.historical.Oyr.gn',\n",
" 'CMIP.CSIRO.ACCESS-ESM1-5.historical.Oyr.gn',\n",
" 'CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.Oyr.gn',\n",
" 'CMIP.IPSL.IPSL-CM6A-LR.historical.Oyr.gn',\n",
" 'CMIP.MIROC.MIROC-ES2L.historical.Oyr.gn',\n",
" 'CMIP.MPI-M.MPI-ESM1-2-HR.historical.Oyr.gn',\n",
" 'CMIP.MPI-M.MPI-ESM1-2-LR.historical.Oyr.gn',\n",
" 'CMIP.NCC.NorESM2-LM.historical.Oyr.gn',\n",
" 'ScenarioMIP.CCCma.CanESM5.ssp585.Oyr.gn',\n",
" 'ScenarioMIP.CCCma.CanESM5-CanOE.ssp585.Oyr.gn',\n",
" 'ScenarioMIP.CSIRO.ACCESS-ESM1-5.ssp585.Oyr.gn',\n",
" 'ScenarioMIP.DKRZ.MPI-ESM1-2-HR.ssp585.Oyr.gn',\n",
" 'ScenarioMIP.DWD.MPI-ESM1-2-HR.ssp585.Oyr.gn',\n",
" 'ScenarioMIP.IPSL.IPSL-CM6A-LR.ssp585.Oyr.gn',\n",
" 'ScenarioMIP.MIROC.MIROC-ES2L.ssp585.Oyr.gn',\n",
" 'ScenarioMIP.MPI-M.MPI-ESM1-2-LR.ssp585.Oyr.gn']"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dset_dict = cat.to_dataset_dict(zarr_kwargs={'consolidated': True})\n",
"list(dset_dict.keys())"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
'Mole concentration' means number of moles per unit volume, also called 'molarity', and is used in the construction mole_concentration_of_X_in_Y, where X is a material constituent of Y. A chemical or biological species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'.
CMIP6 model data produced by The Government of Canada (Canadian Centre for Climate Modelling and Analysis, Environment and Climate Change Canada) is licensed under a Creative Commons Attribution ShareAlike 4.0 International License (https://creativecommons.org/licenses). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file) and at https:///pcmdi.llnl.gov/. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law.
intake_esm_varname :
o2
mip_era :
CMIP6
Conventions :
CF-1.7 CMIP-6.2
variable_id :
o2
version :
v20190429
source_type :
AOGCM
variant_label :
r9i1p2f1
table_id :
Oyr
external_variables :
areacello volcello
cmor_version :
3.4.0
references :
Geophysical Model Development Special issue on CanESM5 (https://www.geosci-model-dev.net/special_issues.html)
institution_id :
CCCma
forcing_index :
1
source :
CanESM5 (2019): \n",
"aerosol: interactive\n",
"atmos: CanAM5 (T63L49 native atmosphere, T63 Linear Gaussian Grid; 128 x 64 longitude/latitude; 49 levels; top level 1 hPa)\n",
"atmosChem: specified oxidants for aerosols\n",
"land: CLASS3.6/CTEM1.2\n",
"landIce: specified ice sheets\n",
"ocean: NEMO3.4.1 (ORCA1 tripolar grid, 1 deg with refinement to 1/3 deg within 20 degrees of the equator; 361 x 290 longitude/latitude; 45 vertical levels; top grid cell 0-6.19 m)\n",
"ocnBgchem: Canadian Model of Ocean Carbon (CMOC); NPZD ecosystem with OMIP prescribed carbonate chemistry\n",
"seaIce: LIM2
2019-05-02T13:53:53Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-30T08:58:40Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.\n",
"2019-05-02T13:55:44Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T13:55:48Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T13:57:46Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T13:57:48Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T13:59:47Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T14:03:56Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T14:03:56Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T14:05:56Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T14:05:59Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T13:43:39Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-09T03:45:13Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T14:08:04Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-01T20:20:11Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-01T20:23:33Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-01T20:26:53Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-01T20:30:19Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-01T20:33:43Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T13:47:48Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-03T22:15:10Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-01T20:03:35Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-09T03:45:41Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-01T20:06:50Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-09T03:46:25Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-02T13:47:47Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-09T03:52:34Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-01T20:11:36Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-09T03:53:05Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-01T20:13:32Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-30T08:58:34Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.\n",
"2019-05-02T13:49:43Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-30T08:58:24Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.\n",
"2019-05-01T20:16:55Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.;\n",
"Output from $runid\n",
"2019-05-30T08:58:45Z ;rewrote data to be consistent with CMIP for variable o2 found in table Oyr.
CCCma_runid :
p2-his09
grid_label :
gn
title :
CanESM5 output prepared for CMIP6
parent_time_units :
days since 1850-01-01 0:0:0.0
grid :
ORCA1 tripolar grid, 1 deg with refinement to 1/3 deg within 20 degrees of the equator; 361 x 290 longitude/latitude; 45 vertical levels; top grid cell 0-6.19 m