diff --git a/merge_climate_datasets_exercise.ipynb b/merge_climate_datasets_exercise.ipynb new file mode 100644 index 0000000..3b251be --- /dev/null +++ b/merge_climate_datasets_exercise.ipynb @@ -0,0 +1,3759 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "aa24b099", + "metadata": {}, + "source": [ + "# Merging Climate Datasets Exercise\n", + "\n", + "Work through this notebook to practice harmonizing and merging two climate datasets that differ in temporal cadence and spatial resolution.\n", + "\n", + "You will: \n", + "- Load two public NOAA datasets directly from the cloud\n", + "- Subset to the continental US (use 230°E–300°E in longitude since the data span 0–360°)\n", + "- Use `Dataset.resample` to aggregate in time and `Dataset.interp` to match grids\n", + "- Combine the variables with `xr.merge` for joint analysis\n", + "\n", + "Refer back to the answer key after attempting each step.\n" + ] + }, + { + "cell_type": "markdown", + "id": "d6f677f5", + "metadata": {}, + "source": [ + "## 1. 
Setup\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "0a656265", + "metadata": { + "tags": [ + "parameters" + ] + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "import xarray as xr\n", + "\n", + "try:\n", + "    import cartopy.crs as ccrs\n", + "    import cartopy.feature as cfeature\n", + "except ImportError:\n", + "    ccrs = None\n", + "    cfeature = None\n", + "\n", + "TEMP_URL = \"https://psl.noaa.gov/thredds/dodsC/Datasets/ncep.reanalysis/surface/air.sig995.2020.nc\"\n", + "PRECIP_URL = \"https://psl.noaa.gov/thredds/dodsC/Datasets/cpc_global_precip/precip.2020.nc\"\n", + "\n", + "LAT_RANGE = (50, 20) # degrees North, listed north to south to match the datasets' descending latitude axis\n", + "LON_RANGE_360 = (230, 300) # degrees East (equivalent to -130° to -60°)\n", + "LON_RANGE_180 = (-130, -60) # convenience if a dataset uses -180° to 180°\n", + "\n", + "TIME_RANGE = slice(\"2020-06-01\", \"2020-06-30\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "45f8536b", + "metadata": {}, + "source": [ + "## 2. Load the datasets\n", + "\n", + "Open both remote datasets with `xr.open_dataset`, passing a `chunks` argument that sets a reasonable chunk size along the `time` dimension. Assign the resulting objects to `air` and `precip`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "3270985f", + "metadata": { + "tags": [ + "exercise" + ] + }, + "outputs": [ + { + "data": { + "text/html": [ + "
<xarray.Dataset> Size: 62MB\n", + "Dimensions: (time: 1464, lat: 73, lon: 144)\n", + "Coordinates:\n", + " * lat (lat) float32 292B 90.0 87.5 85.0 82.5 ... -82.5 -85.0 -87.5 -90.0\n", + " * lon (lon) float32 576B 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5\n", + " * time (time) datetime64[ns] 12kB 2020-01-01 ... 2020-12-31T18:00:00\n", + "Data variables:\n", + " air (time, lat, lon) float32 62MB dask.array<chunksize=(8, 73, 144), meta=np.ndarray>\n", + "Attributes:\n", + " Conventions: COARDS\n", + " title: 4x daily NMC reanalysis (2014)\n", + " history: created 2017/12 by Hoop (netCDF2.3)\n", + " description: Data is from NMC initialized reanalysis\\...\n", + " platform: Model\n", + " dataset_title: NCEP-NCAR Reanalysis 1\n", + " _NCProperties: version=2,netcdf=4.6.3,hdf5=1.10.5\n", + " References: http://www.psl.noaa.gov/data/gridded/dat...\n", + " DODS_EXTRA.Unlimited_Dimension: time
<xarray.DataArray 'precip' (time: 30, lat: 60, lon: 140)> Size: 1MB\n", + "dask.array<getitem, shape=(30, 60, 140), dtype=float32, chunksize=(8, 60, 140), chunktype=numpy.ndarray>\n", + "Coordinates:\n", + " * lat (lat) float32 240B 49.75 49.25 48.75 48.25 ... 21.25 20.75 20.25\n", + " * lon (lon) float32 560B 230.2 230.8 231.2 231.8 ... 298.8 299.2 299.8\n", + " * time (time) datetime64[ns] 240B 2020-06-01 2020-06-02 ... 2020-06-30\n", + "Attributes:\n", + " units: mm\n", + " var_desc: Precipitation\n", + " level_desc: Surface\n", + " statistic: Total\n", + " parent_stat: Other\n", + " long_name: Daily total of precipitation\n", + " cell_methods: time: sum\n", + " valid_range: [ 0. 1000.]\n", + " avg_period: 0000-00-01 00:00:00\n", + " actual_range: [ 0. 776.75]\n", + " dataset: CPC Global Precipitation\n", + " _ChunkSizes: [ 1 360 720]
<xarray.Dataset> Size: 46kB\n", + "Dimensions: (time: 30, lat: 13, lon: 29)\n", + "Coordinates:\n", + " * lat (lat) float32 52B 50.0 47.5 45.0 42.5 40.0 ... 27.5 25.0 22.5 20.0\n", + " * lon (lon) float32 116B 230.0 232.5 235.0 237.5 ... 295.0 297.5 300.0\n", + " * time (time) datetime64[ns] 240B 2020-06-01 2020-06-02 ... 2020-06-30\n", + "Data variables:\n", + " air (time, lat, lon) float32 45kB dask.array<chunksize=(1, 13, 29), meta=np.ndarray>\n", + "Attributes:\n", + " Conventions: COARDS\n", + " title: 4x daily NMC reanalysis (2014)\n", + " history: created 2017/12 by Hoop (netCDF2.3)\n", + " description: Data is from NMC initialized reanalysis\\...\n", + " platform: Model\n", + " dataset_title: NCEP-NCAR Reanalysis 1\n", + " _NCProperties: version=2,netcdf=4.6.3,hdf5=1.10.5\n", + " References: http://www.psl.noaa.gov/data/gridded/dat...\n", + " DODS_EXTRA.Unlimited_Dimension: time
<xarray.Dataset> Size: 1MB\n", + "Dimensions: (time: 30, lat: 60, lon: 140)\n", + "Coordinates:\n", + " * time (time) datetime64[ns] 240B 2020-06-01 2020-06-02 ... 2020-06-30\n", + " * lat (lat) float32 240B 49.75 49.25 48.75 48.25 ... 21.25 20.75 20.25\n", + " * lon (lon) float32 560B 230.2 230.8 231.2 231.8 ... 298.8 299.2 299.8\n", + "Data variables:\n", + " air (time, lat, lon) float32 1MB dask.array<chunksize=(1, 60, 140), meta=np.ndarray>\n", + "Attributes:\n", + " Conventions: COARDS\n", + " title: 4x daily NMC reanalysis (2014)\n", + " history: created 2017/12 by Hoop (netCDF2.3)\n", + " description: Data is from NMC initialized reanalysis\\...\n", + " platform: Model\n", + " dataset_title: NCEP-NCAR Reanalysis 1\n", + " _NCProperties: version=2,netcdf=4.6.3,hdf5=1.10.5\n", + " References: http://www.psl.noaa.gov/data/gridded/dat...\n", + " DODS_EXTRA.Unlimited_Dimension: time
<xarray.Dataset> Size: 2MB\n", + "Dimensions: (time: 30, lat: 60, lon: 140)\n", + "Coordinates:\n", + " * time (time) datetime64[ns] 240B 2020-06-01 ... 2020-06-30\n", + " * lat (lat) float32 240B 49.75 49.25 48.75 ... 21.25 20.75 20.25\n", + " * lon (lon) float32 560B 230.2 230.8 231.2 ... 298.8 299.2 299.8\n", + "Data variables:\n", + " air_temperature (time, lat, lon) float32 1MB dask.array<chunksize=(1, 60, 140), meta=np.ndarray>\n", + " daily_precip (time, lat, lon) float32 1MB dask.array<chunksize=(8, 60, 140), meta=np.ndarray>\n", + "Attributes:\n", + " Conventions: COARDS\n", + " title: 4x daily NMC reanalysis (2014)\n", + " history: created 2017/12 by Hoop (netCDF2.3)\n", + " description: Data is from NMC initialized reanalysis\\...\n", + " platform: Model\n", + " dataset_title: NCEP-NCAR Reanalysis 1\n", + " _NCProperties: version=2,netcdf=4.6.3,hdf5=1.10.5\n", + " References: http://www.psl.noaa.gov/data/gridded/dat...\n", + " DODS_EXTRA.Unlimited_Dimension: time