Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify CRS and output projection #1269

Merged
merged 1 commit into from
Feb 8, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
125 changes: 63 additions & 62 deletions examples/user_guide/Geographic_Data.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -74,31 +74,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Geopandas\n",
"\n",
"Since a GeoPandas ``DataFrame`` is just a Pandas DataFrames with additional geographic information, it inherits the ``.hvplot`` method. We can thus easily load shapefiles and plot them on a map:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import geopandas as gpd\n",
"\n",
"cities = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))\n",
"\n",
"cities.hvplot(global_extent=True, frame_height=450, tiles=True)"
"### Declaring an output projection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The GeoPandas support allows plotting ``GeoDataFrames`` containing ``'Point'``, ``'Polygon'``, ``'LineString'`` and ``'LineRing'`` geometries, but not ones containing a mixture of different geometry types. Calling ``.hvplot`` will automatically figure out the geometry type to plot, but it also possible to call ``.hvplot.points``, ``.hvplot.polygons``, and ``.hvplot.paths`` explicitly.\n",
"\n",
"To draw multiple GeoDataFrames onto the same plot, use the ``*`` operator:"
"The ``crs=`` argument specifies the *input* projection, i.e. it declares how to interpret the incoming data values. You can independently choose any *output* projection, i.e. how you want to map the data points onto the screen for display, using the ``projection=`` argument. After loading the same temperature dataset explored in the [Gridded Data](Gridded_Data.ipynb) section, the data can be displayed on an Orthographic projection:"
]
},
{
Expand All @@ -107,16 +90,20 @@
"metadata": {},
"outputs": [],
"source": [
"world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))\n",
"air_ds = xr.tutorial.open_dataset('air_temperature').load()\n",
"\n",
"world.hvplot(geo=True) * cities.hvplot(geo=True, color='orange')"
"air_ds.hvplot.quadmesh(\n",
" 'lon', 'lat', 'air', projection=ccrs.Orthographic(-90, 30),\n",
" global_extent=True, frame_height=540, cmap='viridis',\n",
" coastline=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is possible to declare a specific column to use as color with the ``c`` keyword:"
"If you don't need to pass any keyword arguments to a given projection and you don't have cartopy.crs (ccrs) imported, you can use the string representation: e.g. ``'LambertConformal'`` instead of ``ccrs.LambertConformal()``. Note that it is case sensitive!"
]
},
{
Expand All @@ -125,16 +112,18 @@
"metadata": {},
"outputs": [],
"source": [
"world.hvplot(geo=True) + world.hvplot(c='continent', geo=True)"
"air_ds.hvplot.quadmesh(\n",
" 'lon', 'lat', 'air', projection='LambertConformal',\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Spatialpandas\n",
"Note that when displaying raster data in a projection other than the one in which the data is stored, it is more accurate to render it as a ``quadmesh`` rather than an ``image``. As you can see above, a QuadMesh will project each original bin or pixel into the correct non-rectangular shape determined by the projection, accurately showing the geographic extent covered by each sample. An Image, on the other hand, will always be rectangularly aligned in the 2D plane, which requires warping and resampling the data in a way that allows efficient display but loses accuracy at the pixel level. Unfortunately, rendering a large QuadMesh using Bokeh can be very slow, but there are two useful alternatives for datasets too large to be practical as native QuadMeshes.\n",
"\n",
"Spatialpandas is another powerful library for working with geometries and is optimized for rendering with datashader, making it possible to plot millions of individual geometries very quickly:"
"The first is using the ``rasterize`` or ``datashade`` options to regrid the data before rendering it, i.e., rendering the data on the backend and then sending a more efficient image-based representation to the browser. One thing to note when using these operations is that it may be necessary to project the data **before** rasterizing it, e.g. to address wrapping issues. To do this provide ``project=True``, which will project the data before it is rasterized (this also works for other types and even when not using these operations). Another reason why this is important when rasterizing the data is that if the CRS of the data does not match the displayed projection, all the data will be projected every time you zoom or pan, which can be very slow. Deciding whether to ``project`` is therefore a tradeoff between projecting the raw data ahead of time or accepting the overhead on dynamic zoom and pan actions."
]
},
{
Expand All @@ -143,25 +132,22 @@
"metadata": {},
"outputs": [],
"source": [
"import spatialpandas as spd\n",
"rasm = xr.tutorial.open_dataset('rasm').load()\n",
"\n",
"spd_world = spd.GeoDataFrame(world)\n",
"\n",
"spd_world.hvplot(datashade=True, project=True, aggregator='count_cat', c='continent', color_key='Category10')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Declaring an output projection"
"\n",
"rasm.hvplot.quadmesh(\n",
" 'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.PlateCarree(),\n",
" ylim=(0, 90), cmap='viridis', project=True, geo=True,\n",
" rasterize=True, coastline=True, frame_width=800, dynamic=False,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ``crs=`` argument specifies the *input* projection, i.e. it declares how to interpret the incoming data values. You can independently choose any *output* projection, i.e. how you want to map the data points onto the screen for display, using the ``projection=`` argument. After loading the same temperature dataset explored in the [Gridded Data](Gridded_Data.ipynb) section, the data can be displayed on an Orthographic projection:"
"Another option that's still relatively slow for larger data but avoids sending large data into your browser is to plot the data using ``contour`` and ``contourf`` visualizations, generating a line or filled contour with a discrete number of levels:"
]
},
{
Expand All @@ -170,11 +156,9 @@
"metadata": {},
"outputs": [],
"source": [
"air_ds = xr.tutorial.open_dataset('air_temperature').load()\n",
"\n",
"air_ds.hvplot.quadmesh(\n",
" 'lon', 'lat', 'air', projection=ccrs.Orthographic(-90, 30),\n",
" global_extent=True, frame_height=540, cmap='viridis',\n",
"rasm.hvplot.contourf(\n",
" 'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.PlateCarree(),\n",
" ylim=(0, 90), frame_width=800, cmap='viridis', levels=10,\n",
" coastline=True\n",
")"
]
Expand All @@ -183,7 +167,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"If you don't need to pass any keyword arguments to a given projection and you don't have cartopy.crs (ccrs) imported, you can use the string representation: e.g. ``'LambertConformal'`` instead of ``ccrs.LambertConformal()``. Note that it is case sensitive!"
"### Geopandas\n",
"\n",
"Since a GeoPandas ``DataFrame`` is just a Pandas DataFrames with additional geographic information, it inherits the ``.hvplot`` method. We can thus easily load shapefiles and plot them on a map:"
]
},
{
Expand All @@ -192,18 +178,20 @@
"metadata": {},
"outputs": [],
"source": [
"air_ds.hvplot.quadmesh(\n",
" 'lon', 'lat', 'air', projection='LambertConformal',\n",
")"
"import geopandas as gpd\n",
"\n",
"cities = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))\n",
"\n",
"cities.hvplot(global_extent=True, frame_height=450, tiles=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that when displaying raster data in a projection other than the one in which the data is stored, it is more accurate to render it as a ``quadmesh`` rather than an ``image``. As you can see above, a QuadMesh will project each original bin or pixel into the correct non-rectangular shape determined by the projection, accurately showing the geographic extent covered by each sample. An Image, on the other hand, will always be rectangularly aligned in the 2D plane, which requires warping and resampling the data in a way that allows efficient display but loses accuracy at the pixel level. Unfortunately, rendering a large QuadMesh using Bokeh can be very slow, but there are two useful alternatives for datasets too large to be practical as native QuadMeshes.\n",
"The GeoPandas support allows plotting ``GeoDataFrames`` containing ``'Point'``, ``'Polygon'``, ``'LineString'`` and ``'LineRing'`` geometries, but not ones containing a mixture of different geometry types. Calling ``.hvplot`` will automatically figure out the geometry type to plot, but it also possible to call ``.hvplot.points``, ``.hvplot.polygons``, and ``.hvplot.paths`` explicitly.\n",
"\n",
"The first is using the ``rasterize`` or ``datashade`` options to regrid the data before rendering it, i.e., rendering the data on the backend and then sending a more efficient image-based representation to the browser. One thing to note when using these operations is that it may be necessary to project the data **before** rasterizing it, e.g. to address wrapping issues. To do this provide ``project=True``, which will project the data before it is rasterized (this also works for other types and even when not using these operations). Another reason why this is important when rasterizing the data is that if the CRS of the data does not match the displayed projection, all the data will be projected every time you zoom or pan, which can be very slow. Deciding whether to ``project`` is therefore a tradeoff between projecting the raw data ahead of time or accepting the overhead on dynamic zoom and pan actions."
"To draw multiple GeoDataFrames onto the same plot, use the ``*`` operator:"
]
},
{
Expand All @@ -212,22 +200,16 @@
"metadata": {},
"outputs": [],
"source": [
"rasm = xr.tutorial.open_dataset('rasm').load()\n",
"\n",
"\n",
"world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))\n",
"\n",
"rasm.hvplot.quadmesh(\n",
" 'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.PlateCarree(),\n",
" ylim=(0, 90), cmap='viridis', project=True, geo=True,\n",
" rasterize=True, coastline=True, frame_width=800, dynamic=False,\n",
")"
"world.hvplot(geo=True) * cities.hvplot(geo=True, color='orange')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another option that's still relatively slow for larger data but avoids sending large data into your browser is to plot the data using ``contour`` and ``contourf`` visualizations, generating a line or filled contour with a discrete number of levels:"
"It is possible to declare a specific column to use as color with the ``c`` keyword:"
]
},
{
Expand All @@ -236,11 +218,29 @@
"metadata": {},
"outputs": [],
"source": [
"rasm.hvplot.contourf(\n",
" 'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.PlateCarree(),\n",
" ylim=(0, 90), frame_width=800, cmap='viridis', levels=10,\n",
" coastline=True\n",
")"
"world.hvplot(geo=True) + world.hvplot(c='continent', geo=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import spatialpandas as spd\n",
"\n",
"spd_world = spd.GeoDataFrame(world)\n",
"\n",
"spd_world.hvplot(datashade=True, project=True, aggregator='count_cat', c='continent', color_key='Category10')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Spatialpandas\n",
"\n",
"Spatialpandas is another powerful library for working with geometries and is optimized for rendering with datashader, making it possible to plot millions of individual geometries very quickly:"
]
},
{
Expand All @@ -259,7 +259,8 @@
"The API provides various geo-specific options:\n",
"\n",
"- ``coastline`` (default=False): Whether to display a coastline on top of the plot, setting ``coastline='10m'/'50m'/'110m'`` specifies a specific scale\n",
"- ``crs`` (default=None): Coordinate reference system of the data specified as Cartopy CRS object, proj.4 string or EPSG code\n",
"- ``crs`` (default=None): Coordinate reference system of the data (input projection) specified as Cartopy CRS object or name, proj.4 string or EPSG code \n",
"- ``projection`` (default=None): Coordinate reference system of the plot (output projection) specified as Cartopy CRS object or name, proj.4 string or EPSG code\n",
"- ``features`` features (default=None): A list of features or a dictionary of features and the scale at which to render it. Available features include 'borders', 'coastline', 'lakes', 'land', 'ocean', 'rivers' and 'states'. Available scales include '10m'/'50m'/'110m'.\n",
"- ``geo`` (default=False): Whether the plot should be treated as geographic (and assume PlateCarree, i.e. lat/lon coordinates)\n",
"- ``global_extent`` (default=False): Whether to expand the plot extent to span the whole globe\n",
Expand Down
Loading