Implementing an NCSS Driver #25

Open
jthielen opened this issue Feb 19, 2021 · 3 comments

Comments

@jthielen

Coming from the MetPy/siphon world and wanting to better integrate my workflows into the Pangeo ecosystem, I'd like to start using intake-thredds to work with THREDDS Data Servers (TDSs). While the package currently looks to make opendap and full netcdf file access straightforward, nothing yet handles the NetCDF Subset Service (NCSS), which siphon conveniently supports.

Would there be support for including an NCSS intake driver in intake-thredds? If so, I'd be glad to put together a PR as I work to incorporate intake into my workflows (and understand how intake drivers work in more detail).

@aaronspring
Collaborator

I always wondered what these different access paths can do.

Is this ncss a driver that would be handled here in intake-thredds or could it also go into intake-xarray (closer to xr.open_dataset and a dependency of intake-thredds)?

@aaronspring
Collaborator

The 'query' could be a parameter of an entry. Do you have a working example of how to get ncss into xarray?

@jthielen
Author

> I always wondered what these different access paths can do.

NCSS is nice when working with limited lat/lon queries, since it performs all the subsetting on the server and is grid-mapping invariant on the user end. You could still use it for larger grids, but since it doesn't do lazy loading, I tend to avoid it for those use cases in favor of opendap (which also has the benefit of letting you specify the URL directly in xr.open_dataset).
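To make the server-side subsetting concrete, here is a minimal sketch of how an NCSS point request can be expressed as a plain URL. It assumes the standard NCSS query keys (`var`, `longitude`, `latitude`, `accept`); the helper name `ncss_point_url` and the endpoint are hypothetical placeholders, not part of siphon or intake-thredds.

```python
from urllib.parse import urlencode


def ncss_point_url(base_url, variables, lon, lat, accept="netcdf4"):
    """Build an NCSS point-request URL (hypothetical helper).

    The server does the subsetting, so only the requested point and
    variables are returned in the response.
    """
    params = {
        "var": ",".join(variables),  # comma-separated variable names
        "longitude": lon,
        "latitude": lat,
        "accept": accept,            # response format
    }
    # Keep commas unescaped so the variable list stays readable.
    return f"{base_url}?{urlencode(params, safe=',')}"


# Hypothetical endpoint, used only for illustration.
url = ncss_point_url(
    "https://example.com/thredds/ncss/grid/some/dataset",
    ["Temperature_isobaric", "Relative_humidity_isobaric"],
    lon=-93.64,
    lat=42.03,
)
print(url)
```

In practice siphon builds and sends this request for you; the sketch only shows what ends up being asked of the server.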

On the subject of different access paths, perhaps cdmremote would be worth exploring as well (functions like an opendap alternative, but is more optimized/performant by using ncstream, which is based on protobuf). I haven't used it much myself yet (again, opendap is really easy with xarray), but may investigate using it through siphon for possible performance improvements.

> Is this ncss a driver that would be handled here in intake-thredds or could it also go into intake-xarray (closer to xr.open_dataset and a dependency of intake-thredds)?

I would lean towards intake-thredds for two reasons:

  1. NCSS is a feature of THREDDS, so intake-thredds just seems like the natural place,
  2. An NCSS intake driver would likely wrap all the message handling for NCSS in siphon. intake-thredds already has a siphon dependency, and I'm not sure there would be benefit to splitting siphon-wrapping functionality between intake-xarray and intake-thredds.

> The 'query' could be a parameter of an entry.

If possible, I think having individual query parameters be parameters of an entry would be a more readable interface, but I'm definitely open to other possibilities!
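As a rough illustration of that interface, the sketch below keeps each query parameter as its own field and only translates them into query calls at the end. The class `NCSSEntryParams` and its field names are hypothetical; the only assumption about siphon is that the query object exposes `lonlat_point`, `lonlat_box`, `variables`, and `accept`, as the object returned by `ncss.query()` does.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class NCSSEntryParams:
    """Hypothetical container for per-entry NCSS parameters."""

    variables: Sequence[str]
    point: Optional[Tuple[float, float]] = None  # (lon, lat)
    box: Optional[Tuple[float, float, float, float]] = None  # (west, east, south, north)
    accept: str = "netcdf4"

    def apply_to(self, query):
        """Apply the parameters to a siphon-style query object."""
        if self.point is not None and self.box is not None:
            raise ValueError("specify either 'point' or 'box', not both")
        if self.point is not None:
            query.lonlat_point(*self.point)
        if self.box is not None:
            query.lonlat_box(*self.box)
        query.variables(*self.variables)
        query.accept(self.accept)
        return query
```

With a real catalog this would be used as `NCSSEntryParams(variables=[...], point=(-93.64, 42.03)).apply_to(ncss.query())`, and the result passed to `ncss.get_data`.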

> Do you have a working example of how to get ncss into xarray?

Sure, though to highlight a use case where I'd prefer ncss over opendap, the example needs a workaround for pydata/xarray#2233:

```python
from siphon.catalog import TDSCatalog
import xarray as xr

cat = TDSCatalog("https://thredds.unidata.ucar.edu/thredds/catalog/grib/NCEP/RAP/CONUS_13km/catalog.xml")

ncss = cat.datasets['Latest Collection for Rapid Refresh CONUS 13km'].subset()

query = ncss.query()
query.lonlat_point(-93.64, 42.03)
query.accept('netcdf4')
query.variables(
    'Temperature_isobaric',
    'Relative_humidity_isobaric',
    'u-component_of_wind_isobaric',
    'v-component_of_wind_isobaric'
)

nc_ds = ncss.get_data(query)

# Patch for https://github.com/pydata/xarray/issues/2233
nc_ds.variables['isobaric_coord'] = nc_ds.variables['isobaric']
del nc_ds.variables['isobaric']

ds = xr.open_dataset(xr.backends.NetCDF4DataStore(nc_ds))

# Finish patching for https://github.com/pydata/xarray/issues/2233
ds = ds.squeeze().set_coords('isobaric_coord').rename_vars({'isobaric_coord': 'isobaric'})

ds
```

@andersy005 andersy005 added this to Xdev Oct 22, 2021