Initial version #3
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| name: Black | ||
|
|
||
| on: push | ||
|
|
||
| jobs: | ||
| black: | ||
|
|
||
| runs-on: ubuntu-24.04 | ||
|
|
||
| steps: | ||
| - uses: actions/checkout@v2 | ||
| - uses: actions/setup-python@v2 | ||
| - uses: psf/black@stable | ||
| with: | ||
| options: ". --check" | ||
| version: "25.9.0" | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| name: Docker Publishing | ||
|
|
||
| on: | ||
| push: | ||
| branches: | ||
| - '*' | ||
| tags: | ||
| - '[0-9]+.[0-9]+.[0-9]+' | ||
|
|
||
| jobs: | ||
| publish: | ||
|
|
||
| runs-on: ubuntu-24.04 | ||
|
|
||
| steps: | ||
| - uses: actions/checkout@v2 | ||
| - name: Publish to Registry | ||
| uses: docker/build-push-action@v1 | ||
| with: | ||
| username: ${{ secrets.pcicdevops_at_dockerhub_username }} | ||
| password: ${{ secrets.pcicdevops_at_dockerhub_password }} | ||
| repository: pcic/ncpartitioner | ||
| tag_with_ref: true |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| name: Python CI | ||
|
|
||
| on: push | ||
|
|
||
| jobs: | ||
| test: | ||
|
|
||
| runs-on: ubuntu-22.04 | ||
|
|
||
| steps: | ||
| - name: Checkout | ||
| uses: actions/checkout@v2 | ||
|
|
||
| - name: Set up Python | ||
| uses: actions/setup-python@v2 | ||
| with: | ||
| python-version: 3.9 | ||
|
Q already mentioned it, but here's a code example from pycds using the Python version matrix: https://github.com/pacificclimate/pycds/blob/master/.github/workflows/python-ci.yml#L10 |
||
|
|
||
| - name: Install OS Dependencies | ||
| run: | | ||
| sudo apt update | ||
| sudo apt install python3-dev | ||
| sudo apt-get install nco | ||
| sudo apt-get install curl | ||
|
|
||
| - name: Install Poetry | ||
| run: | | ||
| pip install poetry==2.2 | ||
|
|
||
| - name: Install project | ||
| run: | | ||
| poetry install | ||
|
|
||
| - name: Test with pytest | ||
| run: poetry run pytest | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| FROM python:3.9-slim | ||
|
|
||
| RUN apt-get update && apt-get install -y \ | ||
| nco \ | ||
| curl | ||
|
|
||
| COPY . /app | ||
| WORKDIR /app | ||
|
|
||
| RUN pip install poetry==2.2 | ||
| ENV PATH="/root/.local/bin:$PATH" | ||
| RUN poetry install | ||
|
|
||
| EXPOSE 5000 | ||
| CMD ["poetry", "run", "gunicorn", "--workers=10", "--bind=0.0.0.0:5000", "wsgi:app", "--timeout=300"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| # NCPartitioner | ||
|
|
||
| This container generates user-requested netCDF files using `ncks` and makes them available for download via THREDDS. | ||
|
|
||
| ## Run for Development | ||
|
|
||
| The NCO (netCDF Operators) command-line tools, which provide `ncks`, must be installed. This package can then be installed with poetry: | ||
|
|
||
| ``` | ||
| apt-get install nco | ||
| git clone http://github.com/pacificclimate/ncpartitioner | ||
| poetry install | ||
| ``` | ||
|
|
||
| To do end-to-end testing, you will also need a THREDDS instance running on your workstation, though the test suite does not need a working THREDDS instance. Set the environment variables: | ||
|
|
||
| * `OUTPUT_DIR` - file directory to put the partitioned files in. It should be accessible to THREDDS | ||
| * `THREDDS_HTTP_BASE` - the base URL for the THREDDS http server (probably ends /fileserver); a user will be redirected to download the completed file | ||
| * `THREDDS_DAP_BASE` - the base URL for the THREDDS openDAP server (probably ends /dodsC): used to fulfill metadata requests | ||
| * `DATA_ROOT` - directory under which all data is found; prevents files outside the directory from being served | ||
|
|
||
| Run with flask: | ||
| ``` | ||
| poetry run flask run | ||
| ``` | ||
|
|
||
| Run the test suite (environment variables will be provided by pytest and do not need to be set): | ||
| ``` | ||
| poetry run pytest | ||
| ``` | ||
|
|
||
| ## Data assumptions | ||
|
|
||
| This server assumes all files to be downloaded are netCDF4 files with dimensions named `lat`, `lon`, and `time`, and that all variables one might wish to download have those dimensions. Timeless files or station-based geometries cannot be downloaded via this server. Only one variable may be downloaded at a time. | ||
|
|
||
| ## Request format | ||
|
|
||
| Request format is indicated by concatenating an extension onto the `filepath` parameter. Some request formats require an additional `targets` parameter. Request attributes other than `targets` and `filepath` are ignored. | ||
|
|
||
| This server supports four request formats. Three of them are simply redirected to the THREDDS server: | ||
|
|
||
| ### DDS request | ||
| `https://server/partition/?filepath=path/to/file.nc.dds&targets=time` | ||
|
|
||
| Redirects to a THREDDS page displaying metadata about the `time` dimension. This request accepts a single dimension. | ||
|
|
||
| ### DAS request | ||
| `https://server/partition/?filepath=path/to/file.nc.das` | ||
|
|
||
| Redirects to a THREDDS page displaying metadata about all variables and attributes. The `targets` attribute is ignored, if present. | ||
|
|
||
| ### ASCII request | ||
| `https://server/partition/?filepath=path/to/file.nc.ascii&targets=lat,lon` | ||
|
|
||
| Redirects to a THREDDS page displaying values for the requested dimension variable(s) in ASCII format. This server will only display values for dimension variables (`lat`, `lon`, and `time`) via this request type. OpenDAP standards support requesting any variable in ASCII format this way, but since THREDDS has a 500MB maximum file size for DAP requests, this server only supports requesting the dimension variables, not multidimensional data variables. | ||
|
|
||
| ### Partition request | ||
| `https://server/partition/?filepath=path/to/file.nc.nc&targets=time[0:10],lat[0:20],lon[0:30],tasmax[0:10][0:20][0:30]` | ||
|
|
||
| Creates a file with the requested dimensions using `ncks`, then redirects the user to the THREDDS page to download the newly created file. Note that the variable is always trimmed to the hyperslab specified in the dimensions portion of the `targets` attribute; if the variable portion of the `targets` attribute is different, it will be overruled. |
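
The four request formats described in the README can be exercised from a client with a small URL builder. The following is a minimal sketch, not part of this PR: `build_request_url` and `hyperslab_targets` are hypothetical helper names, `https://server` is the placeholder host used in the README, and the variable is assumed to be indexed in `time`, `lat`, `lon` order as in the README example. Building the variable slice from the same ranges as the dimension slices keeps the two portions of `targets` consistent, so nothing gets overruled by the server.

```python
from urllib.parse import urlencode


def hyperslab_targets(variable, time, lat, lon):
    """Build a `targets` string for a partition request.

    Each of time, lat and lon is a (start, stop) index pair. The variable
    slice reuses the same pairs, assuming time/lat/lon dimension order.
    """
    slices = {"time": time, "lat": lat, "lon": lon}
    dims = [f"{name}[{lo}:{hi}]" for name, (lo, hi) in slices.items()]
    var = variable + "".join(f"[{lo}:{hi}]" for lo, hi in slices.values())
    return ",".join(dims + [var])


def build_request_url(server, filepath, request_format, targets=None):
    """Append the request format to `filepath` and assemble the query string."""
    params = {"filepath": f"{filepath}.{request_format}"}
    if targets is not None:
        params["targets"] = targets
    # Leave path separators and hyperslab punctuation unescaped so the URL
    # matches the form shown in the README.
    query = urlencode(params, safe="/[]:,")
    return f"{server.rstrip('/')}/partition/?{query}"


if __name__ == "__main__":
    targets = hyperslab_targets("tasmax", (0, 10), (0, 20), (0, 30))
    print(build_request_url("https://server", "path/to/file.nc", "nc", targets))
```

A DAS request simply omits `targets`: `build_request_url("https://server", "path/to/file.nc", "das")`.
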
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| from flask import Flask | ||
| import logging | ||
|
|
||
|
|
||
| def create_app(config=None): | ||
| app = Flask(__name__) | ||
| app.config.from_object(config) | ||
|
I think this is currently a no-op since we don’t pass a config. It might be worth adding a comment noting that this is intentional and keeps the app factory ready for future configuration. |
||
| app.logger.setLevel(logging.INFO) | ||
|
|
||
| with app.app_context(): | ||
| from .routes import partition | ||
|
|
||
| app.register_blueprint(partition) | ||
|
|
||
| return app | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| """Send responses to user requests. Responses are always a redirect to a THREDDS-served file. | ||
| In cases of DDS and DAS, the file already exists; for data requests the file must be created first. | ||
| """ | ||
|
|
||
| from posixpath import dirname | ||
| import subprocess | ||
| import os | ||
| from flask import redirect | ||
| import logging | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| def slice(args): | ||
| output_dir = os.getenv("OUTPUT_DIR") | ||
| thredds_base = os.getenv("THREDDS_HTTP_BASE") | ||
|
|
||
| logger.info(f"Slicing file") | ||
| subprocess.run( | ||
|
Should we add a timeout or capture stderr in case NCO fails or hangs? |
||
| [ | ||
| "ncks", | ||
| "-v", | ||
| f"{args['variable']}", | ||
| "-d", | ||
| f"time,{args['time'][0]},{args['time'][1]}", | ||
| "-d", | ||
| f"lat,{args['lat'][0]},{args['lat'][1]}", | ||
| "-d", | ||
| f"lon,{args['lon'][0]},{args['lon'][1]}", | ||
| f"/{args['dirname']}/{args['basename']}.{args['extension']}", | ||
| os.path.join( | ||
| output_dir, | ||
| f"{args['basename']}_{args['timestamp']}.{args['extension']}", | ||
| ), | ||
| ], | ||
| check=True, | ||
| ) | ||
|
|
||
| output_filename = f"{args['basename']}_{args['timestamp']}.{args['extension']}" | ||
| logger.info( | ||
| f"Slice complete; file saved to {os.path.join(output_dir, output_filename)}" | ||
| ) | ||
| logger.info(f"Sending redirect to {thredds_base}{output_dir}/{output_filename}") | ||
|
|
||
| return redirect(f"{thredds_base}{output_dir}/{output_filename}") | ||
|
|
||
|
|
||
| def dap_filepath(args): | ||
| """construct the filepath for DDS/DAS requests""" | ||
| thredds_base = os.getenv("THREDDS_DAP_BASE") | ||
| return f"{thredds_base}/{args['dirname']}/{args['basename']}.{args['extension']}" | ||
|
|
||
|
|
||
| def dds(args): | ||
| filepath = dap_filepath(args) | ||
| logger.info(f"Received DDS request: filepath={filepath}") | ||
| if "target" in args: | ||
| return redirect(f"{filepath}.dds?{args['target']}") | ||
| return redirect(f"{filepath}.dds") | ||
|
|
||
|
|
||
| def das(args): | ||
| filepath = dap_filepath(args) | ||
| logger.info(f"Received DAS request: filepath={filepath}") | ||
|
|
||
| return redirect(f"{filepath}.das") | ||
|
|
||
|
|
||
| def asc(args): | ||
| # returns requested dimension data in ASCII format; this function does not return gridded data. | ||
| filepath = dap_filepath(args) | ||
| dims = ( | ||
| args["target"] if isinstance(args["target"], str) else ",".join(args["target"]) | ||
| ) | ||
| logger.info(f"Received ASCII request: filepath={filepath}") | ||
|
|
||
| return redirect(f"{filepath}.ascii?{dims}") | ||
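
The review question on the `subprocess.run` call in `slice` (timeout and stderr capture) could be addressed along the following lines. This is a sketch under assumptions rather than the PR's code: `run_command` and `CommandError` are hypothetical names, and the 300-second default is borrowed from the gunicorn `--timeout=300` in the Dockerfile, not from any stated requirement. It uses only documented `subprocess.run` parameters (`check`, `capture_output`, `text`, `timeout`).

```python
import subprocess
import sys


class CommandError(RuntimeError):
    """Raised when the external command fails or exceeds its time limit."""


def run_command(cmd, timeout_seconds=300):
    """Run cmd (a list of strings), returning stdout or raising CommandError."""
    try:
        result = subprocess.run(
            cmd,
            check=True,            # non-zero exit raises CalledProcessError
            capture_output=True,   # collect stdout and stderr instead of inheriting them
            text=True,             # decode output to str
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child process before raising TimeoutExpired
        raise CommandError(f"{cmd[0]} timed out after {timeout_seconds} s") from exc
    except subprocess.CalledProcessError as exc:
        raise CommandError(
            f"{cmd[0]} exited with status {exc.returncode}: {exc.stderr.strip()}"
        ) from exc
    return result.stdout


if __name__ == "__main__":
    # Stand-in for the ncks invocation so the sketch runs without NCO installed.
    print(run_command([sys.executable, "-c", "print('ok')"]).strip())
```

In `slice`, the existing `ncks` argument list would be passed as `cmd`, and the route could catch `CommandError` to log the captured stderr and return an error response instead of letting the exception propagate.
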
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| from flask import Blueprint, request, redirect | ||
| from ncpartitioner.sanitize import ( | ||
| check_filepath, | ||
| check_targets_slice, | ||
| check_targets_dds, | ||
| check_ranges, | ||
| check_targets_ascii, | ||
| ) | ||
| from ncpartitioner.response import slice, dds, das, asc | ||
| import logging | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| partition = Blueprint("partition", __name__, url_prefix="/partition") | ||
|
|
||
|
|
||
| @partition.route("/", methods=["GET"]) | ||
| def ncpartitioner(): | ||
| """creates the requested netCDF with NCO, moves it to where THREDDS can serve it, and returns a link to the user""" | ||
| logger.info(f"received request {request.url}") | ||
| filepath = request.args.get("filepath") | ||
| targets = request.args.get("targets", None) | ||
|
|
||
| try: | ||
| args = check_filepath(filepath) | ||
| except ValueError as ve: | ||
| logger.error(f"Input error: {ve}") | ||
| return f"Input error: {ve}", 400 | ||
|
|
||
| if args["request_format"] == "dds": | ||
| args.update(check_targets_dds(targets, args)) | ||
| return dds(args) | ||
| elif args["request_format"] == "das": | ||
| return das(args) | ||
| elif args["request_format"] == "nc": | ||
| try: | ||
| args.update(check_targets_slice(targets)) | ||
| check_ranges(args) | ||
| except ValueError as ve: | ||
| logger.error(f"Input error: {ve}") | ||
| return f"Input error: {ve}", 400 | ||
|
|
||
| logger.info(f"Received slice request: filepath={filepath}, targets={targets}") | ||
| return slice(args) | ||
| elif args["request_format"] in ["ascii", "asc"]: | ||
| args.update(check_targets_ascii(targets)) | ||
| return asc(args) |
we can use a concurrency block to cancel tests if this one fails. It'd look something like