Update ec network normalize to support variable substitutions #182
Open
Nospamas wants to merge 5 commits into master from ec-network-variable-sub
Changes from 3 commits
5 commits
beb9525 Update ec network normalize to support variable subsitutions (Nospamas)
068e243 Fix CI timestamps (Nospamas)
aeee75c black project (Nospamas)
e88a82b refactor to move variable substitution functions to helpers (Nospamas)
d807b30 Add bulk download and process utilities to scripts (Nospamas)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # See here for image contents: https://github.com/microsoft/vscode-dev-containers/tree/v0.245.0/containers/python-3/.devcontainer/base.Dockerfile | ||
| | ||
| | ||
| # [Choice] Ubuntu version (use ubuntu-22.04 or ubuntu-18.04 on local arm64/Apple Silicon): ubuntu-22.04, ubuntu-20.04, ubuntu-18.04 | ||
| ARG VARIANT=ubuntu-24.04 | ||
| FROM mcr.microsoft.com/vscode/devcontainers/base:${VARIANT} | ||
| | ||
| # Postgres & our packages. Currently not customizable via VERSION param. | ||
| # RUN apt-get update \ | ||
| # && apt-get -y install --no-install-recommends curl ca-certificates gnupg | ||
| RUN curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add - | ||
| RUN sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' | ||
| RUN apt-get update \ | ||
| && apt-get -y install --no-install-recommends postgresql-plpython3-14 postgresql-14-postgis-3 libpq-dev |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,67 @@ | ||
| // For format details, see https://aka.ms/devcontainer.json. For config options, see the README at: | ||
| // https://github.com/microsoft/vscode-dev-containers/tree/v0.245.0/containers/python-3 | ||
| { | ||
| "name": "Python3 & Poetry & Postgres", | ||
| "build": { | ||
| "dockerfile": "Dockerfile", | ||
| "args": { | ||
| // Update 'VARIANT' to pick a Python version: 3, 3.10, 3.9, 3.8, 3.7, 3.6 | ||
| // Append -bullseye or -buster to pin to an OS version. | ||
| // Use -bullseye variants on local on arm64/Apple Silicon. | ||
| "VARIANT": "ubuntu-24.04" | ||
| } | ||
| }, | ||
| // Configure tool-specific properties. | ||
| "customizations": { | ||
| // Configure properties specific to VS Code. | ||
| "vscode": { | ||
| // Set *default* container specific settings.json values on container create. | ||
| "settings": { | ||
| "python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python", | ||
| "python.linting.enabled": true, | ||
| "python.formatting.autopep8Path": "/usr/local/py-utils/bin/autopep8", | ||
| "python.formatting.blackPath": "/usr/local/py-utils/bin/black", | ||
| "python.formatting.yapfPath": "/usr/local/py-utils/bin/yapf", | ||
| "python.linting.banditPath": "/usr/local/py-utils/bin/bandit", | ||
| "python.linting.mypyPath": "/usr/local/py-utils/bin/mypy", | ||
| "python.linting.pycodestylePath": "/usr/local/py-utils/bin/pycodestyle", | ||
| "python.linting.pydocstylePath": "/usr/local/py-utils/bin/pydocstyle", | ||
| "python.linting.pylintPath": "/usr/local/py-utils/bin/pylint" | ||
| }, | ||
| // VSCODE ONLY: Add the IDs of extensions you want installed when the container is created. | ||
| "extensions": [ | ||
| "ms-python.python", | ||
| "ms-python.vscode-pylance" | ||
| ] | ||
| } | ||
| }, | ||
| // Use 'forwardPorts' to make a list of ports inside the container available locally. | ||
| //"forwardPorts": [48423], | ||
| // Use 'postCreateCommand' to run commands after the container is created. | ||
| "postCreateCommand": "bash ./.devcontainer/post-install.sh", | ||
| "postStartCommand": "bash ./.devcontainer/post-start.sh", | ||
| "features": { | ||
| "ghcr.io/devcontainers/features/docker-in-docker:2": "latest", | ||
| "ghcr.io/devcontainers/features/git:1": "latest", | ||
| // add python to container | ||
| "ghcr.io/devcontainers/features/python:1": { | ||
| "version": "3.13" | ||
| }, | ||
| // add poetry to container | ||
| "ghcr.io/devcontainers-extra/features/poetry:2": { | ||
| "version": "2.1.3" | ||
| } | ||
| }, | ||
| // Comment out to connect as root instead. More info: https://aka.ms/vscode-remote/containers/non-root. | ||
| "remoteUser": "vscode", | ||
| "remoteEnv": { | ||
| // "PATH": "${containerEnv:PATH}:${containerEnv:HOME}/.local/bin" | ||
| }, | ||
| "runArgs": [ | ||
| // allow container to be treated with no network isolation | ||
| "--network=host", | ||
| // give a nicer name to the container | ||
| "--name", | ||
| "${localEnv:USER}_crmprtd_devcontainer" | ||
| ] | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| #!/bin/bash | ||
| set -ex | ||
| | ||
| ## | ||
| ## Create some aliases | ||
| ## | ||
| echo 'alias ll="ls -alF"' >> $HOME/.bashrc | ||
| echo 'alias la="ls -A"' >> $HOME/.bashrc | ||
| echo 'alias l="ls -CF"' >> $HOME/.bashrc | ||
| | ||
| # Convenience workspace directory for later use | ||
| WORKSPACE_DIR=$(pwd) | ||
| | ||
| # Change some Poetry settings to better deal with working in a container | ||
| poetry config cache-dir ${WORKSPACE_DIR}/.cache | ||
| poetry config virtualenvs.in-project true | ||
| | ||
| # Now install all dependencies | ||
| poetry install --all-extras | ||
| | ||
| echo "Done!" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| #!/bin/bash | ||
| set -ex | ||
| | ||
| # Convenience workspace directory for later use | ||
| WORKSPACE_DIR=$(pwd) | ||
| | ||
| # # Set current workspace as safe for git | ||
| # git config --global --add safe.directory ${WORKSPACE_DIR} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| { | ||
| // Use IntelliSense to learn about possible attributes. | ||
| // Hover to view descriptions of existing attributes. | ||
| // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387 | ||
| "version": "0.2.0", | ||
| "configurations": [ | ||
| { | ||
| "name": "Pytest: Current File", | ||
| "type": "debugpy", | ||
| "request": "launch", | ||
| "module": "pytest", | ||
| "console": "integratedTerminal", | ||
| "cwd": "${workspaceFolder}", | ||
| "justMyCode": false, | ||
| "env": { | ||
| }, | ||
| "args": [ | ||
| "${file}" | ||
| ] | ||
| } | ||
| ] | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,6 +1,7 @@ | ||
| """ | ||
| Some additional iteration tools | ||
| """ | ||
| | ||
| from itertools import islice, cycle | ||
| | ||
| | ||
| | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,6 +1,7 @@ | ||
| """ | ||
| A test downloader that does nothing. | ||
| """ | ||
| | ||
| import logging | ||
| import os | ||
| from argparse import ArgumentParser | ||
| | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,7 +1,41 @@ | ||
| import logging | ||
| import yaml | ||
| | ||
| from importlib.resources import files | ||
| from crmprtd import Row | ||
| from crmprtd.swob_ml import normalize as swob_ml_normalize | ||
| | ||
| log = logging.getLogger(__name__) | ||
| | ||
| | ||
| def normalize(file_stream): | ||
| return swob_ml_normalize( | ||
| | ||
| variable_substitutions_path = "networks/ec/variable_substitutions.yaml" | ||
| try: | ||
| with (files("crmprtd") / variable_substitutions_path).open("rb") as f: | ||
| variable_substitutions = yaml.safe_load(f) | ||
| except FileNotFoundError: | ||
| log.warning( | ||
| f"Cannot open resource file '{variable_substitutions_path}'. " | ||
| f"Proceeding with normalization, but there's a risk that variable names will not be recognized." | ||
| ) | ||
| return | ||
| | ||
| rows = swob_ml_normalize( | ||
| file_stream, "EC_raw", station_id_attr="climate_station_number" | ||
| ) | ||
| | ||
| for row in rows: | ||
| if row.variable_name in variable_substitutions: | ||
| yield Row( | ||
| time=row.time, | ||
| val=row.val, | ||
| variable_name=variable_substitutions[row.variable_name], | ||
| unit=row.unit, | ||
| network_name=row.network_name, | ||
| station_id=row.station_id, | ||
| lat=row.lat, | ||
| lon=row.lon, | ||
| ) | ||
| else: | ||
| yield row | ||
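
The renaming loop in this diff rebuilds each `Row` field by field. A more compact sketch of the same idea is below, assuming `crmprtd.Row` is a `namedtuple` with the eight fields used in the diff (the stand-in `Row`, the helper name `substitute_variable_names`, and the sample values are hypothetical, for illustration only). Passing an empty mapping leaves every row unchanged, which is one way a missing substitutions file could still let normalization proceed as the warning text says.

```python
from collections import namedtuple

# Stand-in for crmprtd.Row; ASSUMED to be a namedtuple with these fields.
Row = namedtuple(
    "Row", "time val variable_name unit network_name station_id lat lon"
)


def substitute_variable_names(rows, substitutions):
    """Yield each row, renaming variable_name when a substitution exists."""
    for row in rows:
        new_name = substitutions.get(row.variable_name)
        if new_name is None:
            yield row  # unknown name: pass the row through unchanged
        else:
            # _replace returns a copy with only the named field changed
            yield row._replace(variable_name=new_name)


# Made-up sample row, not taken from any real feed.
example = Row(None, 270.0, "wind_direction", "degree", "EC_raw", "0000000", 48.4, -123.3)

renamed = list(
    substitute_variable_names([example], {"wind_direction": "wind_from_direction"})
)
untouched = list(substitute_variable_names([example], {}))
```

Here `renamed[0]` differs from `example` only in `variable_name`, and `untouched` holds the original row.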
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| # Defines a mapping between variable names given to us by EC | ||
| # c.a. 2022 and what the variables were named in the PCDS (a.k.a. their | ||
| # "historic" name) | ||
| # Values should be of the form: "name_in_near_real_time_feed": "net_var_name-in-pcds" | ||
| | ||
| 'air_temperature_yesterday_high': 'air_temperature' | ||
| 'air_temperature_yesterday_low': 'air_temperature' | ||
| 'total_precipitation': 'total_precipitation' | ||
| 'wind_direction': 'wind_from_direction' | ||
| 'wind_gust_speed': 'wind_speed' |
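
The mapping above is many-to-one in places (both yesterday-high and yesterday-low map to `air_temperature`) and contains one entry that maps a name to itself. A small, hedged sketch of a check that surfaces both cases follows; the `audit` helper is hypothetical and the dictionary is copied by hand from the YAML rather than loaded from the file.

```python
from collections import defaultdict

# Mapping copied by hand from variable_substitutions.yaml above.
substitutions = {
    "air_temperature_yesterday_high": "air_temperature",
    "air_temperature_yesterday_low": "air_temperature",
    "total_precipitation": "total_precipitation",
    "wind_direction": "wind_from_direction",
    "wind_gust_speed": "wind_speed",
}


def audit(mapping):
    """Return (identity entries, PCDS names shared by several feed names)."""
    identity = sorted(k for k, v in mapping.items() if k == v)
    by_target = defaultdict(list)
    for feed_name, pcds_name in mapping.items():
        by_target[pcds_name].append(feed_name)
    shared = {t: sorted(s) for t, s in by_target.items() if len(s) > 1}
    return identity, shared


identity, shared = audit(substitutions)
```

Whether the shared target is intended is a question for the data model; the check only makes it visible.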
I'm always skeptical when a patch adds LOCs to our repo. :) The approach you've taken here is reasonable, but I don't think that this is (or will be) specific to ECCC. Could you take what you've done here and incorporate it into the `swob_ml` module? Then if other networks change their variable names in the future, we will have an easy place to add variable mappings.
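
One possible shape for that suggestion is sketched below, purely as an illustration: a lookup keyed by network name, so a new network only needs a new entry. The registry `SUBSTITUTIONS_BY_NETWORK`, the function `apply_network_substitutions`, and the stand-in `Row` are all hypothetical and are not part of `swob_ml` as shown in this PR; only the `EC_raw` entries come from the diff.

```python
from collections import namedtuple

# Stand-in for crmprtd.Row; ASSUMED to be a namedtuple with these fields.
Row = namedtuple(
    "Row", "time val variable_name unit network_name station_id lat lon"
)

# Hypothetical registry: network name -> {feed name: PCDS name}.
SUBSTITUTIONS_BY_NETWORK = {
    "EC_raw": {
        "wind_direction": "wind_from_direction",
        "wind_gust_speed": "wind_speed",
    },
}


def apply_network_substitutions(rows, network_name, registry=SUBSTITUTIONS_BY_NETWORK):
    """Rename variables using the mapping registered for network_name, if any."""
    mapping = registry.get(network_name, {})  # no entry: rows pass through
    for row in rows:
        new_name = mapping.get(row.variable_name, row.variable_name)
        yield row._replace(variable_name=new_name)


# Made-up sample rows; "OTHER" is a placeholder network with no mapping.
ec_row = Row(None, 12.0, "wind_gust_speed", "km/h", "EC_raw", "0000000", 49.0, -123.0)
other_row = Row(None, 12.0, "wind_gust_speed", "km/h", "OTHER", "0000001", 49.0, -123.0)

ec_out = list(apply_network_substitutions([ec_row], "EC_raw"))
other_out = list(apply_network_substitutions([other_row], "OTHER"))
```

Networks without a registered mapping are left untouched, so existing callers would see no change in behaviour under this design.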