Migrate from Poetry to uv, bump deps for H100/Python 3.12 #115
Replace Poetry build system with PEP 621 metadata + hatchling, managed by uv.

Key dependency bumps: torch>=2.5 (H100 support), pandas>=2.0 and pillow>=10 (Python 3.12 compat). Move GUI deps (kivy, plyer) to optional extras, dev tools to dependency-groups. Update CI to use uv with a Python 3.10/3.12 test matrix. Add example SLURM script for DRAC Alliance HPC clusters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 Walkthrough

Migrates dependency management from Poetry to the uv toolchain and PEP 621/Hatch build config; updates CI workflows to use uv for Python and test orchestration; adjusts packaging/README guidance; adds a SLURM job script; and fixes a small NumPy API usage in a model post-processing function.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (6)
.github/workflows/test.yml (1)
16-18: Good: the Python version matrix covers the target versions. Testing on both 3.10 and 3.12 is consistent with the PR goals. Consider adding `fail-fast: false` to the strategy so a failure on one Python version doesn't cancel the other; this is useful for catching version-specific issues.

💡 Optional addition

```diff
   strategy:
+    fail-fast: false
     matrix:
       python-version: ["3.10", "3.12"]
```

.github/workflows/test-ml.yml (1)
12-30: ML pipeline workflow only tests Python 3.10; consider adding 3.12 to the matrix. Given that a key motivation for this PR is Python 3.12 + H100 GPU support, the ML pipeline test only runs on 3.10. The unit test workflow (`test.yml`) already tests both 3.10 and 3.12. If the CI resource cost is acceptable, adding a version matrix here would increase confidence that the ML pipeline works on 3.12, especially since torch/torchvision version bumps can behave differently across Python versions.

💡 Suggested change

```diff
 jobs:
   build:
     runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.10", "3.12"]
     steps:
       - uses: actions/checkout@v4
       - name: Install uv
         uses: astral-sh/setup-uv@v6
-      - name: Set up Python 3.10
-        run: uv python install 3.10
+      - name: Set up Python ${{ matrix.python-version }}
+        run: uv python install ${{ matrix.python-version }}
```

scripts/job_adc_worker.sh (1)
46-46: Consider guarding the `.env` source for a clearer error message. With `set -euo pipefail`, a missing `.env` will abort the script with a generic "No such file" error. A guard would surface a more actionable message for users who forget the setup step.

💡 Suggested improvement

```diff
-set -a; source .env; set +a
+if [[ ! -f .env ]]; then
+  echo "ERROR: .env file not found in $(pwd). See one-time setup instructions." >&2
+  exit 1
+fi
+set -a; source .env; set +a
```

README.md (1)
21-21: Python version guidance is narrower than `pyproject.toml` allows. The README says "Python 3.10 or 3.12" but `pyproject.toml` declares `requires-python = ">=3.10,<3.13"`, which also covers 3.11. This is intentional (matching the CI matrix), but worth making explicit, e.g. "Tested with Python 3.10 and 3.12", so users on 3.11 know it's not officially tested but not blocked either.

pyproject.toml (2)
11-36: Consider moving `gradio` and `uvicorn` to optional dependency groups.

`gradio>=4.41` and `uvicorn` are only used in specific features:

- Gradio: only in `trapdata/api/demo.py`, lazy-loaded by the `ami demo` command
- Uvicorn: lazily imported only in the `ami api` command and in `api.py`'s `if __name__` block

Both can safely move to an optional dependency group (e.g. `[project.optional-dependencies]` `demo` or `web`) to reduce the install footprint for headless worker/CLI-only deployments.

`fastapi` is harder to make optional: it is imported at module level in `cli/base.py` for `CLASSIFIER_CHOICES`, making it a hard dependency. Moving it would require refactoring to decouple constant imports from the FastAPI app definition.
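As a sketch of the suggested extras group (the `demo` name is one of the two options floated above; the pins are copied from the current dependency list):

```toml
# Hypothetical "demo" extra; installed only on request,
# e.g. `uv sync --extra demo`
[project.optional-dependencies]
demo = [
    "gradio>=4.41",
    "uvicorn",
]
```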
13-14: Remove `botocore` from explicit dependencies; `boto3` manages it with strict version constraints.

`botocore` is not directly imported in the codebase. `boto3` declares tight version ranges for `botocore` (e.g., `botocore>=1.42.54,<1.43.0` for each boto3 release) to ensure runtime compatibility. Listing both with `>=` floors can cause the resolver to pick incompatible combinations. Let `boto3` handle `botocore` automatically.
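After that removal, the relevant fragment of the dependency list would read (other entries elided, pin copied from the current file):

```toml
[project]
dependencies = [
    # botocore is intentionally not listed: boto3 pins a compatible
    # botocore range for each of its own releases.
    "boto3>=1.33",
]
```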
- Replace np.product() with np.prod() in tracking.py (removed in numpy 2.0)
- Add fail-fast: false to test.yml CI matrix
- Add .env guard in SLURM script for clearer error message
- Remove explicit botocore dep (managed by boto3)
- Clarify Python version wording in README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
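The `np.product()` change above is mechanical; `np.prod` is the long-standing equivalent. A quick sanity check (the shape here is made up, not the real tracking.py values):

```python
import numpy as np

# np.product was a deprecated alias removed in NumPy 2.0;
# np.prod is the supported spelling and computes the same thing:
# the product of all elements.
shape = (3, 4, 2)
print(np.prod(shape))  # 24
```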
The ML pipeline test downloads ~500MB of model weights on every run. Cache ~/.config/AMI/models between runs with a static key (model URLs are hardcoded in source).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
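A minimal sketch of such a cache step (the step name and key are arbitrary; the `~/.config/AMI/models` path comes from the commit message above):

```yaml
- name: Cache model weights
  uses: actions/cache@v4
  with:
    path: ~/.config/AMI/models
    # Static key: the model URLs are hardcoded in source, so the
    # cache only needs manual invalidation (bump the suffix).
    key: ami-model-weights-v1
```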
Actionable comments posted: 2
🧹 Nitpick comments (1)
scripts/job_adc_worker.sh (1)
52-52: Suppress SC1090 with a ShellCheck directive.

🔧 Proposed fix

```diff
-source ~/venvs/adc/bin/activate
+# shellcheck source=/dev/null
+source ~/venvs/adc/bin/activate
```
Inline comments:

In `scripts/job_adc_worker.sh`:

- Line 42: The script enables strict errexit via `set -euo pipefail`, which causes `timeout`'s normal exit code 124 to mark the job as FAILED. Append `|| [[ $? -eq 124 ]]` to each `timeout` invocation (the one near the top of the script and the second around line 65) so that an expected timeout (exit 124) is treated as a clean exit while other non-zero codes still fail; only the `timeout` calls themselves change, not `set -euo pipefail`.
- Line 39: The commented example passes a positional service name to `ami worker register`, which fails because the `register` handler in `trapdata/cli/worker.py` only accepts the optional `--project` flag and reads the service name from the `AMI_ANTENNA_SERVICE_NAME` environment variable. Remove the positional argument (e.g. `"DRAC Worker"` or `"DRAC Worker - $(hostname)"`) and set `AMI_ANTENNA_SERVICE_NAME` in the environment instead; the registration logic appends the hostname automatically.
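The exit-code-124 handling can be checked in isolation; a minimal sketch (the command and durations are placeholders, not the real worker invocation):

```shell
#!/usr/bin/env bash
set -euo pipefail

# `timeout` exits with status 124 when the time limit is reached.
# Under errexit that would abort the script as a failure, so treat
# 124 as a clean exit while still failing on any other non-zero code.
timeout 1 sleep 5 || [[ $? -eq 124 ]]

echo "clean shutdown"
```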
- Remove positional arg from `ami worker register`: service name is read from the AMI_ANTENNA_SERVICE_NAME env var, not a CLI argument
- Handle timeout exit code 124 as a clean exit to avoid SLURM marking the job as FAILED on normal time-limit shutdown
- Add AMI_ANTENNA_SERVICE_NAME to .env example in setup comments

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary

- Replace Poetry with PEP 621 metadata + hatchling, managed by uv; require Python `>=3.10,<3.13`
- Move GUI deps to a `[project.optional-dependencies]` `gui` extra, dev tools to `[dependency-groups]` `dev` (PEP 735)
- Remove `poetry.lock` and `MANIFEST.in`, generate `uv.lock`
- Add example SLURM script (`scripts/job_adc_worker.sh`)

Motivation
The ADC worker needs to run on DRAC Alliance HPC clusters (Fir/Rorqual) with H100 GPUs. torch 2.1.x doesn't support H100 (needs >=2.5), and pandas 1.5.x doesn't build on Python 3.12. Poetry's `[tool.poetry]` format also prevents `uv sync` from working directly. This PR resolves all three blockers.

Supersedes #107.
Test plan

- `uv sync` resolves and installs (147 packages)
- `uv run ami --help` and `uv run ami worker --help` work
- `uv run pytest`: 28 passed, 1 skipped (Python 3.12)
- `uv sync --no-default-groups` installs without dev deps
- `uv sync --extra gui` installs kivy/plyer

🤖 Generated with Claude Code