Migrate from Poetry to uv, bump deps for H100/Python 3.12 #115

Open

mihow wants to merge 4 commits into main from feature/uv-migration

Conversation

@mihow (Collaborator) commented Feb 21, 2026

Summary

  • Replace Poetry build system with PEP 621 metadata + hatchling, managed by uv
  • Bump torch>=2.5 and torchvision>=0.20 (required for H100 GPUs on DRAC clusters)
  • Bump pandas>=2.0 and pillow>=10 (required for Python 3.12 compatibility)
  • Widen Python constraint to >=3.10,<3.13
  • Move GUI deps (kivy, plyer) to [project.optional-dependencies] gui
  • Move dev tools to [dependency-groups] dev (PEP 735)
  • Add explicit deps that were previously only transitive: fastapi, uvicorn, requests, numpy
  • Delete poetry.lock and MANIFEST.in, generate uv.lock
  • Update CI workflows to use uv with Python 3.10 + 3.12 test matrix
  • Add example SLURM job script for DRAC Alliance HPC clusters (scripts/job_adc_worker.sh)
  • Update README install instructions for uv
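The bullets above translate into a pyproject layout roughly like the following. This is an illustrative sketch, not the actual file: the package name, pin list, and dev group contents are assumptions.

```toml
[project]
name = "trapdata"                # illustrative name
requires-python = ">=3.10,<3.13"
dependencies = [
    "torch>=2.5",                # H100 support
    "torchvision>=0.20",
    "pandas>=2.0",               # Python 3.12 compat
    "pillow>=10",
    # previously transitive, now explicit:
    "fastapi", "uvicorn", "requests", "numpy",
]

[project.optional-dependencies]
gui = ["kivy", "plyer"]

# PEP 735 dependency groups; installed by default with `uv sync`,
# skipped with `uv sync --no-default-groups`
[dependency-groups]
dev = ["pytest"]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
```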

Motivation

The ADC worker needs to run on DRAC Alliance HPC clusters (Fir/Rorqual) with H100 GPUs. torch 2.1.x doesn't support H100 (needs >=2.5), and pandas 1.5.x doesn't build on Python 3.12. Poetry's [tool.poetry] format also prevents uv sync from working directly. This PR resolves all three blockers.

Supersedes #107.

Test plan

  • uv sync resolves and installs (147 packages)
  • uv run ami --help and uv run ami worker --help work
  • uv run pytest — 28 passed, 1 skipped (Python 3.12)
  • uv sync --no-default-groups installs without dev deps
  • uv sync --extra gui installs kivy/plyer
  • CI passes for both Python 3.10 and 3.12

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Tests now run on Python 3.10 and 3.12.
    • Added an HPC batch script for DRAC cluster runs.
  • Documentation

    • Updated setup and testing instructions; recommends uv for environment management.
    • Specifies supported Python range (>=3.10,<3.13).
  • Chores

    • Migrated project packaging/configuration and build backend; updated install/test command flow to use uv.
    • Added/updated project metadata and CLI entry points.

Replace Poetry build system with PEP 621 metadata + hatchling,
managed by uv. Key dependency bumps: torch>=2.5 (H100 support),
pandas>=2.0 and pillow>=10 (Python 3.12 compat). Move GUI deps
(kivy, plyer) to optional extras, dev tools to dependency-groups.

Update CI to use uv with a Python 3.10/3.12 test matrix.
Add example SLURM script for DRAC Alliance HPC clusters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot commented Feb 21, 2026

Warning

Rate limit exceeded

@mihow has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 45 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Migrates dependency management from Poetry to the uv toolchain and PEP 621/Hatch build config; updates CI workflows to use uv for Python and test orchestration; adjusts packaging/README guidance; adds a SLURM job script; and fixes a small NumPy API usage in a model post-processing function.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Build System & Project metadata**: `pyproject.toml` | Replaces Poetry config with a PEP 621 `[project]` table, converts authors/license formats, adds `requires-python` and `urls`, moves dependencies to `dependencies` and `project.optional-dependencies`, adds `project.scripts`, and switches the build backend to Hatch (hatchling). |
| **CI/CD Workflows**: `.github/workflows/test.yml`, `.github/workflows/test-ml.yml` | Replaces Python setup and Poetry usage with `astral-sh/setup-uv` steps and `uv python install` / `uv sync` / `uv run` commands; `test.yml` adds a Python matrix (3.10, 3.12); checkout action updated to v4. |
| **Docs & Packaging Manifest**: `README.md`, `MANIFEST.in` | README updated to require Python >=3.10,<3.13 and to document uv-based developer/test commands; MANIFEST.in entries for README.rst and requirements.txt removed. |
| **HPC Job Script**: `scripts/job_adc_worker.sh` | New SLURM batch script for the ADC Antenna worker: SBATCH directives, strict shell options, module/env setup, .env loading, idempotent worker registration, and timed `ami worker` execution with graceful shutdown handling. |
| **Model tweak**: `trapdata/ml/models/tracking.py` | Uses `np.prod` instead of `np.product` when computing the total feature count in `post_process_batch` (API usage correction). |
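The tracking.py tweak is the standard NumPy 2.0 migration: `np.product()` was removed in NumPy 2.0, and `np.prod()` is the drop-in replacement. A minimal sketch (the array shape here is illustrative, not the actual tensors in `post_process_batch`):

```python
import numpy as np

# Hypothetical batch of feature maps; only the shape matters for this sketch.
features = np.zeros((8, 3, 224, 224))

# np.product() was removed in NumPy 2.0; np.prod() is the supported spelling.
total_per_item = int(np.prod(features.shape[1:]))  # 3 * 224 * 224
print(total_per_item)  # 150528
```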

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 I hopped from Poetry to uv's bright trail,
Updated scripts, configs, and Hatch for the sail.
CI hums, jobs queue on clusters so grand,
NumPy nibbles fixed by a careful hand.
Carrots and commits—code gardens prevail! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the two main changes: migration from Poetry to uv and bumping dependencies for H100/Python 3.12 support. |


coderabbitai bot left a comment

🧹 Nitpick comments (6)
.github/workflows/test.yml (1)

16-18: Good: Python version matrix covers the target versions.

Testing on both 3.10 and 3.12 is consistent with the PR goals. Consider adding fail-fast: false to the strategy so a failure on one Python version doesn't cancel the other — useful for catching version-specific issues.

💡 Optional addition
     strategy:
+      fail-fast: false
       matrix:
         python-version: ["3.10", "3.12"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test.yml around lines 16 - 18, Add fail-fast: false to the
GitHub Actions job strategy so a failure on one Python matrix entry doesn't
cancel the others; update the strategy block (the same block that contains
matrix and python-version) to include fail-fast: false immediately under
strategy to ensure all matrix jobs run to completion.
.github/workflows/test-ml.yml (1)

12-30: ML pipeline workflow only tests Python 3.10 — consider adding 3.12 to the matrix.

Given that a key motivation for this PR is Python 3.12 + H100 GPU support, the ML pipeline test is only run on 3.10. The unit test workflow (test.yml) already tests both 3.10 and 3.12. If CI resource cost is acceptable, adding a version matrix here would increase confidence that the ML pipeline works on 3.12 — especially since torch/torchvision version bumps can behave differently across Python versions.

💡 Suggested change
 jobs:
   build:
     runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.10", "3.12"]

     steps:
       - uses: actions/checkout@v4

       - name: Install uv
         uses: astral-sh/setup-uv@v6

-      - name: Set up Python 3.10
-        run: uv python install 3.10
+      - name: Set up Python ${{ matrix.python-version }}
+        run: uv python install ${{ matrix.python-version }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-ml.yml around lines 12 - 30, Update the ML pipeline
workflow job "build" to run on a Python version matrix that includes 3.10 and
3.12 so the pipeline steps (the "Set up Python 3.10" step that runs `uv python
install 3.10`, the dependency install `uv sync --no-default-groups`, and the
test run `uv run ami test pipeline`) execute for both interpreters; implement
this by replacing the single-version setup with a matrix strategy (e.g., matrix
of python versions) and reference the matrix variable in the install step and
any Python-specific commands so the job runs for each version.
scripts/job_adc_worker.sh (1)

46-46: Consider guarding the .env source for a clearer error message.

With set -euo pipefail, a missing .env will abort the script with a generic "No such file" error. A guard would surface a more actionable message for users who forget the setup step.

💡 Suggested improvement
-set -a; source .env; set +a
+if [[ ! -f .env ]]; then
+    echo "ERROR: .env file not found in $(pwd). See one-time setup instructions." >&2
+    exit 1
+fi
+set -a; source .env; set +a
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/job_adc_worker.sh` at line 46, Guard the "set -a; source .env; set
+a" step by first checking for the existence/readability of the .env file and,
if missing, print a clear actionable error and exit (respecting set -euo
pipefail). Replace the bare "source .env" invocation with a conditional that
tests -f or -r on ".env" and emits a descriptive message like "Missing .env —
please copy .env.example and configure it" before exiting non‑zero, so callers
get a clear failure instead of a generic "No such file" error.
README.md (1)

21-21: Python version guidance is narrower than pyproject.toml allows.

The README says "Python 3.10 or 3.12" but pyproject.toml declares requires-python = ">=3.10,<3.13", which also covers 3.11. This is intentional (matching the CI matrix), but worth making explicit — e.g., "Tested with Python 3.10 and 3.12" — so users on 3.11 know it's not officially tested but not blocked either.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` at line 21, The README line "Requires Python 3.10 or 3.12."
conflicts with pyproject.toml's requires-python = ">=3.10,<3.13"; update the
README to state that Python >=3.10 and <3.13 is supported and clarify that the
project is tested specifically on Python 3.10 and 3.12 (so 3.11 is supported but
not officially tested). Replace the exact text "Requires Python 3.10 or 3.12."
with this clarified phrasing and ensure the change references the same README
entry so it stays consistent with pyproject.toml.
pyproject.toml (2)

11-36: Consider moving gradio and uvicorn to optional dependency groups.

gradio>=4.41 and uvicorn are only used in specific features:

  • Gradio: Only in trapdata/api/demo.py, lazy-loaded by the ami demo command
  • Uvicorn: Lazily imported only in the ami api command and in api.py's if __name__ block

Both can safely move to an optional dependency group (e.g., [project.optional-dependencies] demo or web) to reduce install footprint for headless worker/CLI-only deployments.

fastapi is harder to make optional—it's imported at the module level in cli/base.py for CLASSIFIER_CHOICES, making it a hard dependency. Moving it would require refactoring to decouple constant imports from the FastAPI app definition.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 11 - 36, Remove "gradio" and "uvicorn" from the
main dependencies array and add them to a new optional dependency group (e.g.,
create [project.optional-dependencies] with a "demo" or "web" extra) so they are
only installed when requested; keep "fastapi" in core since it is imported at
module level in cli/base.py for CLASSIFIER_CHOICES, and ensure any references to
lazy imports (trapdata/api/demo.py, ami demo command, ami api command, and
api.py's __main__ block) continue to work with lazy imports—also update
README/installation docs and CI/test setups to install the extra when running
demo/web-related tests.

13-14: Remove botocore from explicit dependencies — boto3 manages it with strict version constraints.

botocore is not directly imported in the codebase. boto3 declares tight version ranges for botocore (e.g., botocore>=1.42.54,<1.43.0 for each boto3 release) to ensure runtime compatibility. Listing both with >= floors can cause the resolver to pick incompatible combinations. Let boto3 handle botocore automatically.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 13 - 14, Remove the explicit "botocore>=1.33"
dependency from the pyproject.toml dependencies list so boto3 can manage
botocore's version constraints; specifically delete the "botocore>=1.33" entry
that appears alongside "boto3>=1.33" and then update your lockfile (e.g., run
your project's dependency lock/update command such as poetry lock or pip-compile
/ pip-tools) to refresh resolved versions.

mihow and others added 2 commits February 20, 2026 23:55
- Replace np.product() with np.prod() in tracking.py (removed in numpy 2.0)
- Add fail-fast: false to test.yml CI matrix
- Add .env guard in SLURM script for clearer error message
- Remove explicit botocore dep (managed by boto3)
- Clarify Python version wording in README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ML pipeline test downloads ~500MB of model weights on every run.
Cache ~/.config/AMI/models between runs with a static key (model URLs
are hardcoded in source).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
scripts/job_adc_worker.sh (1)

52-52: Suppress SC1090 with a ShellCheck directive.

🔧 Proposed fix
-source ~/venvs/adc/bin/activate
+# shellcheck source=/dev/null
+source ~/venvs/adc/bin/activate
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/job_adc_worker.sh` at line 52, Suppress ShellCheck SC1090 for the
virtualenv activation line by adding a ShellCheck directive immediately above
the `source ~/venvs/adc/bin/activate` command (e.g., `# shellcheck
source=/dev/null` or `# shellcheck disable=SC1090`) so the linter knows this
external file inclusion is intentional without changing the `source` call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/job_adc_worker.sh`:
- Line 42: The script enables strict errexit via set -euo pipefail which causes
timeout's normal exit code 124 to mark the job as FAILED; update each call to
the timeout command (the timeout invocations at the current locations around
lines with timeout) to append || [[ $? -eq 124 ]] so that an expected timeout
(exit 124) is treated as a clean exit while other non-zero codes still fail;
apply this change to both timeout usages mentioned (the timeout invocation near
the set -euo pipefail and the second timeout around line 65) ensuring you only
modify the timeout call itself (no change to set -euo pipefail).
- Line 39: The commented example in scripts/job_adc_worker.sh incorrectly passes
a positional service name to the CLI command `ami worker register`, which fails
because the `register` handler in `trapdata/cli/worker.py` only accepts the
optional `--project` flag and reads the service name from the
AMI_ANTENNA_SERVICE_NAME environment variable; remove the positional argument
(e.g. `"DRAC Worker"` or `"DRAC Worker - $(hostname)"`) from the commented `ami
worker register` invocation and update the comment to instruct setting
AMI_ANTENNA_SERVICE_NAME in the environment instead (the registration logic will
append the hostname automatically).


- Remove positional arg from `ami worker register` — service name is
  read from AMI_ANTENNA_SERVICE_NAME env var, not a CLI argument
- Handle timeout exit code 124 as clean exit to avoid SLURM marking
  the job as FAILED on normal time-limit shutdown
- Add AMI_ANTENNA_SERVICE_NAME to .env example in setup comments

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
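The timeout handling described in the commit above can be sketched as follows. The wrapped command is a stand-in: the real script wraps `timeout "$LIMIT"` around the `ami worker` invocation, and the function name here is hypothetical.

```shell
#!/bin/bash
set -euo pipefail

run_with_timeout() {
    # timeout(1) exits with status 124 when the time limit is reached; treat
    # that as a clean shutdown so SLURM does not mark the job FAILED, while
    # any other non-zero status still aborts under `set -e`.
    "$@" || [ $? -eq 124 ]
}

# Simulated time-limit hit (stand-in for: timeout "$LIMIT" uv run ami worker ...)
run_with_timeout bash -c 'exit 124'
echo "time-limit shutdown treated as clean exit"
```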