Skip to content

Send processing_service_name in requests to antenna#108

Merged
mihow merged 6 commits intoRolnickLab:mainfrom
carlosgjs:carlosg/cfgname
Feb 19, 2026
Merged

Send processing_service_name in requests to antenna#108
mihow merged 6 commits intoRolnickLab:mainfrom
carlosgjs:carlosg/cfgname

Conversation

@carlosgjs
Copy link
Collaborator

@carlosgjs carlosgjs commented Feb 4, 2026

Summary

Adds a configurable processing_service_name setting (default: "AMI Data Companion") that is sent as a query parameter to Antenna job endpoints. This allows Antenna to log which processing service is making requests, for auditing and debugging.

The service name is read from settings/.env and propagated through job fetching, data loading, and result submission.

Companion PR

Server-side changes: RolnickLab/antenna#1117

Related Issues

Support for RolnickLab/antenna#1112
Support for RolnickLab/antenna#1087

Summary by CodeRabbit

  • New Features

    • Added a configurable Antenna service name (default: "AMI Data Companion") in settings and .env example; the effective name is logged at startup and propagated through job fetching, data loading, and result submission.
  • Bug Fixes

    • Validation now rejects empty or blank service names with clearer guidance for configuration.
  • Chores

    • Register command now reads the service name from application settings instead of requiring a command-line argument.
  • Tests

    • Updated tests to account for the propagated service name in worker flows.

@coderabbitai
Copy link

coderabbitai bot commented Feb 4, 2026

Warning

Rate limit exceeded

@carlosgjs has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 24 minutes and 56 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Adds a configurable antenna service name (antenna_service_name / AMI_ANTENNA_SERVICE_NAME) and a helper to produce a normalized full service name (including hostname); the parameter is validated and propagated through settings, CLI register, antenna client, datasets, worker, and registration flows.

Changes

Cohort / File(s) Summary
Configuration & Settings
\.env.example, trapdata/settings.py
Added AMI_ANTENNA_SERVICE_NAME to .env.example and new antenna_service_name on Settings (default "AMI Data Companion") with UI metadata.
Antenna Client
trapdata/antenna/client.py
Added get_full_service_name(service_name: str) -> str (uses hostname). Extended get_jobs() and post_batch_results() signatures to accept processing_service_name and include it as a query parameter.
Antenna Data Loading
trapdata/antenna/datasets.py
RESTDataset.__init__() and get_rest_dataloader() now accept and store/forward processing_service_name; _fetch_tasks includes it in request query params.
Antenna Registration
trapdata/antenna/registration.py
Replaced direct socket.gethostname() usage with get_full_service_name(); tightened validation to reject empty/blank service_name and produce clearer error guidance.
Worker Job Processing
trapdata/antenna/worker.py
Validate/read antenna_service_name, compute full service name via get_full_service_name, log it, and thread processing_service_name through get_jobs(), dataloader creation, _process_job() (signature updated), and post_batch_results().
CLI Worker Command & Tests
trapdata/cli/worker.py, trapdata/antenna/tests/test_worker.py
register() CLI command no longer accepts a name argument and now reads settings.antenna_service_name; tests updated to pass the new service/worker name argument to get_jobs() and _process_job().

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI (register)
    participant Settings as Settings
    participant Registration as Antenna Registration
    participant Worker as Worker
    participant Client as Antenna Client
    participant REST as REST API

    CLI->>Settings: read_settings()
    Settings-->>CLI: antenna_service_name
    CLI->>Registration: register_pipelines(service_name)
    Registration->>Client: get_full_service_name(service_name)
    Client-->>Registration: full_service_name
    Registration->>Worker: start with processing_service_name
    Worker->>Client: get_jobs(pipeline, processing_service_name)
    Client->>REST: GET /jobs?processing_service_name=...
    REST-->>Client: jobs
    Client-->>Worker: jobs
    Worker->>REST: fetch batch via RESTDataset (processing_service_name)
    REST-->>Worker: tasks
    Worker->>Client: post_batch_results(job_id, results, processing_service_name)
    Client->>REST: POST /jobs/{id}/results?processing_service_name=...
    REST-->>Client: ack
    Client-->>Worker: success/failure
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I tuck a name into my pouch tonight,
"AMI Data Companion" shining bright.
From settings to worker, through API lanes,
I carry the name across hops and trains.
Hooray — I humberskip with tiny delights! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately describes the main change: introducing and sending a processing_service_name parameter throughout the antenna request workflow.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
trapdata/antenna/client.py (1)

25-74: ⚠️ Potential issue | 🔴 Critical

Update test call sites to include the required processing_service_name parameter.

Both get_jobs and post_batch_results now require processing_service_name. The production calls in trapdata/antenna/worker.py are correctly updated, but two test calls in trapdata/antenna/tests/test_worker.py are missing this parameter:

  • Line 197: get_jobs("http://testserver/api/v2", "test-token", "moths_2024")
  • Line 421: get_jobs("http://testserver/api/v2", "test-token", pipeline_slug,)

Both need to include processing_service_name as the fourth argument.

🤖 Fix all issues with AI agents
In @.env.example:
- Line 17: The AMI_ANTENNA_SERVICE_NAME value in .env.example contains spaces
and should be quoted to avoid dotenv parsing issues; update the
AMI_ANTENNA_SERVICE_NAME entry so its value is wrapped in quotes (e.g., "AMI
Data Companion") to satisfy dotenv-linter and ensure correct parsing by dotenv
parsers.

In `@trapdata/antenna/datasets.py`:
- Around line 89-93: The params dict in the method building the request (using
url = f"{self.base_url.rstrip('/')}/jobs/{self.job_id}/tasks") always includes
"processing_service_name" even when empty; change the code that constructs
params (where "batch": self.batch_size and "processing_service_name":
self.processing_service_name are set) to only add the "processing_service_name"
key when self.processing_service_name is truthy (non-empty) so the request does
not send an empty-string filter to the API.

Copy link
Collaborator

@mihow mihow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
trapdata/antenna/tests/test_worker.py (1)

222-224: Mock server silently drops processing_service_name — tests don't validate the parameter is forwarded.

The get_jobs endpoint in antenna_api_server.py (lines 29-43) only declares (pipeline__slug, ids_only, incomplete_only). FastAPI ignores undeclared query parameters, so the new processing_service_name query param sent by the client is silently discarded. The tests will continue to pass even if the client stops sending the parameter entirely.

Consider adding processing_service_name: str = "" to the mock server's get_jobs signature and asserting on it in test_returns_job_ids (and the end-to-end test). This would make the test an actual contract test for the new parameter.

♻️ Suggested update to the mock server endpoint
-def get_jobs(pipeline__slug: str, ids_only: int, incomplete_only: int):
+def get_jobs(
+    pipeline__slug: str,
+    ids_only: int,
+    incomplete_only: int,
+    processing_service_name: str = "",
+):

Then in test_returns_job_ids:

         with patch_antenna_api_requests(self.antenna_client):
             result = get_jobs(
                 "http://testserver/api/v2", "test-token", "moths_2024", "Test Worker"
             )
 
+        # Optionally: assert the mock captured the service name
         assert result == [10, 20, 30]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@trapdata/antenna/tests/test_worker.py` around lines 222 - 224, The mock
server's get_jobs handler is dropping the processing_service_name query
parameter; update the get_jobs function signature to accept
processing_service_name: str = "" and then add assertions in
test_returns_job_ids (and the end-to-end test) to verify the handler received
the expected value (e.g., "Test Worker") so the tests validate that the client
forwards processing_service_name; update any call sites or test setup to pass
the expected string and fail the test if the parameter is absent or different.
trapdata/antenna/worker.py (2)

50-52: Duplicate get_full_service_name call and log line in single-process path.

In the CPU / single-GPU code path, run_worker computes full_service_name and logs it at line 52, then immediately calls _worker_loop, which calls read_settings() again and logs the same message at line 91 — two identical log lines in the same process, which can confuse operators.

full_service_name is a plain str and is picklable, so it can be passed through mp.spawn's args. This also eliminates the minor re-read window where antenna_service_name could theoretically differ between the run_worker validation and the child-process read_settings().

♻️ Suggested refactor
 def run_worker(pipelines: list[str]):
     ...
     # Build full service name with hostname
     full_service_name = get_full_service_name(settings.antenna_service_name)
     logger.info(f"Running worker as: {full_service_name}")
 
     gpu_count = torch.cuda.device_count()
 
     if gpu_count > 1:
         logger.info(f"Found {gpu_count} GPUs, spawning one AMI worker instance per GPU")
         mp.spawn(
             _worker_loop,
-            args=(pipelines,),
+            args=(pipelines, full_service_name),
             nprocs=gpu_count,
             join=True,
         )
     else:
         ...
-        _worker_loop(0, pipelines)
+        _worker_loop(0, pipelines, full_service_name)
 
 
-def _worker_loop(gpu_id: int, pipelines: list[str]):
+def _worker_loop(gpu_id: int, pipelines: list[str], full_service_name: str):
     settings = read_settings()
     ...
-    # Build full service name with hostname
-    full_service_name = get_full_service_name(settings.antenna_service_name)
-    logger.info(f"Running worker as: {full_service_name}")
+    logger.info(f"[GPU {gpu_id}] Running worker as: {full_service_name}")
 
     while True:
         ...

Also applies to: 89-91

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@trapdata/antenna/worker.py` around lines 50 - 52, run_worker currently
computes full_service_name via
get_full_service_name(settings.antenna_service_name), logs it, then calls
mp.spawn which starts _worker_loop that calls read_settings() and logs the same
full_service_name again; remove the duplicate by computing full_service_name
once in run_worker and pass that string into mp.spawn via the args so child
processes use the provided full_service_name instead of calling read_settings()
again, and delete the redundant logger.info in _worker_loop (or conditionalize
it) so only the initial process logs the service name; update references to
get_full_service_name, run_worker, _worker_loop, and read_settings accordingly.

43-48: Ruff TRY003 — long inline message; consider a custom exception or a shorter message.

The static analysis tool flags that exception messages this long should live in a custom exception class. In "chill" mode this is a low-priority nit, but it's an easy clean-up.

♻️ Suggested quick fix (shorter message)
-    if not settings.antenna_service_name or not settings.antenna_service_name.strip():
-        raise ValueError(
-            "AMI_ANTENNA_SERVICE_NAME configuration setting must be set. "
-            "Configure it via environment variable or .env file."
-        )
+    if not settings.antenna_service_name or not settings.antenna_service_name.strip():
+        raise ValueError("AMI_ANTENNA_SERVICE_NAME must be set (env var or .env file).")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@trapdata/antenna/worker.py` around lines 43 - 48, Replace the long inline
ValueError message raised when validating settings.antenna_service_name in
worker.py with either a shorter, single-line message (e.g.,
"AMI_ANTENNA_SERVICE_NAME must be set") or raise a custom exception type (e.g.,
AntennaConfigError) defined near the top of the module; update the raise site to
use the new exception class or shorter text and add the custom exception class
definition (subclassing ValueError) so the intent remains clear while satisfying
TRY003.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@trapdata/antenna/tests/test_worker.py`:
- Around line 222-224: The mock server's get_jobs handler is dropping the
processing_service_name query parameter; update the get_jobs function signature
to accept processing_service_name: str = "" and then add assertions in
test_returns_job_ids (and the end-to-end test) to verify the handler received
the expected value (e.g., "Test Worker") so the tests validate that the client
forwards processing_service_name; update any call sites or test setup to pass
the expected string and fail the test if the parameter is absent or different.

In `@trapdata/antenna/worker.py`:
- Around line 50-52: run_worker currently computes full_service_name via
get_full_service_name(settings.antenna_service_name), logs it, then calls
mp.spawn which starts _worker_loop that calls read_settings() and logs the same
full_service_name again; remove the duplicate by computing full_service_name
once in run_worker and pass that string into mp.spawn via the args so child
processes use the provided full_service_name instead of calling read_settings()
again, and delete the redundant logger.info in _worker_loop (or conditionalize
it) so only the initial process logs the service name; update references to
get_full_service_name, run_worker, _worker_loop, and read_settings accordingly.
- Around line 43-48: Replace the long inline ValueError message raised when
validating settings.antenna_service_name in worker.py with either a shorter,
single-line message (e.g., "AMI_ANTENNA_SERVICE_NAME must be set") or raise a
custom exception type (e.g., AntennaConfigError) defined near the top of the
module; update the raise site to use the new exception class or shorter text and
add the custom exception class definition (subclassing ValueError) so the intent
remains clear while satisfying TRY003.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
trapdata/antenna/worker.py (1)

43-48: Ruff TRY003: consider a dedicated exception or shorter inline message.

Both the auth-token check (lines 37–41) and this block use long ValueError messages inline. Ruff TRY003 flags these; moving the error text into a custom exception class (or a module-level constant string) would silence it.

♻️ One lightweight approach
+class MissingServiceNameError(ValueError):
+    """Raised when AMI_ANTENNA_SERVICE_NAME is not configured."""

-    if not settings.antenna_service_name or not settings.antenna_service_name.strip():
-        raise ValueError(
-            "AMI_ANTENNA_SERVICE_NAME configuration setting must be set. "
-            "Configure it via environment variable or .env file."
-        )
+    if not settings.antenna_service_name or not settings.antenna_service_name.strip():
+        raise MissingServiceNameError(
+            "AMI_ANTENNA_SERVICE_NAME configuration setting must be set. "
+            "Configure it via environment variable or .env file."
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@trapdata/antenna/worker.py` around lines 43 - 48, Replace the inline long
ValueError message raised for settings.antenna_service_name with a shorter raise
that references a module-level constant or a dedicated exception: define a
module-level constant (e.g., ANTENNA_SERVICE_NAME_ERROR = "...") or a custom
exception class (e.g., ConfigurationError) at top of the module, move the full
explanatory message into that constant, and then change the raise in the
validation block to raise ConfigurationError(ANTENNA_SERVICE_NAME_ERROR) or
raise ValueError(ANTENNA_SERVICE_NAME_ERROR) with a short inline message if you
prefer; update the auth-token check to use the same pattern for consistency
(referencing settings.antenna_service_name, the validation block that currently
raises ValueError, and the auth-token check).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@trapdata/antenna/worker.py`:
- Around line 43-48: Replace the inline long ValueError message raised for
settings.antenna_service_name with a shorter raise that references a
module-level constant or a dedicated exception: define a module-level constant
(e.g., ANTENNA_SERVICE_NAME_ERROR = "...") or a custom exception class (e.g.,
ConfigurationError) at top of the module, move the full explanatory message into
that constant, and then change the raise in the validation block to raise
ConfigurationError(ANTENNA_SERVICE_NAME_ERROR) or raise
ValueError(ANTENNA_SERVICE_NAME_ERROR) with a short inline message if you
prefer; update the auth-token check to use the same pattern for consistency
(referencing settings.antenna_service_name, the validation block that currently
raises ValueError, and the auth-token check).

@mihow mihow merged commit 1faa7d0 into RolnickLab:main Feb 19, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments