Send processing_service_name in requests to antenna#108
Send processing_service_name in requests to antenna#108mihow merged 6 commits intoRolnickLab:mainfrom
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughAdds a configurable antenna service name ( Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI (register)
participant Settings as Settings
participant Registration as Antenna Registration
participant Worker as Worker
participant Client as Antenna Client
participant REST as REST API
CLI->>Settings: read_settings()
Settings-->>CLI: antenna_service_name
CLI->>Registration: register_pipelines(service_name)
Registration->>Client: get_full_service_name(service_name)
Client-->>Registration: full_service_name
Registration->>Worker: start with processing_service_name
Worker->>Client: get_jobs(pipeline, processing_service_name)
Client->>REST: GET /jobs?processing_service_name=...
REST-->>Client: jobs
Client-->>Worker: jobs
Worker->>REST: fetch batch via RESTDataset (processing_service_name)
REST-->>Worker: tasks
Worker->>Client: post_batch_results(job_id, results, processing_service_name)
Client->>REST: POST /jobs/{id}/results?processing_service_name=...
REST-->>Client: ack
Client-->>Worker: success/failure
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
trapdata/antenna/client.py (1)
25-74:⚠️ Potential issue | 🔴 CriticalUpdate test call sites to include the required
processing_service_nameparameter.Both
get_jobsandpost_batch_resultsnow requireprocessing_service_name. The production calls intrapdata/antenna/worker.pyare correctly updated, but two test calls intrapdata/antenna/tests/test_worker.pyare missing this parameter:
- Line 197:
get_jobs("http://testserver/api/v2", "test-token", "moths_2024")- Line 421:
get_jobs("http://testserver/api/v2", "test-token", pipeline_slug,)Both need to include
processing_service_nameas the fourth argument.
🤖 Fix all issues with AI agents
In @.env.example:
- Line 17: The AMI_ANTENNA_SERVICE_NAME value in .env.example contains spaces
and should be quoted to avoid dotenv parsing issues; update the
AMI_ANTENNA_SERVICE_NAME entry so its value is wrapped in quotes (e.g., "AMI
Data Companion") to satisfy dotenv-linter and ensure correct parsing by dotenv
parsers.
In `@trapdata/antenna/datasets.py`:
- Around line 89-93: The params dict in the method building the request (using
url = f"{self.base_url.rstrip('/')}/jobs/{self.job_id}/tasks") always includes
"processing_service_name" even when empty; change the code that constructs
params (where "batch": self.batch_size and "processing_service_name":
self.processing_service_name are set) to only add the "processing_service_name"
key when self.processing_service_name is truthy (non-empty) so the request does
not send an empty-string filter to the API.
There was a problem hiding this comment.
🧹 Nitpick comments (3)
trapdata/antenna/tests/test_worker.py (1)
222-224: Mock server silently dropsprocessing_service_name— tests don't validate the parameter is forwarded.The
get_jobsendpoint inantenna_api_server.py(lines 29-43) only declares(pipeline__slug, ids_only, incomplete_only). FastAPI ignores undeclared query parameters, so the newprocessing_service_namequery param sent by the client is silently discarded. The tests will continue to pass even if the client stops sending the parameter entirely.Consider adding
processing_service_name: str = ""to the mock server'sget_jobssignature and asserting on it intest_returns_job_ids(and the end-to-end test). This would make the test an actual contract test for the new parameter.♻️ Suggested update to the mock server endpoint
-def get_jobs(pipeline__slug: str, ids_only: int, incomplete_only: int): +def get_jobs( + pipeline__slug: str, + ids_only: int, + incomplete_only: int, + processing_service_name: str = "", +):Then in
test_returns_job_ids:with patch_antenna_api_requests(self.antenna_client): result = get_jobs( "http://testserver/api/v2", "test-token", "moths_2024", "Test Worker" ) + # Optionally: assert the mock captured the service name assert result == [10, 20, 30]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@trapdata/antenna/tests/test_worker.py` around lines 222 - 224, The mock server's get_jobs handler is dropping the processing_service_name query parameter; update the get_jobs function signature to accept processing_service_name: str = "" and then add assertions in test_returns_job_ids (and the end-to-end test) to verify the handler received the expected value (e.g., "Test Worker") so the tests validate that the client forwards processing_service_name; update any call sites or test setup to pass the expected string and fail the test if the parameter is absent or different.trapdata/antenna/worker.py (2)
50-52: Duplicateget_full_service_namecall and log line in single-process path.In the CPU / single-GPU code path,
run_workercomputesfull_service_nameand logs it at line 52, then immediately calls_worker_loop, which callsread_settings()again and logs the same message at line 91 — two identical log lines in the same process, which can confuse operators.
full_service_nameis a plainstrand is picklable, so it can be passed throughmp.spawn'sargs. This also eliminates the minor re-read window whereantenna_service_namecould theoretically differ between therun_workervalidation and the child-processread_settings().♻️ Suggested refactor
def run_worker(pipelines: list[str]): ... # Build full service name with hostname full_service_name = get_full_service_name(settings.antenna_service_name) logger.info(f"Running worker as: {full_service_name}") gpu_count = torch.cuda.device_count() if gpu_count > 1: logger.info(f"Found {gpu_count} GPUs, spawning one AMI worker instance per GPU") mp.spawn( _worker_loop, - args=(pipelines,), + args=(pipelines, full_service_name), nprocs=gpu_count, join=True, ) else: ... - _worker_loop(0, pipelines) + _worker_loop(0, pipelines, full_service_name) -def _worker_loop(gpu_id: int, pipelines: list[str]): +def _worker_loop(gpu_id: int, pipelines: list[str], full_service_name: str): settings = read_settings() ... - # Build full service name with hostname - full_service_name = get_full_service_name(settings.antenna_service_name) - logger.info(f"Running worker as: {full_service_name}") + logger.info(f"[GPU {gpu_id}] Running worker as: {full_service_name}") while True: ...Also applies to: 89-91
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@trapdata/antenna/worker.py` around lines 50 - 52, run_worker currently computes full_service_name via get_full_service_name(settings.antenna_service_name), logs it, then calls mp.spawn which starts _worker_loop that calls read_settings() and logs the same full_service_name again; remove the duplicate by computing full_service_name once in run_worker and pass that string into mp.spawn via the args so child processes use the provided full_service_name instead of calling read_settings() again, and delete the redundant logger.info in _worker_loop (or conditionalize it) so only the initial process logs the service name; update references to get_full_service_name, run_worker, _worker_loop, and read_settings accordingly.
43-48: Ruff TRY003 — long inline message; consider a custom exception or a shorter message.The static analysis tool flags that exception messages this long should live in a custom exception class. In "chill" mode this is a low-priority nit, but it's an easy clean-up.
♻️ Suggested quick fix (shorter message)
- if not settings.antenna_service_name or not settings.antenna_service_name.strip(): - raise ValueError( - "AMI_ANTENNA_SERVICE_NAME configuration setting must be set. " - "Configure it via environment variable or .env file." - ) + if not settings.antenna_service_name or not settings.antenna_service_name.strip(): + raise ValueError("AMI_ANTENNA_SERVICE_NAME must be set (env var or .env file).")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@trapdata/antenna/worker.py` around lines 43 - 48, Replace the long inline ValueError message raised when validating settings.antenna_service_name in worker.py with either a shorter, single-line message (e.g., "AMI_ANTENNA_SERVICE_NAME must be set") or raise a custom exception type (e.g., AntennaConfigError) defined near the top of the module; update the raise site to use the new exception class or shorter text and add the custom exception class definition (subclassing ValueError) so the intent remains clear while satisfying TRY003.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@trapdata/antenna/tests/test_worker.py`:
- Around line 222-224: The mock server's get_jobs handler is dropping the
processing_service_name query parameter; update the get_jobs function signature
to accept processing_service_name: str = "" and then add assertions in
test_returns_job_ids (and the end-to-end test) to verify the handler received
the expected value (e.g., "Test Worker") so the tests validate that the client
forwards processing_service_name; update any call sites or test setup to pass
the expected string and fail the test if the parameter is absent or different.
In `@trapdata/antenna/worker.py`:
- Around line 50-52: run_worker currently computes full_service_name via
get_full_service_name(settings.antenna_service_name), logs it, then calls
mp.spawn which starts _worker_loop that calls read_settings() and logs the same
full_service_name again; remove the duplicate by computing full_service_name
once in run_worker and pass that string into mp.spawn via the args so child
processes use the provided full_service_name instead of calling read_settings()
again, and delete the redundant logger.info in _worker_loop (or conditionalize
it) so only the initial process logs the service name; update references to
get_full_service_name, run_worker, _worker_loop, and read_settings accordingly.
- Around line 43-48: Replace the long inline ValueError message raised when
validating settings.antenna_service_name in worker.py with either a shorter,
single-line message (e.g., "AMI_ANTENNA_SERVICE_NAME must be set") or raise a
custom exception type (e.g., AntennaConfigError) defined near the top of the
module; update the raise site to use the new exception class or shorter text and
add the custom exception class definition (subclassing ValueError) so the intent
remains clear while satisfying TRY003.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
trapdata/antenna/worker.py (1)
43-48: Ruff TRY003: consider a dedicated exception or shorter inline message.Both the auth-token check (lines 37–41) and this block use long
ValueErrormessages inline. Ruff TRY003 flags these; moving the error text into a custom exception class (or a module-level constant string) would silence it.♻️ One lightweight approach
+class MissingServiceNameError(ValueError): + """Raised when AMI_ANTENNA_SERVICE_NAME is not configured.""" - if not settings.antenna_service_name or not settings.antenna_service_name.strip(): - raise ValueError( - "AMI_ANTENNA_SERVICE_NAME configuration setting must be set. " - "Configure it via environment variable or .env file." - ) + if not settings.antenna_service_name or not settings.antenna_service_name.strip(): + raise MissingServiceNameError( + "AMI_ANTENNA_SERVICE_NAME configuration setting must be set. " + "Configure it via environment variable or .env file." + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@trapdata/antenna/worker.py` around lines 43 - 48, Replace the inline long ValueError message raised for settings.antenna_service_name with a shorter raise that references a module-level constant or a dedicated exception: define a module-level constant (e.g., ANTENNA_SERVICE_NAME_ERROR = "...") or a custom exception class (e.g., ConfigurationError) at top of the module, move the full explanatory message into that constant, and then change the raise in the validation block to raise ConfigurationError(ANTENNA_SERVICE_NAME_ERROR) or raise ValueError(ANTENNA_SERVICE_NAME_ERROR) with a short inline message if you prefer; update the auth-token check to use the same pattern for consistency (referencing settings.antenna_service_name, the validation block that currently raises ValueError, and the auth-token check).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@trapdata/antenna/worker.py`:
- Around line 43-48: Replace the inline long ValueError message raised for
settings.antenna_service_name with a shorter raise that references a
module-level constant or a dedicated exception: define a module-level constant
(e.g., ANTENNA_SERVICE_NAME_ERROR = "...") or a custom exception class (e.g.,
ConfigurationError) at top of the module, move the full explanatory message into
that constant, and then change the raise in the validation block to raise
ConfigurationError(ANTENNA_SERVICE_NAME_ERROR) or raise
ValueError(ANTENNA_SERVICE_NAME_ERROR) with a short inline message if you
prefer; update the auth-token check to use the same pattern for consistency
(referencing settings.antenna_service_name, the validation block that currently
raises ValueError, and the auth-token check).
Summary
Adds a configurable
processing_service_namesetting (default: "AMI Data Companion") that is sent as a query parameter to Antenna job endpoints. This allows Antenna to log which processing service is making requests, for auditing and debugging.The service name is read from settings/
.envand propagated through job fetching, data loading, and result submission.Companion PR
Server-side changes: RolnickLab/antenna#1117
Related Issues
Support for RolnickLab/antenna#1112
Support for RolnickLab/antenna#1087
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Tests