feat(seer): Add deliver_feature_result RPC for Seer agent features #116734
Add a generic RPC endpoint that Seer can call when an agent feature run completes. Features register handlers in DELIVERY_HANDLERS keyed by feature_id. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Register the night_shift handler in DELIVERY_HANDLERS to receive triage results from Seer. This runs in parallel with the existing night shift code. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use seer_run__uuid to find the SeerNightShiftRun since ref is the external idempotency key from SeerRun.uuid. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Register handlers directly in DELIVERY_HANDLERS instead of relying on import side effects. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
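The registry described in the commits above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the handler argument order follows the call shown in the diff, while the `deliver_feature_result` signature, the boolean return value, the `"completed"` status string, and the `received` list are assumptions made for the example.

```python
from typing import Callable, Optional

# Handler signature modeled on the call in the diff:
# handler(organization_id, run_uuid, status, result, error)
DeliveryHandler = Callable[[int, str, str, Optional[dict], Optional[str]], None]

# Hypothetical sink so the example can show that dispatch happened.
received: list = []


def night_shift_handler(organization_id, run_uuid, status, result, error) -> None:
    # Stand-in for the real night shift handler; it only records the call.
    received.append((organization_id, run_uuid, status))


# Handlers are listed explicitly in the dict, so registration does not
# depend on some module having been imported for its side effects.
DELIVERY_HANDLERS: dict = {
    "night_shift": night_shift_handler,
}


def deliver_feature_result(
    feature_id, organization_id, run_uuid, status, result=None, error=None
) -> bool:
    # Assumed signature: returns True when a handler was found and called.
    handler = DELIVERY_HANDLERS.get(feature_id)
    if handler is None:
        # Unknown feature_id: nothing to dispatch to.
        return False
    handler(organization_id, run_uuid, status, result, error)
    return True
```

Calling `deliver_feature_result("night_shift", 1, "abc", "completed")` dispatches to the registered handler, while an unregistered `feature_id` is ignored.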
Rename ref to run_uuid and remove seer_run_id since we can get it from SeerRun if needed. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Put organization_id first and include it in the query for safety. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use sentry_sdk.metrics.count() with attributes instead of incr() with tags
- Add extras["action"] to SeerNightShiftRunResult for tracking
- Fix circular import by defining FeatureRunStatus locally
- Make TriageVerdict.reason optional with empty string default

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
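As an illustration of the optional `reason` field, here is a small sketch using a standard-library dataclass as a stand-in for the real `TriageVerdict` model. The `group_id` and `action` field names and the `parse_verdict` helper are assumptions for the example, not taken from the PR.

```python
from dataclasses import dataclass


@dataclass
class TriageVerdict:
    # Only `reason` is described in the commit; the other fields are
    # hypothetical and exist so the example has something to parse.
    group_id: int
    action: str
    reason: str = ""  # optional: defaults to an empty string


def parse_verdict(payload: dict) -> TriageVerdict:
    # Hypothetical helper: a payload without "reason" still parses.
    return TriageVerdict(
        group_id=int(payload["group_id"]),
        action=str(payload["action"]),
        reason=payload.get("reason", ""),
    )
```

With this default, a verdict that omits `reason` is accepted rather than rejected as an invalid result.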
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 4 total unresolved issues (including 2 from previous reviews).
Reviewed by Cursor Bugbot for commit 0c0baaa.
        )
        return

    handler(organization_id, run_uuid, status, result, error)
Autofix runs inside Seer RPC
Medium Severity
deliver_feature_result invokes the night-shift handler synchronously in the Seer RPC request, and that handler can call trigger_autofix_agent for every fixable verdict. The prior cron path runs the same autofix loop inside a background task with a long processing deadline.
our result handling is very lightweight and should work fine within the RPC
…metric Match existing cron.py behavior by counting only fixable verdicts (excluding SKIPs) in the candidates_selected distribution metric. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Import FeatureRunStatus from delivery.py instead of defining it twice. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Define FeatureRunStatus in sentry.seer.agent.types so handlers can import it without circular dependencies. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive test coverage for the night shift delivery handler:
- Missing/invalid run handling
- Error status and invalid result parsing
- SKIP, AUTOFIX, and ROOT_CAUSE_ONLY verdict processing
- Dry run mode skipping autofix
- Error recovery when trigger fails for one group
- Unknown group_id filtering
- User context passing to autofix

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add __all__ to feature_delivery.py to explicitly export FeatureRunStatus
- Add type annotations to all test functions
- Add assertions for nullable seer_run FK before accessing uuid

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Share the autofix triggering logic between the delivery handler and the existing cron job. The delivery handler now:
- Converts TriageVerdict to TriageResult
- Calls the shared _run_autofix_for_candidates function from cron.py

This reduces code duplication and makes it easier to switch over to the new feature delivery system without introducing bugs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…oject Set organization on each group's project directly instead of building an intermediate project_by_id dict. This ensures all groups have the cached organization regardless of Django's object identity behavior with select_related. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace per-candidate read_preference_from_sentry_db calls with a single bulk_read_preferences_from_sentry_db call to avoid N+1 queries. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add project__status=ObjectStatus.ACTIVE filter to match cron.py behavior and prevent potential KeyError if a project becomes inactive during triage. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
If a project is deleted or moved between the groups query and the preferences query (extremely unlikely race), fall back to the default stopping point instead of raising KeyError. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
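The bulk read and the fallback from the two commits above could look roughly like the sketch below. It assumes an in-memory stand-in for `bulk_read_preferences_from_sentry_db`; the stored values, the `DEFAULT_STOPPING_POINT` constant, and the `(group_id, project_id)` candidate shape are hypothetical.

```python
DEFAULT_STOPPING_POINT = "root_cause"  # hypothetical default value


def bulk_read_preferences(project_ids):
    # Stand-in for a single bulk query. A project that was deleted or moved
    # in the meantime is simply absent from the returned mapping.
    stored = {1: "code_changes", 2: "open_pr"}
    return {pid: stored[pid] for pid in project_ids if pid in stored}


def stopping_points(candidates):
    """Map each group_id to a stopping point.

    candidates: list of (group_id, project_id) pairs.
    """
    project_ids = {project_id for _, project_id in candidates}
    # One call for all projects instead of one call per candidate (no N+1).
    preferences = bulk_read_preferences(project_ids)
    return {
        # .get() with a default avoids a KeyError for a missing project.
        group_id: preferences.get(project_id, DEFAULT_STOPPING_POINT)
        for group_id, project_id in candidates
    }
```

Here a candidate whose project no longer appears in the preferences result falls back to the default instead of aborting the whole batch.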
run = SeerNightShiftRun.objects.select_related("organization", "seer_run").get(
    organization_id=organization_id,
    seer_run__uuid=run_uuid,
)
Bug: The lookup in deliver_feature_result uses seer_run__uuid, which fails if the nullable seer_run foreign key hasn't been populated yet, causing results from Seer to be silently dropped.
Severity: HIGH
Suggested Fix
Modify the query to not rely on the seer_run foreign key, which may be null. One option is to add the run_uuid to the SeerNightShiftRun model directly and query on that field. This avoids the join on a nullable foreign key and prevents the race condition where the callback from Seer arrives before the seer_run association is made.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: src/sentry/seer/night_shift/delivery.py#L33-L36
Potential issue: The `deliver_feature_result` function looks up a `SeerNightShiftRun`
record by joining on the `seer_run` foreign key using `seer_run__uuid`. However, the
`seer_run` field is nullable and is populated asynchronously after an agent run
completes. A race condition exists where Seer might call this endpoint with a `run_uuid`
before the corresponding `seer_run` foreign key has been set on the `SeerNightShiftRun`
record. In this scenario, the database query will fail to find the record due to the
INNER JOIN, causing a `DoesNotExist` exception. The handler catches this, logs a
warning, and silently discards the result from Seer, leading to data loss.
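The scenario the reviewer describes can be modeled without a database. The sketch below is an assumption-laden illustration, not the PR's code: `NightShiftRun` is an in-memory stand-in for `SeerNightShiftRun`, `seer_run_uuid` stands in for the nullable `seer_run` foreign key, and the `run_uuid` field is the hypothetical direct column from the suggested fix. Whether the race can actually occur depends on when the real code sets `seer_run`, which this example does not establish.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NightShiftRun:
    organization_id: int
    run_uuid: str                        # hypothetical column from the suggested fix
    seer_run_uuid: Optional[str] = None  # stands in for the nullable seer_run FK


def find_via_fk(runs, organization_id, run_uuid):
    # Mirrors filtering on seer_run__uuid: a row whose seer_run is not yet
    # linked can never match, so the lookup comes back empty.
    for run in runs:
        if run.organization_id == organization_id and run.seer_run_uuid == run_uuid:
            return run
    return None


def find_via_column(runs, organization_id, run_uuid):
    # Mirrors filtering on a column stored directly on the run row.
    for run in runs:
        if run.organization_id == organization_id and run.run_uuid == run_uuid:
            return run
    return None
```

In this model, a run created with `seer_run_uuid=None` is invisible to `find_via_fk` until the link is set, but `find_via_column` finds it immediately.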


Summary
Adds a generic RPC endpoint (`deliver_feature_result`) that Seer can call when an agent feature run completes. Features register handlers in `DELIVERY_HANDLERS` keyed by `feature_id`.

Also adds the `night_shift` delivery handler that:
- Looks up `SeerNightShiftRun` by ref
- Creates `SeerNightShiftRunResult` rows

This runs in parallel with the existing night shift code; nothing is removed or modified. Once the Seer-side PR lands and we verify it works, a follow-up PR will migrate night shift to use the new delivery path exclusively.
Test plan