Add get_issue_committers RPC to Seer tools#118065
wedamija
left a comment
I'm reasonably new to working with Seer and I'm also integrating some tools. One thing I'm curious about here is whether this is already covered by the Sentry MCP, which allows API access?
Another thought: if it's not available via the API, and it's useful for an LLM to have, would we be better off building this as an API so that other MCP clients can make use of it as well?
@wedamija good question for sure, and I'm no expert either. I think the only thing that's currently exposed is a private API endpoint for the suspect commit / author, not the set of possible commit candidates etc. I'd be all for making this more available to MCP etc. if it proves useful, but that would warrant much more thoughtful consideration than I'm giving it here. :-P (I assume once we expose it, it's hard to take away, etc.) I'm mostly trying to enable local testing with prod data, and AFAICT this is the only way I can expose it to my local Seer calls.
Expose Sentry's ingested suspect commits (GroupOwner SUSPECT_COMMIT -> Commit -> CommitAuthor) to Seer agents via a new get_issue_committers RPC method, registered in both the internal and public-org seer-rpc registries. This lets the smart_assignment agent find suspect-commit authors from ingested commit data without any SCM credentials or live GitHub calls (unlike git_search). Reflects the same suspect-commit signal shown in the UI. Co-authored-by: Cursor <cursoragent@cursor.com>
The handler previously returned only the precomputed GroupOwner suspect commit, which is empty for many issues. Also compute the release-based blame of the failing stack frames (the input to the suspect-commit feature) so the agent gets recent authors of the failing code even when no suspect commit exists. Returns both `committers` (frame-based, scored) and `suspect_commits`, still without any SCM/GitHub call. Co-authored-by: Cursor <cursoragent@cursor.com>
Match get_issue_details/get_event_details by accepting an optional start/end window, forwarded to _get_recommended_event so callers can scope which event's stacktrace is blamed. Co-authored-by: Cursor <cursoragent@cursor.com>
Beyond exact-frame blame and the single GroupOwner suspect commit, return a broader pool of commits shipped around when the issue first appeared (not frame-matched), each enriched with PR title/body, file-change count, and a merge-commit flag. The window defaults to ~6 weeks before first_seen and is overridable via start/end. This surfaces regressions in code that doesn't appear in the stacktrace, without any SCM/GitHub call. Co-authored-by: Cursor <cursoragent@cursor.com>
Rename the three commit signals to parallel, provenance-clear names: committers -> stack_commits, recent_commits -> release_commits (suspect_commits unchanged). Drop the redundant issue_first_seen field; callers get first_seen from get_issue_details. Co-authored-by: Cursor <cursoragent@cursor.com>
2dcc23c to 0558cab
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 0558cab.
        group = Group.objects.get(project_id__in=project_ids, id=int(issue_id))
    else:
        group = Group.objects.by_qualified_short_id(
            organization_id, issue_id, project_ids=project_ids
        )
Group.objects.get() and by_qualified_short_id() raise DoesNotExist instead of returning None
Both the numeric path (Group.objects.get(project_id__in=project_ids, id=int(issue_id)) at line 1532) and the short-ID path (by_qualified_short_id(...) at line 1534) raise Group.DoesNotExist for deleted or merged issues, violating the stated contract of returning None on failure. Wrap both in a try/except Group.DoesNotExist and return None.
Evidence
- `Group.objects.get(project_id__in=project_ids, id=int(issue_id))` at line 1532 has no `DoesNotExist` handler.
- `by_qualified_short_id` at line 1534 calls `by_qualified_short_id_bulk(...)[0]`; `by_qualified_short_id_bulk` explicitly raises `Group.DoesNotExist` when the short ID is not found (models/group.py line 469: `raise Group.DoesNotExist()`).
- Neither branch is wrapped in try/except, so a deleted, merged, or mis-typed issue ID will crash the entire function.
- The docstring at line 1513 promises "Returns None if the project/issue cannot be resolved", making the crash a contract violation.
Also found at 1 additional location
src/sentry/seer/endpoints/seer_rpc.py:1118
Identified by Warden sentry-backend-bugs · BA8-QZ8
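The suggested fix can be sketched in isolation. This is a minimal illustration, not the code in tools.py: `FakeGroupManager`, the local `DoesNotExist` class, and `resolve_group` are hypothetical stand-ins for `Group.objects`, `Group.DoesNotExist`, and the lookup inside the handler, so the snippet runs without Django.

```python
from typing import Any, Dict, List, Optional


class DoesNotExist(Exception):
    """Hypothetical stand-in for Group.DoesNotExist."""


class FakeGroupManager:
    """Hypothetical in-memory stand-in for Group.objects (not Sentry code)."""

    def __init__(self, groups: List[Dict[str, Any]]) -> None:
        self._by_id = {g["id"]: g for g in groups}
        self._by_short_id = {g["short_id"]: g for g in groups}

    def get(self, project_id__in: List[int], id: int) -> Dict[str, Any]:
        group = self._by_id.get(id)
        if group is None or group["project_id"] not in project_id__in:
            raise DoesNotExist()
        return group

    def by_qualified_short_id(
        self, organization_id: int, short_id: str, project_ids: Optional[List[int]] = None
    ) -> Dict[str, Any]:
        group = self._by_short_id.get(short_id)
        if group is None:
            raise DoesNotExist()
        if project_ids is not None and group["project_id"] not in project_ids:
            raise DoesNotExist()
        return group


def resolve_group(
    objects: FakeGroupManager, organization_id: int, issue_id: str, project_ids: List[int]
) -> Optional[Dict[str, Any]]:
    """Return the group, or None when either lookup path cannot find it."""
    try:
        if issue_id.isdigit():
            return objects.get(project_id__in=project_ids, id=int(issue_id))
        return objects.by_qualified_short_id(
            organization_id, issue_id, project_ids=project_ids
        )
    except DoesNotExist:
        # Deleted, merged, or mistyped IDs honor the "returns None" contract.
        return None
```

Wrapping both branches in one `try` keeps the numeric and short-ID paths consistent, so callers only ever have to check for `None`.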
    "get_trace_waterfall": seer_rpc(rpc_get_trace_waterfall),
    "get_issue_and_event_details_v2": seer_rpc(get_issue_and_event_details_v2),
    "get_issue_details": seer_rpc(get_issue_details),
    "get_issue_committers": seer_rpc(get_issue_committers),
Commit author email addresses exposed in get_issue_committers RPC response to Seer AI service
The new get_issue_committers RPC (registered at organization_seer_rpc.py:140, implemented in tools.py:1474) returns real developer commit-author email addresses in all three response fields — stack_commits, suspect_commits, and release_commits — which are sent to the Seer AI service where they enter model prompts, inference context, and traces outside Sentry's controlled application tier. stack_commits author objects come from get_event_file_committers (Author = UserSerializerResponse | NonMappableUser, the latter being {name, email} from CommitAuthor.email); suspect_commits authors are built in _get_serialized_committers_from_group_owners as {"email": commit_author.email, "name": commit_author.name} or full UserSerializerResponse; release_commits use CommitSerializer(exclude=["repository"]), which does NOT exclude author, so the serialized author object (including email/emails) is emitted. Since user identity is needed for assignment, prefer passing a stable identifier (author/user ID) and resolving the email inside Sentry's tier, or hash/redact emails before they reach AI prompts/logs/traces.
Evidence
- `get_issue_committers` (tools.py:1474) returns `stack_commits`, `suspect_commits`, `release_commits`; the RPC is registered for Seer at `organization_seer_rpc.py:140`.
- `stack_commits` sets `"author": entry.get("author")` from `get_event_file_committers`; in `committers.py:281` authors come from `get_users_for_commits`, typed `Author = UserSerializerResponse | NonMappableUser` where `NonMappableUser` is `{email, name}`.
- `suspect_commits` from `_get_serialized_committers_from_group_owners` (committers.py:229) builds `author = {"email": commit_author.email, "name": commit_author.name}` or a full `UserSerializerResponse`.
- `release_commits` serialize with `CommitSerializer(exclude=["repository"])` (tools.py:1607); in `commit.py:108-109` `author` is only dropped when in `exclude`, so the author object with `email` is included.
- The payload is returned over the Seer RPC, putting real developer emails into AI model prompts/inference context, a more durable and harder-to-purge sink than the existing UI committers view.
Identified by Warden wrdn-pii · WHY-Z6G
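One way to read the "hash/redact emails" suggestion is to swap the raw address for a keyed pseudonym before the payload leaves Sentry's tier. The sketch below is an assumption about how that could look, not existing Sentry code: `pseudonymize_email`, `redact_author`, the `author_key` field, and the secret handling are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac
from typing import Any, Dict, Optional


def pseudonymize_email(email: str, secret: bytes) -> str:
    """Return a stable keyed hash of the normalized email (hypothetical helper)."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(secret, normalized, hashlib.sha256).hexdigest()[:16]


def redact_author(
    author: Optional[Dict[str, Any]], secret: bytes
) -> Optional[Dict[str, Any]]:
    """Copy an author dict, dropping email fields and adding a pseudonymous key."""
    if author is None:
        return None
    redacted = {k: v for k, v in author.items() if k not in ("email", "emails")}
    email = author.get("email")
    if email:
        # The same address always maps to the same key, so the agent can still
        # group commits by author without seeing the address itself.
        redacted["author_key"] = pseudonymize_email(email, secret)
    return redacted
```

A keyed HMAC rather than a bare hash is used here because email addresses are guessable; resolving `author_key` back to a user would then have to happen inside Sentry, where the secret lives.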
    "execute_issues_query": seer_rpc(map_org_id_param(execute_issues_query)),
    "get_issue_and_event_details_v2": seer_rpc(get_issue_and_event_details_v2),
    "get_issue_details": seer_rpc(get_issue_details),
    "get_issue_committers": seer_rpc(get_issue_committers),
get_issue_committers exposes internal user fields (isSuperuser, isStaff, has2fa, hasPasswordAuth) of commit authors to Seer
The newly-registered get_issue_committers RPC returns commit-author data serialized with the full UserSerializerResponse, which unconditionally includes isSuperuser, isStaff, has2fa, hasPasswordAuth, isManaged, lastLogin, and lastActive—fields unrelated to commit attribution. Both the suspect_commits and stack_commits lists ship these verbatim to the Seer service. A scoped DTO with only id, name, and email should be returned instead. (Note: the emails list is gated on requester being the user or a superuser, and since the RPC serializes with as_user=None/anonymous it stays empty—so per-email exposure does not occur, contrary to a fuller reading; the role/2FA/account-state fields are the actual leak.)
Evidence
- `get_issue_committers` (seer/agent/tools.py:1542) calls `get_serialized_committers(group.project, group.id)` for `suspect_commits` and `get_event_file_committers(...)` for `stack_commits`, setting each entry's `author` to the serialized user dict.
- `_get_serialized_committers_from_group_owners` (utils/committers.py:~229) sets `author = user_service.serialize_many(filter={"user_ids": [owner.user_id]})[0]`, a full `UserSerializerResponse`.
- `UserSerializerResponse` (users/api/serializers/user.py:94) and `UserSerializer.serialize` unconditionally populate `isSuperuser`, `isStaff`, `has2fa`, `hasPasswordAuth`, `isManaged`, `lastLogin`, `lastActive`; `emails` is only filled when `_user_is_requester` or `user.is_superuser`, which is false here (`as_user=None`), so emails stay empty.
- The stack path goes through `get_users_for_commits` → `get_users_for_authors` (api/serializers/models/release.py:374) which likewise returns full `UserSerializerResponse` objects, with `as_user=serialize_generic_user(None)`.
- The returned dict is dispatched verbatim to the Seer service via the registered `get_issue_committers` RPC (seer_rpc.py:1118, organization_seer_rpc.py:140).
Identified by Warden wrdn-data-exfil · 8LV-GSA
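The scoped DTO this finding asks for could look roughly like the following. It is a sketch only: `CommitAuthorDTO` and `to_commit_author_dto` are hypothetical names, and the input is assumed to be a plain dict shaped like either a serialized user or a `{name, email}` author.

```python
from typing import Any, Mapping, Optional, TypedDict


class CommitAuthorDTO(TypedDict):
    """Hypothetical allowlisted author shape: only what attribution needs."""

    id: Optional[str]
    name: Optional[str]
    email: Optional[str]


def to_commit_author_dto(
    author: Optional[Mapping[str, Any]],
) -> Optional[CommitAuthorDTO]:
    """Project a serialized author onto the allowlisted fields."""
    if author is None:
        return None
    raw_id = author.get("id")
    # Building the dict from an allowlist (rather than deleting known-bad keys)
    # means fields added to the user serializer later are not leaked by default.
    return CommitAuthorDTO(
        id=str(raw_id) if raw_id is not None else None,
        name=author.get("name"),
        email=author.get("email"),
    )
```

Applied to each entry's `author` in `stack_commits` and `suspect_commits`, this would drop `isSuperuser`, `isStaff`, `has2fa`, and the other account-state fields before the payload is returned.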

To enable smart issue assignment for blocked issues, Seer will need access to commit history to try to find the user whose commit caused an issue. While the assignment process itself is still a proof-of-concept, I need this RPC to work so I can try a local model invocation against production Sentry data in a realistic simulation.
Here we expose 3 things:
- `stack_commits`: `[ commit-author : [ commits ] ]` list of author-grouped commits that match to lines in the issue stack trace. This is basically the input to Suspect Commits.
- `suspect_commits`: If a Suspect Commit was determined, this is it.
- `release_commits`: Since it might have been a change to a line not directly in the stack trace (if we even have one), also include recent commits before the issue first appeared.

NOTE: This is just to enable my testing: LOTS of things we can tweak here as this is productionized, and we might even remove elements that prove not useful.
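For `release_commits`, the commit history above describes a window that defaults to roughly 6 weeks before `first_seen` and can be overridden via `start`/`end`. A minimal sketch of that resolution, assuming the default end of the window is `first_seen` itself (the PR does not spell this out) and using the hypothetical names `resolve_commit_window` and `DEFAULT_LOOKBACK`:

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Assumption: "~6 weeks" modeled as exactly 42 days.
DEFAULT_LOOKBACK = timedelta(weeks=6)


def resolve_commit_window(
    first_seen: datetime,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> Tuple[datetime, datetime]:
    """Return the (start, end) window used to collect release commits."""
    # Assumption: without an explicit end, the window closes at first_seen.
    window_end = end if end is not None else first_seen
    window_start = start if start is not None else first_seen - DEFAULT_LOOKBACK
    if window_start > window_end:
        raise ValueError("start must not be after end")
    return window_start, window_end
```

Callers that pass neither bound get the default lookback; passing only `start` or only `end` overrides that side and leaves the other at its default.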