Add get_issue_committers RPC to Seer tools#118065

Open
hobzcalvin wants to merge 7 commits into master from seer-commit-history
Conversation

@hobzcalvin

Contributor

To enable smart issue assignment for blocked issues, Seer will need access to commit history to try to find the user whose commit caused an issue. While the assignment process itself is still a proof-of-concept, I need this RPC to work so I can try a local model invocation against production Sentry data in a realistic simulation.

Here we expose 3 things:

  • stack_commits: [ commit-author : [ commits ] ], a list of author-grouped commits that match lines in the issue stack trace. This is basically the input to Suspect Commits.
  • suspect_commits: If a Suspect Commit was determined, this is it.
  • release_commits: Since it might have been a change to a line not directly in the stack trace (if we even have one), also include recent commits before the issue first appeared.
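As a rough illustration of the payload described above, the three keys could be typed as follows. Only the top-level key names (`stack_commits`, `suspect_commits`, `release_commits`) come from this PR; the nested field names (`author`, `commits`) and the `empty_response` helper are assumptions made for the sketch and may not match the real serializer output.

```python
from typing import Any, Dict, List, Optional, TypedDict


class AuthorCommits(TypedDict):
    # Hypothetical shape: one author plus the commits attributed to them.
    author: Optional[Dict[str, Any]]
    commits: List[Dict[str, Any]]


class IssueCommittersResponse(TypedDict):
    # Author-grouped commits matching lines in the stack trace.
    stack_commits: List[AuthorCommits]
    # The Suspect Commit(s), if one was determined; otherwise empty.
    suspect_commits: List[Dict[str, Any]]
    # Recent commits from before the issue first appeared.
    release_commits: List[Dict[str, Any]]


def empty_response() -> IssueCommittersResponse:
    """Return a response in which none of the three signals found anything."""
    return {"stack_commits": [], "suspect_commits": [], "release_commits": []}
```

A caller can then treat each key as optional evidence, for example falling back to `release_commits` when `suspect_commits` is empty.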

NOTE: This is just to enable my testing: LOTS of things we can tweak here as this is productionized, and we might even remove elements that prove not useful.

@hobzcalvin hobzcalvin requested review from a team as code owners June 18, 2026 20:31
@hobzcalvin hobzcalvin requested a review from saponifi3d June 18, 2026 20:31
@github-actions github-actions Bot added the Scope: Backend label (automatically applied to PRs that change backend components) Jun 18, 2026

@wedamija wedamija left a comment

Member

I'm reasonably new to working with Seer and I'm also integrating some tools. One thing I'm curious about here is whether this is already covered by the Sentry MCP, which allows API access.

Another thought: if it's not available via the API, and it's useful for an LLM to have, would we be better off building this as an API so that other MCP clients can make use of it as well?

@hobzcalvin

Contributor Author

> I'm reasonably new to working with Seer and I'm also integrating some tools. One thing I'm curious about here is whether this is already covered by the Sentry MCP, which allows API access.
>
> Another thought: if it's not available via the API, and it's useful for an LLM to have, would we be better off building this as an API so that other MCP clients can make use of it as well?

@wedamija good question for sure, and I'm no expert either. I think the only thing that's currently exposed is a private API endpoint for the suspect commit / author, not the set of possible commit candidates etc.

I'd be all for making this more available to MCP etc. if it proves useful, but that would warrant much more thoughtful consideration than I'm doing here. :-P (I assume once we give it, it's hard to take away, etc.) I'm mostly trying to enable local testing with prod data and AFAICT this is the only way I can expose it to my local Seer calls.

hobzcalvin and others added 6 commits June 18, 2026 15:07
Expose Sentry's ingested suspect commits (GroupOwner SUSPECT_COMMIT ->
Commit -> CommitAuthor) to Seer agents via a new get_issue_committers RPC
method, registered in both the internal and public-org seer-rpc registries.

This lets the smart_assignment agent find suspect-commit authors from
ingested commit data without any SCM credentials or live GitHub calls
(unlike git_search). Reflects the same suspect-commit signal shown in the UI.

Co-authored-by: Cursor <cursoragent@cursor.com>

The handler previously returned only the precomputed GroupOwner suspect
commit, which is empty for many issues. Also compute the release-based
blame of the failing stack frames (the input to the suspect-commit
feature) so the agent gets recent authors of the failing code even when
no suspect commit exists. Returns both `committers` (frame-based, scored)
and `suspect_commits`, still without any SCM/GitHub call.

Co-authored-by: Cursor <cursoragent@cursor.com>

Match get_issue_details/get_event_details by accepting an optional
start/end window, forwarded to _get_recommended_event so callers can
scope which event's stacktrace is blamed.

Co-authored-by: Cursor <cursoragent@cursor.com>

Beyond exact-frame blame and the single GroupOwner suspect commit, return a
broader pool of commits shipped around when the issue first appeared (not
frame-matched), each enriched with PR title/body, file-change count, and a
merge-commit flag. The window defaults to ~6 weeks before first_seen and is
overridable via start/end. This surfaces regressions in code that doesn't
appear in the stacktrace, without any SCM/GitHub call.

Co-authored-by: Cursor <cursoragent@cursor.com>

Rename the three commit signals to parallel, provenance-clear names:
committers -> stack_commits, recent_commits -> release_commits (suspect_commits
unchanged). Drop the redundant issue_first_seen field; callers get first_seen
from get_issue_details.

Co-authored-by: Cursor <cursoragent@cursor.com>
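
The release_commits window mentioned in the commit messages above (roughly six weeks before first_seen, overridable via start/end) could be resolved along these lines. This is a minimal sketch, not the PR's code: the function name `resolve_commit_window`, the exact 6-week constant, and the choice to default the end of the window to `first_seen` are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# "~6 weeks" per the commit message; the exact value is an assumption.
DEFAULT_LOOKBACK = timedelta(weeks=6)


def resolve_commit_window(
    first_seen: datetime,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> Tuple[datetime, datetime]:
    """Return (start, end) for the commit search, honoring caller overrides."""
    # Assumed default: the window ends when the issue first appeared.
    window_end = end if end is not None else first_seen
    # Assumed default: look back a fixed period from first_seen.
    window_start = start if start is not None else first_seen - DEFAULT_LOOKBACK
    if window_start > window_end:
        raise ValueError("start must not be after end")
    return window_start, window_end
```

Rejecting an inverted window up front is a design choice for the sketch; silently swapping or clamping the bounds would be an equally plausible behavior.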
@hobzcalvin hobzcalvin force-pushed the seer-commit-history branch from 2dcc23c to 0558cab Compare June 18, 2026 22:07
Comment thread src/sentry/seer/agent/tools.py Outdated

@cursor cursor Bot left a comment

Contributor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 0558cab.

Comment thread src/sentry/seer/agent/tools.py
@hobzcalvin hobzcalvin requested review from chromy and trevor-e June 18, 2026 22:14
Comment on lines +1532 to +1536
        group = Group.objects.get(project_id__in=project_ids, id=int(issue_id))
    else:
        group = Group.objects.by_qualified_short_id(
            organization_id, issue_id, project_ids=project_ids
        )

Contributor

Group.objects.get() and by_qualified_short_id() raise DoesNotExist instead of returning None

Both the numeric path (Group.objects.get(project_id__in=project_ids, id=int(issue_id)) at line 1532) and the short-ID path (by_qualified_short_id(...) at line 1534) raise Group.DoesNotExist for deleted or merged issues, violating the stated contract of returning None on failure. Wrap both in a try/except Group.DoesNotExist and return None.

Evidence
  • Group.objects.get(project_id__in=project_ids, id=int(issue_id)) at line 1532 has no DoesNotExist handler.
  • by_qualified_short_id at line 1534 calls by_qualified_short_id_bulk(...)[0]; by_qualified_short_id_bulk explicitly raises Group.DoesNotExist when the short ID is not found (models/group.py line 469: raise Group.DoesNotExist()).
  • Neither branch is wrapped in try/except, so a deleted, merged, or mis-typed issue ID will crash the entire function.
  • The docstring at line 1513 promises "Returns None if the project/issue cannot be resolved", making the crash a contract violation.
Also found at 1 additional location
  • src/sentry/seer/endpoints/seer_rpc.py:1118

Identified by Warden sentry-backend-bugs · BA8-QZ8
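
The suggested fix (catch Group.DoesNotExist and return None) could look roughly like the sketch below. The `Group` class here is a stand-in defined only so the example runs on its own, and `resolve_group_or_none` is a hypothetical helper name; in the real handler the lookup would be the Group.objects.get(...) or by_qualified_short_id(...) call from the diff.

```python
from typing import Callable, Optional


class Group:
    """Stand-in for Sentry's Group model, only so the sketch is runnable."""

    class DoesNotExist(Exception):
        pass

    def __init__(self, group_id: int) -> None:
        self.id = group_id


def resolve_group_or_none(lookup: Callable[[], Group]) -> Optional[Group]:
    """Run a group lookup, returning None when the issue cannot be resolved."""
    try:
        return lookup()
    except Group.DoesNotExist:
        # Deleted, merged, or mistyped issue IDs end up here instead of
        # propagating, which matches the docstring's "Returns None" promise.
        return None
```

Wrapping both branches in one helper keeps the numeric-ID path and the short-ID path consistent, so neither can raise where the other returns None.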

"get_trace_waterfall": seer_rpc(rpc_get_trace_waterfall),
"get_issue_and_event_details_v2": seer_rpc(get_issue_and_event_details_v2),
"get_issue_details": seer_rpc(get_issue_details),
"get_issue_committers": seer_rpc(get_issue_committers),

Contributor

Commit author email addresses exposed in get_issue_committers RPC response to Seer AI service

The new get_issue_committers RPC (registered at organization_seer_rpc.py:140, implemented in tools.py:1474) returns real developer commit-author email addresses in all three response fields — stack_commits, suspect_commits, and release_commits — which are sent to the Seer AI service where they enter model prompts, inference context, and traces outside Sentry's controlled application tier. stack_commits author objects come from get_event_file_committers (Author = UserSerializerResponse | NonMappableUser, the latter being {name, email} from CommitAuthor.email); suspect_commits authors are built in _get_serialized_committers_from_group_owners as {"email": commit_author.email, "name": commit_author.name} or full UserSerializerResponse; release_commits use CommitSerializer(exclude=["repository"]), which does NOT exclude author, so the serialized author object (including email/emails) is emitted. Since user identity is needed for assignment, prefer passing a stable identifier (author/user ID) and resolving the email inside Sentry's tier, or hash/redact emails before they reach AI prompts/logs/traces.

Evidence
  • get_issue_committers (tools.py:1474) returns stack_commits, suspect_commits, release_commits; the RPC is registered for Seer at organization_seer_rpc.py:140.
  • stack_commits sets "author": entry.get("author") from get_event_file_committers; in committers.py:281 authors come from get_users_for_commits, typed Author = UserSerializerResponse | NonMappableUser where NonMappableUser is {email, name}.
  • suspect_commits from _get_serialized_committers_from_group_owners (committers.py:229) builds author = {"email": commit_author.email, "name": commit_author.name} or a full UserSerializerResponse.
  • release_commits serialize with CommitSerializer(exclude=["repository"]) (tools.py:1607); in commit.py:108-109 author is only dropped when in exclude, so the author object with email is included.
  • The payload is returned over the Seer RPC, putting real developer emails into AI model prompts/inference context, a more durable and harder-to-purge sink than the existing UI committers view.

Identified by Warden wrdn-pii · WHY-Z6G
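
One way to act on the "hash/redact emails" suggestion is to replace each address with a keyed hash before the payload leaves Sentry's tier. The sketch below assumes author objects are plain dicts with optional `email` and `emails` keys; the function names, the `email_hash` key, and the use of HMAC-SHA256 with a server-side key are illustrative choices, not something this PR implements.

```python
import hashlib
import hmac
from typing import Any, Dict, Optional


def pseudonymize_email(email: str, key: bytes) -> str:
    """Return a stable keyed hash of a case-normalized email address."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def redact_author(
    author: Optional[Dict[str, Any]], key: bytes
) -> Optional[Dict[str, Any]]:
    """Copy an author dict, swapping raw email fields for a hash."""
    if author is None:
        return None
    # Drop both the single address and any list of addresses.
    redacted = {k: v for k, v in author.items() if k not in ("email", "emails")}
    email = author.get("email")
    if email:
        # Hypothetical key name; lets the model group commits by author
        # without ever seeing the address itself.
        redacted["email_hash"] = pseudonymize_email(email, key)
    return redacted
```

Because the hash is keyed and deterministic, the same author maps to the same token across stack_commits, suspect_commits, and release_commits, and Sentry can resolve the token back to a user on its own side.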

"execute_issues_query": seer_rpc(map_org_id_param(execute_issues_query)),
"get_issue_and_event_details_v2": seer_rpc(get_issue_and_event_details_v2),
"get_issue_details": seer_rpc(get_issue_details),
"get_issue_committers": seer_rpc(get_issue_committers),

Contributor

get_issue_committers exposes internal user fields (isSuperuser, isStaff, has2fa, hasPasswordAuth) of commit authors to Seer

The newly-registered get_issue_committers RPC returns commit-author data serialized with the full UserSerializerResponse, which unconditionally includes isSuperuser, isStaff, has2fa, hasPasswordAuth, isManaged, lastLogin, and lastActive, all fields unrelated to commit attribution. Both the suspect_commits and stack_commits lists ship these verbatim to the Seer service. A scoped DTO with only id, name, and email should be returned instead. (Note: the emails list is populated only when the requester is the user or a superuser. Because the RPC serializes with as_user=None/anonymous, it stays empty, so the emails list itself is not exposed; the role/2FA/account-state fields are the actual leak.)

Evidence
  • get_issue_committers (seer/agent/tools.py:1542) calls get_serialized_committers(group.project, group.id) for suspect_commits and get_event_file_committers(...) for stack_commits, setting each entry's author to the serialized user dict.
  • _get_serialized_committers_from_group_owners (utils/committers.py:~229) sets author = user_service.serialize_many(filter={"user_ids": [owner.user_id]})[0], a full UserSerializerResponse.
  • UserSerializerResponse (users/api/serializers/user.py:94) and UserSerializer.serialize unconditionally populate isSuperuser, isStaff, has2fa, hasPasswordAuth, isManaged, lastLogin, lastActive; emails is only filled when _user_is_requester or user.is_superuser, which is false here (as_user=None), so emails stay empty.
  • The stack path goes through get_users_for_commits → get_users_for_authors (api/serializers/models/release.py:374) which likewise returns full UserSerializerResponse objects, with as_user=serialize_generic_user(None).
  • The returned dict is dispatched verbatim to the Seer service via the registered get_issue_committers RPC (seer_rpc.py:1118, organization_seer_rpc.py:140).

Identified by Warden wrdn-data-exfil · 8LV-GSA
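
The scoped DTO proposed in this finding could be as small as an allowlist applied to each serialized author before it is returned. The field list (id, name, email) comes from the comment above; the helper name `scope_author` and the assumption that authors arrive as plain dicts are illustrative.

```python
from typing import Any, Dict, Optional

# Only the fields the finding says are needed for commit attribution.
ALLOWED_AUTHOR_FIELDS = ("id", "name", "email")


def scope_author(author: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Keep only allowlisted keys from a serialized author dict."""
    if author is None:
        return None
    # Missing keys become None, so unmapped authors (name/email only)
    # and full user responses come out with the same shape.
    return {field: author.get(field) for field in ALLOWED_AUTHOR_FIELDS}
```

An allowlist is used rather than a denylist so that fields added to the user serializer later are excluded by default.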

Labels

Scope: Backend Automatically applied to PRs that change backend components

2 participants