fix(seer): use URL-safe GitLab path when building owner/name #118054
For GitLab repos, repo.name is name_with_namespace (the human-readable
display name, e.g. 'My Group / My Project'). Splitting this by '/' to
extract owner and name yields parts with leading/trailing spaces, which
causes seer to construct SCM links that 404.
GitLab also stores path_with_namespace in repo.config['path'], which is
the URL-safe equivalent (e.g. 'my-group/my-project').
Introduces get_repo_url_path() helper in seer/autofix/utils.py that
returns repo.config['path'] for GitLab repos and repo.name for all
other providers. Applies it at the four call sites that previously did
repo.name.split('/') to extract owner/name.
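A minimal sketch of why splitting the display name goes wrong, and why the config path does not. The `Repo` dataclass and the `split_owner_name` function are hypothetical stand-ins written for illustration; only the `split("/", 1)` logic mirrors the diff in this PR.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical stand-in for the Repository model (not the real class)."""

    name: str
    provider: str
    config: dict = field(default_factory=dict)


def split_owner_name(url_path: str) -> tuple[str, str]:
    """Split 'owner/name' on the first slash, as the call sites do."""
    parts = url_path.split("/", 1)
    owner = parts[0] if len(parts) > 1 else ""
    name = parts[1] if len(parts) > 1 else url_path
    return owner, name


gitlab_repo = Repo(
    name="My Group / My Project",  # name_with_namespace (display name)
    provider="integrations:gitlab",
    config={"path": "my-group/my-project"},  # path_with_namespace
)

# Splitting the display name keeps the spaces around the slash.
print(split_owner_name(gitlab_repo.name))  # ('My Group ', ' My Project')

# Splitting the URL-safe path gives clean slugs.
print(split_owner_name(gitlab_repo.config["path"]))  # ('my-group', 'my-project')
```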
Also fixes the hardcoded GitHub icon on git_search tool links in the
Seer evidence UI: detect the SCM provider from the URL hostname and
render IconGitlab for gitlab.com URLs instead of always showing
IconGithub.
Fixes CW-1528
Co-Authored-By: sentry-junior[bot] <264270552+sentry-junior[bot]@users.noreply.github.com>
  name_parts = get_repo_url_path(repo).split("/", 1)
  owner = name_parts[0] if len(name_parts) > 1 else ""
- name = name_parts[1] if len(name_parts) > 1 else repo.name
+ name = name_parts[1] if len(name_parts) > 1 else get_repo_url_path(repo)
this changes our API response, which could be unexpected if users expect to see their display name. this is more correct though, since we assume it is used as a "slug" like we do for GitHub.
The GET seer repos endpoint filters repositories by the providers from get_supported_scm_providers(), which only includes GitLab when the organizations:seer-gitlab-support feature flag is enabled. The new test_gitlab_repo_uses_path_not_display_name test issued the request without enabling the flag, so the GitLab repo was filtered out and the endpoint returned an empty list, failing the assertion.

Wrap the request in the seer-gitlab-support feature context so the GitLab repo is included and the path-vs-display-name assertion runs.

Refs CW-1528
Co-Authored-By: Claude <noreply@anthropic.com>
Returns None if Repository name is invalid."""
repo = seer_project_repo.project_repository.repository
repo_name_sections = repo.name.split("/")
this has downstream (seer) implications since the repo definition is used in seer, but I believe we already have utils in seer that strip whitespace, etc.
@sentry review
URL-safe ``owner/repo`` string, so we return it unchanged.
"""
if repo.provider == "integrations:gitlab":
path = repo.config.get("path")
would we rather raise if there's no path, since that's unexpected?
also maybe this deserves a metric or log, just to know how often we're falling back to repo.name in the gitlab case
yeah good point, will change to throw. don't think we need a metric; I've looked at the db and there are no cases where path doesn't exist
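A rough sketch of the helper after this change, assuming it raises instead of falling back to repo.name. The `Repo` dataclass is a hypothetical stand-in for the Repository model, and `ValueError` is an assumed exception type; the thread does not say which exception the real helper raises.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical stand-in for the Repository model (not the real class)."""

    name: str
    provider: str
    config: dict = field(default_factory=dict)


def get_repo_url_path(repo: Repo) -> str:
    """Return the URL-safe 'owner/repo' path for a repository.

    GitLab stores the display name in repo.name, so the URL-safe path
    has to come from repo.config["path"]. Other providers already use
    a URL-safe name, which is returned unchanged.
    """
    if repo.provider == "integrations:gitlab":
        path = repo.config.get("path")
        if not path:
            # Assumption: the exception type is illustrative only.
            raise ValueError(f"GitLab repository {repo.name!r} has no config path")
        return path
    return repo.name


github_repo = Repo(name="getsentry/sentry", provider="integrations:github")
gitlab_repo = Repo(
    name="My Group / My Project",
    provider="integrations:gitlab",
    config={"path": "my-group/my-project"},
)

print(get_repo_url_path(github_repo))  # getsentry/sentry
print(get_repo_url_path(gitlab_repo))  # my-group/my-project
```

Raising makes a missing path visible immediately, rather than silently producing a display-name-based link that 404s.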
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit bb78554.
return None

# Use the actual repo name from the database, not the requested name.
repo_name_parts = repo.name.split("/")
GitLab path name lookup mismatch
Medium Severity
get_repository_definition still resolves repositories by exact Repository.name, while this change makes GitLab owner/name come from get_repo_url_path (config["path"]). Callers that rebuild repo_full_name from those slug fields (without external_id) no longer match GitLab rows stored under name_with_namespace display names.
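The mismatch Bugbot describes can be sketched with an in-memory table. Everything here is hypothetical (the dictionary, the lookup functions, and the `gitlab.example.com:42` external_id); it only illustrates why a slug rebuilt from owner/name no longer equals the stored display name, while an external_id lookup is unaffected.

```python
# Hypothetical stand-in for Repository rows, keyed by Repository.name.
REPOS_BY_NAME = {
    "My Group / My Project": {
        "provider": "integrations:gitlab",
        "external_id": "gitlab.example.com:42",
        "config": {"path": "my-group/my-project"},
    }
}


def find_by_exact_name(full_name: str):
    """Exact-name lookup, as the comment says the existing code does."""
    return REPOS_BY_NAME.get(full_name)


def find_by_external_id(external_id: str):
    """Lookup that does not depend on the name at all."""
    for row in REPOS_BY_NAME.values():
        if row["external_id"] == external_id:
            return row
    return None


# A caller that rebuilds repo_full_name from the new slug fields:
owner, name = "my-group", "my-project"
rebuilt_full_name = f"{owner}/{name}"

print(find_by_exact_name(rebuilt_full_name))  # None: the row is stored under the display name
print(find_by_external_id("gitlab.example.com:42") is not None)  # True
```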
Clanker checked (including seer):
I checked the three things that could break, and traced each through the actual Seer code (../seer):
- GitLab API access: unaffected. Seer addresses GitLab repos exclusively by external_id (parsed as {netloc}:{project_id} at scm/providers/gitlab/provider.py:238), never by owner/name. external_id is unchanged by this PR.
- Web/PR/commit URLs: fixed. web_repository_path does self.repository["name"].replace(" ", "") (provider.py:247). The old display name "My Group / My Project" collapsed to "MyGroup/MyProject" (wrong case, no dashes) → the exact 404 in your branch name. The new slug "my-group/my-project" makes that .replace() a no-op and the URL correct.
- external_id discovery: improved. Seer's get_external_id_from_user_org_context (explorer/tools/utils.py:46-78) matches context repos against the LLM's clean slug via _normalize_repo_full_name, which only trims whitespace around / (not case/dashes). Before, context held display names → "My Group/My Project" ≠ "my-group/my-project" → match often failed. Now the context repos come from the same get_autofix_repos_from_project_code_mappings you changed, so both sides are clean slugs → they match → external_id is reliably found.
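The two string transformations described above can be reproduced in isolation. The `.replace(" ", "")` call is quoted from the comment; `normalize_repo_full_name` is a hypothetical reimplementation that assumes the real `_normalize_repo_full_name` does nothing beyond trimming whitespace around the slash.

```python
import re


def web_repository_path(name: str) -> str:
    """Mirror of the quoted behaviour: drop every space from the name."""
    return name.replace(" ", "")


def normalize_repo_full_name(full_name: str) -> str:
    """Hypothetical normalizer: trim whitespace around '/' only.

    Case and dashes are left alone, as the comment describes.
    """
    return re.sub(r"\s*/\s*", "/", full_name.strip())


display_name = "My Group / My Project"
slug = "my-group/my-project"

# Old behaviour: the display name collapses to a path that does not exist.
print(web_repository_path(display_name))  # MyGroup/MyProject

# New behaviour: the slug has no spaces, so the replace is a no-op.
print(web_repository_path(slug))  # my-group/my-project

# Matching against the clean slug fails for display names, succeeds for slugs.
print(normalize_repo_full_name(display_name) == slug)  # False
print(normalize_repo_full_name(slug) == slug)  # True
```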
get_repo_url_path() requires config['path'] for GitLab repos and raises if it's missing, which is correct for production where path_with_namespace is always populated. The test fixtures were creating GitLab repos without it; set it so the fixtures reflect production reality.
bb78554 to 8d4bbeb
)

## Problem

`git_search` tool links hardcoded `<IconGithub />` for all SCM providers.

## Change

Reads `params.provider` (e.g. `'integrations:gitlab'`) from the tool link params and passes it to `getScmIcon()`. Returns `<IconGitlab />` for GitLab, `<IconGithub />` for everything else.

Using the explicit provider string (rather than URL hostname sniffing) means this works correctly for self-hosted GitLab instances whose domain doesn't contain `'gitlab'`.

Depends on getsentry/seer#7031 which adds `provider` to the `git_search` tool link params.

Companion backend fix: #118054

Fixes https://linear.app/getsentry/issue/CW-1528

---

[View Session in Sentry](https://sentry.sentry.io/traces/?project=4510944073809921&query=gen_ai.conversation.id%3A%22slack%3AC013P75SQUB%3A1781810595.615099%22)

---------

Co-authored-by: sentry-junior[bot] <264270552+sentry-junior[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
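The design choice above (explicit provider string rather than hostname sniffing) can be illustrated with a language-agnostic sketch. It is written in Python to match the backend code in this thread; the real `getScmIcon()` lives in the frontend and is not reproduced here. Both functions and the `git.example.com` URL are hypothetical, and the substring check is an assumed form of hostname sniffing.

```python
from urllib.parse import urlparse


def icon_from_hostname(url: str) -> str:
    """Hostname sniffing (assumed form): look for 'gitlab' in the host."""
    host = urlparse(url).hostname or ""
    return "IconGitlab" if "gitlab" in host else "IconGithub"


def icon_from_provider(provider: str) -> str:
    """Explicit provider string: GitLab icon for GitLab, GitHub icon otherwise."""
    return "IconGitlab" if provider == "integrations:gitlab" else "IconGithub"


# Works either way on gitlab.com:
print(icon_from_hostname("https://gitlab.com/my-group/my-project"))  # IconGitlab

# A self-hosted instance whose domain lacks 'gitlab' fools the hostname check:
print(icon_from_hostname("https://git.example.com/my-group/my-project"))  # IconGithub

# The provider string is independent of the domain:
print(icon_from_provider("integrations:gitlab"))  # IconGitlab
```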


Problem
For GitLab repos, `repo.name` is `name_with_namespace`: the human-readable display name, e.g. "My Group / My Project" (spaces around the slash). Splitting this on `/` to extract `owner` and `name` produces parts with leading/trailing whitespace, which causes Seer to construct SCM links that 404.

Root cause confirmed by comparing the GitLab integration's own `format_source_url`, which already uses `repo.config["path"]` (`path_with_namespace`, e.g. "my-group/my-project") to build valid URLs.

Changes
Introduce `get_repo_url_path(repo)` helper in `seer/autofix/utils.py`:
- `repo.config["path"]` for `integrations:gitlab` repos (URL-safe `path_with_namespace`).
- `repo.name` unchanged for GitHub and all other providers.

Applied at the four call sites that previously did `repo.name.split("/")`:
- `seer/autofix/utils.py`: `build_repo_definition_from_project_repo` and `get_autofix_repos_from_project_code_mappings`
- `seer/endpoints/project_seer_repos.py`: `_serialize_project_repo`
- `seer/agent/tools.py`: `get_repository_definition`

Testing
Added `test_gitlab_repo_uses_path_not_display_name` in `test_project_seer_repos.py` that creates a GitLab repo with a display name containing spaces ("My Group / My Project") and a `config.path` of "my-group/my-project", then asserts the serialized `owner`/`name` fields are the URL-safe values.

Companion frontend fix (icon): #118055
Fixes https://linear.app/getsentry/issue/CW-1528
View Session in Sentry