Skip to content

feat(scm-multi-platform-detection): Extending functionality with capped reads#117745

Merged
Abdkhan14 merged 6 commits into
abdk/multi-platform-detection-v8from
abdk/multi-platform-detection-v9
Jun 18, 2026
Merged

feat(scm-multi-platform-detection): Extending functionality with capped reads#117745
Abdkhan14 merged 6 commits into
abdk/multi-platform-detection-v8from
abdk/multi-platform-detection-v9

Conversation

@Abdkhan14

Copy link
Copy Markdown
Contributor

Unified _rule_parent_dirs and _framework_matches_scoped to handle all rule types — existence, match_content, and match_package — in a single pass each. Both now accept content_by_path and manifests_by_path; passing empty dicts for these in the existence pass makes content/package rules silently return empty sets with no special-casing needed.

Added content reads to detect_platforms_multi as a second high-confidence pass. After existence matching, _collect_needed_paths identifies candidate files, up to MAX_CONTENT_READS = 5 are fetched using the existing _get_repo_file_content and _parse_package_manifest helpers, sorted shallowest-first so root manifests are always within the cap. Supersession runs after both passes so content-detected frameworks can supersede existence matches.

Added TestCollectNeededPaths and TestDetectPlatformsMulti covering the major branches: content-only detection, package-only detection, the shallow-first cap, content-driven supersession, and the zero-reads case.

@github-actions github-actions Bot added the Scope: Backend Automatically applied to PRs that change backend components label Jun 15, 2026
@Abdkhan14 Abdkhan14 changed the title feat(scm-multi-platform-detection): Extending functionality with capp… feat(scm-multi-platform-detection): Extending functionality with capped reads Jun 16, 2026
@Abdkhan14 Abdkhan14 marked this pull request as ready for review June 16, 2026 17:35
@Abdkhan14 Abdkhan14 requested review from a team as code owners June 16, 2026 17:35

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 55baf7e. Configure here.

Comment thread src/sentry/integrations/github/multi_platform_detection.py Outdated
Comment on lines +360 to +370
if "match_content" in rule:
pattern = rule["match_content"]
path_filter = rule.get("path")
ext_filter = rule.get("match_ext")
result: set[str] = set()
for full_path, content in content_by_path.items():
basename = full_path.rsplit("/", 1)[-1]
if path_filter and basename != path_filter:
continue
if ext_filter and not basename.endswith(ext_filter):
continue

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The co-location check fails when an every rule matches a root directory and a some rule matches content in a nested file, causing framework detection to fail.
Severity: MEDIUM

Suggested Fix

The co-location logic should be adjusted to correctly handle cases where one scope is the root ("") and the other is a subdirectory. The check should succeed if the file's parent directory is within the directory matched by the every rule. This could involve treating the root scope as a valid parent for any subdirectory scope.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: src/sentry/integrations/github/multi_platform_detection.py#L360-L370

Potential issue: The framework detection logic incorrectly handles co-location checks
between directory-based (`every`) and content-based (`some`) rules. When an `every` rule
matches a directory at the repository root (e.g., `app/`), its scope is determined as
the root (`""`). If a `some` rule then matches content within a file inside that
directory (e.g., `app/build.gradle`), its scope is the subdirectory (`"app"`). The
subsequent check fails because the root scope `""` is not found within the
subdirectory's scope `{"app"}`. This leads to false negatives in framework detection,
particularly for Android projects where the `android` marker might only exist in a
nested `build.gradle` file.

Did we get this right? 👍 / 👎 to inform future reviews.

Abdkhan14 and others added 2 commits June 18, 2026 12:33
Adds unit and integration tests for the previously untested parts of
`multi_platform_detection.py`, and cleans up minor quality issues in the
existing test suite.

New test coverage:

`TestSelectActivePlatforms` — `MAX_LANGUAGES` cap, language grouping
(TypeScript + JavaScript → one javascript slot), `IGNORED_LANGUAGES`
skipped, byte-count descending ordering
`TestPathIsIgnored` — segment-vs-substring correctness (e.g.
build.gradle is not ignored even though build/ is an ignored directory)
`TestGetTree` — defensive parsing of non-dict responses, missing tree
key, truncated flag propagation
TestDetectPlatformsMulti (extended) — existence-only Pass 1 high match
with zero content reads, co-location false-positive prevention
end-to-end, high confidence sorts before medium regardless of byte count
Test quality cleanup:
@Abdkhan14 Abdkhan14 merged commit 5ab1180 into abdk/multi-platform-detection-v8 Jun 18, 2026
13 of 18 checks passed
@Abdkhan14 Abdkhan14 deleted the abdk/multi-platform-detection-v9 branch June 18, 2026 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants