fix(core): Bound git branch analysis to prevent subprocess exhaustion#2357
fix(core): Bound git branch analysis to prevent subprocess exhaustion#2357Abhishek2005-ard wants to merge 8 commits into
Conversation
… package is absent
|
@Abhishek2005-ard is attempting to deploy a commit to the Nisshchaya's projects Team on Vercel. A member of the Team first needs to authorize it. |
📝 WalkthroughWalkthroughThe PR adds an IDOR authorization check to the annotations API route, caps branch processing in ChangesSecurity, Stability & Test Infrastructure
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🎉 Thanks for your contribution, @Abhishek2005-ard!Your PR has passed our automated GSSoC quality checks. Here's a quick summary:
A maintainer will review your PR soon. Please be patient and available for feedback. 💪 GSSoC'26 automation · Maintainer: @nisshchayarathi |
1 similar comment
🎉 Thanks for your contribution, @Abhishek2005-ard!Your PR has passed our automated GSSoC quality checks. Here's a quick summary:
A maintainer will review your PR soon. Please be patient and available for feedback. 💪 GSSoC'26 automation · Maintainer: @nisshchayarathi |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (3)
app/api/auth/sessions/__tests__/route.test.ts (1)
17-17: ⚡ Quick winAssert
session.deleteManyis called in DELETE success path.Line 17 adds the mock, but no test verifies the DELETE handler actually invokes it. Add a named mock and assert it in the
"increments tokenVersion..."case so session invalidation remains covered.Suggested patch
const mockFindUnique = jest.fn(); const mockFindMany = jest.fn(); const mockUpdate = jest.fn(); +const mockDeleteMany = jest.fn(); jest.mock("`@/lib/prisma`", () => ({ __esModule: true, default: { @@ session: { findMany: (...args: any[]) => mockFindMany(...args), - deleteMany: jest.fn(), + deleteMany: (...args: any[]) => mockDeleteMany(...args), }, }, })); @@ it("increments tokenVersion and updates passwordChangedAt", async () => { @@ expect(mockUpdate).toHaveBeenCalledWith( @@ ); + expect(mockDeleteMany).toHaveBeenCalledWith({ where: { userId: 1 } }); const body = await response.json(); expect(body.message).toContain("terminated"); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/api/auth/sessions/__tests__/route.test.ts` at line 17, The deleteMany mock is defined but not asserted anywhere, leaving session invalidation uncovered by tests. Assign the jest.fn() call for deleteMany to a named variable so it can be referenced, then add an assertion in the "increments tokenVersion..." test case to verify that this deleteMany mock was called with the expected arguments when the DELETE handler executes the session invalidation path.lib/services/__tests__/repository-idor.test.ts (1)
237-255: ⚡ Quick winGuarantee spy cleanup with
try/finally.At Line 237 and Line 255, cleanup is not exception-safe. If any assertion fails before restore,
console.errorremains mocked for later tests.Suggested patch
const testInvalidRole = async (injectedRoleValue: any) => { const consoleErrorSpy = jest.spyOn(console, "error").mockImplementation(() => {}); - (prisma.repository.findUnique as jest.Mock).mockResolvedValue({ - id: targetRepoId, - userId: directOwnerId, - }); - (prisma.repositoryPolicyAssignment.findUnique as jest.Mock).mockResolvedValue({ - organizationId: orgId, - }); - (prisma.organizationMember.findUnique as jest.Mock).mockResolvedValue({ - role: injectedRoleValue, - }); - - const result = await RepositoryAccess.checkAccess(targetRepoId, nonOwnerId); - expect(result.allowed).toBe(false); - expect(result.role).toBeUndefined(); - expect(result.reason).toContain("Invalid organization role"); - expect(result.repositoryExists).toBe(true); - - consoleErrorSpy.mockRestore(); + try { + (prisma.repository.findUnique as jest.Mock).mockResolvedValue({ + id: targetRepoId, + userId: directOwnerId, + }); + (prisma.repositoryPolicyAssignment.findUnique as jest.Mock).mockResolvedValue({ + organizationId: orgId, + }); + (prisma.organizationMember.findUnique as jest.Mock).mockResolvedValue({ + role: injectedRoleValue, + }); + + const result = await RepositoryAccess.checkAccess(targetRepoId, nonOwnerId); + expect(result.allowed).toBe(false); + expect(result.role).toBeUndefined(); + expect(result.reason).toContain("Invalid organization role"); + expect(result.repositoryExists).toBe(true); + } finally { + consoleErrorSpy.mockRestore(); + } };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/services/__tests__/repository-idor.test.ts` around lines 237 - 255, The consoleErrorSpy mock cleanup is not exception-safe because if any assertion fails before reaching consoleErrorSpy.mockRestore(), the spy remains active and affects subsequent tests. Wrap the code that performs the mocks and assertions (from the prisma mock setup through the expect statements) in a try/finally block, moving consoleErrorSpy.mockRestore() to the finally block to guarantee it runs regardless of whether any test assertion throws an error.lib/services/__tests__/data-residency.test.ts (1)
7-13: Use partial mock pattern to avoid brittle test when dependencies change.The test currently mocks the entire
@prisma/clientmodule with onlyDataResidencyRegion. While the services being tested (region-router,compliance-enforcement,region-ai-router) currently import onlyDataResidencyRegion, this pattern becomes fragile if these services are updated to import additional exports (types, enums, etc.). Safer pattern: usejest.requireActual("@prisma/client")and override onlyDataResidencyRegion:jest.mock("`@prisma/client`", () => ({ ...jest.requireActual("`@prisma/client`"), DataResidencyRegion: { US: "US", EU: "EU", APAC: "APAC", }, }));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/services/__tests__/data-residency.test.ts` around lines 7 - 13, The jest.mock for `@prisma/client` in the test file is incomplete and only defines DataResidencyRegion, which makes the test brittle to future changes. If the services being tested (region-router, compliance-enforcement, region-ai-router) later import additional exports from `@prisma/client`, those imports will be undefined. Fix this by using the partial mock pattern: spread all actual exports from `@prisma/client` using jest.requireActual("`@prisma/client`") in the mock callback, then override only the DataResidencyRegion object with the test values. This preserves all other module exports while safely mocking only what you need.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/api/annotations/route.ts`:
- Around line 20-31: Add strict numeric validation for the repositoryId
parameter at the beginning of the route handler, before any Prisma queries. Use
a validation method that rejects partial strings and returns NaN for non-numeric
inputs (such as checking if the string matches a numeric pattern or using
Number.isInteger after parsing). If repositoryId fails validation, return a 400
Bad Request response. Then store the validated numeric ID in a variable and
reuse it in both the findUnique query (around line 20) and the findMany query
(around line 31) instead of calling parseInt multiple times, which eliminates
the risk of accepting malformed input like "12abc" and ensures consistent
validation.
In `@lib/services/gitService.ts`:
- Around line 335-338: Move the abort check that validates opts?.signal?.aborted
to occur before the spawn call, rather than after. Currently the check happens
after the git clone subprocess has already been started, which wastes resources
for already-aborted operations. Relocate the conditional block containing the
opts?.signal?.aborted check with its reject call to execute before spawn is
invoked, ensuring that aborted signals are detected and handled before any
subprocess resources are allocated.
- Around line 435-451: The git for-each-ref command in the gitService.ts file
currently outputs all refs to stdout even though only MAX_BRANCHES_TO_PROCESS
entries are used. Add the --count=${MAX_BRANCHES_TO_PROCESS} flag to the
for-each-ref command array (alongside the existing flags like
--sort=-committerdate and --format) to limit the output at the Git command
level, preventing unnecessary work and memory consumption for repositories with
large ref sets.
In `@next.config.js`:
- Around line 17-23: The broad catch block in the withBundleAnalyzer
initialization swallows all exceptions, hiding legitimate configuration errors.
Narrow the catch block to specifically check if the caught error is a
MODULE_NOT_FOUND error (by examining the error code property); if it is, set
withBundleAnalyzer to the no-op fallback function, otherwise rethrow the error
to surface any genuine configuration issues.
---
Nitpick comments:
In `@app/api/auth/sessions/__tests__/route.test.ts`:
- Line 17: The deleteMany mock is defined but not asserted anywhere, leaving
session invalidation uncovered by tests. Assign the jest.fn() call for
deleteMany to a named variable so it can be referenced, then add an assertion in
the "increments tokenVersion..." test case to verify that this deleteMany mock
was called with the expected arguments when the DELETE handler executes the
session invalidation path.
In `@lib/services/__tests__/data-residency.test.ts`:
- Around line 7-13: The jest.mock for `@prisma/client` in the test file is
incomplete and only defines DataResidencyRegion, which makes the test brittle to
future changes. If the services being tested (region-router,
compliance-enforcement, region-ai-router) later import additional exports from
`@prisma/client`, those imports will be undefined. Fix this by using the partial
mock pattern: spread all actual exports from `@prisma/client` using
jest.requireActual("`@prisma/client`") in the mock callback, then override only
the DataResidencyRegion object with the test values. This preserves all other
module exports while safely mocking only what you need.
In `@lib/services/__tests__/repository-idor.test.ts`:
- Around line 237-255: The consoleErrorSpy mock cleanup is not exception-safe
because if any assertion fails before reaching consoleErrorSpy.mockRestore(),
the spy remains active and affects subsequent tests. Wrap the code that performs
the mocks and assertions (from the prisma mock setup through the expect
statements) in a try/finally block, moving consoleErrorSpy.mockRestore() to the
finally block to guarantee it runs regardless of whether any test assertion
throws an error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 3bd31082-1dad-460f-a6ad-b0b6e2c223ff
⛔ Files ignored due to path filters (5)
package-lock.jsonis excluded by!**/package-lock.jsonpublic/uploads/avatars/1001/1780676986116_emhbae.jpgis excluded by!**/*.jpgpublic/uploads/avatars/1001/1781627118139_jtnw3e.jpgis excluded by!**/*.jpgpublic/uploads/avatars/1001/1781627224731_8t7syn.jpgis excluded by!**/*.jpgpublic/uploads/avatars/1001/1781627523291_lqgcv7.jpgis excluded by!**/*.jpg
📒 Files selected for processing (14)
app/api/annotations/route.tsapp/api/auth/sessions/__tests__/route.test.tsapp/api/upload/avatar/__tests__/route.test.tsapp/api/users/profile/__tests__/route.test.tsdiff.txtjest.config.cjslib/services/__tests__/data-residency.test.tslib/services/__tests__/repository-idor.test.tslib/services/gitService.tsnext.config.jspackage.jsonprs.jsonsrc/components/layout/__tests__/Navbar.test.tsxtest-results.json
| id: parseInt(repositoryId), | ||
| userId: user.userId, | ||
| } | ||
| }); | ||
|
|
||
| if (!repo) { | ||
| return NextResponse.json({ error: "Repository not found or access denied" }, { status: 403 }); | ||
| } | ||
|
|
||
| const annotations = await prisma.mapAnnotation.findMany({ | ||
| where: { | ||
| repositoryId: parseInt(repositoryId), |
There was a problem hiding this comment.
Validate repositoryId once with strict numeric checks before Prisma queries.
parseInt() on Line 20 and Line 31 can accept partial values (e.g. "12abc") and can also yield NaN, which can turn malformed client input into a server-side failure path instead of a clean 400.
Suggested fix
const { searchParams } = new URL(request.url);
const repositoryId = searchParams.get("repositoryId");
if (!repositoryId) {
return NextResponse.json({ error: "repositoryId is required" }, { status: 400 });
}
+ const repositoryIdNum = Number(repositoryId);
+ if (!Number.isInteger(repositoryIdNum) || repositoryIdNum <= 0) {
+ return NextResponse.json({ error: "Invalid repositoryId" }, { status: 400 });
+ }
// Verify user has access to the repository to prevent IDOR
const repo = await prisma.repository.findFirst({
where: {
- id: parseInt(repositoryId),
+ id: repositoryIdNum,
userId: user.userId,
}
});
if (!repo) {
return NextResponse.json({ error: "Repository not found or access denied" }, { status: 403 });
}
const annotations = await prisma.mapAnnotation.findMany({
where: {
- repositoryId: parseInt(repositoryId),
+ repositoryId: repositoryIdNum,
},🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@app/api/annotations/route.ts` around lines 20 - 31, Add strict numeric
validation for the repositoryId parameter at the beginning of the route handler,
before any Prisma queries. Use a validation method that rejects partial strings
and returns NaN for non-numeric inputs (such as checking if the string matches a
numeric pattern or using Number.isInteger after parsing). If repositoryId fails
validation, return a 400 Bad Request response. Then store the validated numeric
ID in a variable and reuse it in both the findUnique query (around line 20) and
the findMany query (around line 31) instead of calling parseInt multiple times,
which eliminates the risk of accepting malformed input like "12abc" and ensures
consistent validation.
| if (opts?.signal?.aborted) { | ||
| reject(new Error("Repository clone aborted")); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Move the abort precheck before spawning git clone.
The check on Line 335 happens after Line 304 already starts the subprocess, so already-aborted jobs can still consume process/resources briefly. Put the abort guard before spawn(...).
Suggested fix
return new Promise((resolve, reject) => {
+ if (opts?.signal?.aborted) {
+ reject(new Error("Repository clone aborted"));
+ return;
+ }
+
const child = spawn("git", args, {
stdio: ["ignore", "pipe", "pipe"],
env: {
...process.env,
GIT_TERMINAL_PROMPT: "0",
GCM_INTERACTIVE: "Never",
GIT_LFS_SKIP_SMUDGE: "1",
},
timeout: GIT_CLONE_TIMEOUT_MS,
signal: opts?.signal,
});
...
- if (opts?.signal?.aborted) {
- reject(new Error("Repository clone aborted"));
- return;
- }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/services/gitService.ts` around lines 335 - 338, Move the abort check that
validates opts?.signal?.aborted to occur before the spawn call, rather than
after. Currently the check happens after the git clone subprocess has already
been started, which wastes resources for already-aborted operations. Relocate
the conditional block containing the opts?.signal?.aborted check with its reject
call to execute before spawn is invoked, ensuring that aborted signals are
detected and handled before any subprocess resources are allocated.
| [ | ||
| "for-each-ref", | ||
| "--sort=-committerdate", | ||
| "--format=%(refname:short)|%(committerdate:iso)|%(objectname)", | ||
| "refs/heads/", | ||
| "refs/remotes/origin/", | ||
| ], | ||
| { timeout: DEFAULT_GIT_TIMEOUT_MS, signal }, | ||
| ); | ||
|
|
||
| const lines = stdout.trim().split("\n").filter(Boolean); | ||
| const seenBranches = new Set<string>(); | ||
| const refEntries: { name: string; fullName: string; date: string }[] = []; | ||
| let refEntries: { name: string; fullName: string; date: string }[] = []; | ||
|
|
||
| for (const line of lines) { | ||
| if (refEntries.length >= MAX_BRANCHES_TO_PROCESS) break; | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does git for-each-refsupport--count=, and what is the minimum Git version that supports it?
💡 Result:
Yes, the git for-each-ref command supports the --count= option, which instructs the command to stop after displaying the specified number of references [1][2]. This option has been available since the command was first introduced. Historical records indicate that the --count option was part of the original implementation of git-for-each-ref [3], which was merged into the Git codebase in 2005 (early versions of Git) [3]. Therefore, it is supported in all versions of Git that include this command [3].
Citations:
- 1: https://git-scm.com/docs/git-for-each-ref
- 2: https://man7.org/linux/man-pages/man1/git-for-each-ref.1.html
- 3: https://rtime.ciirc.cvut.cz/gitweb/git.git/commitdiff/9f613ddd21cbd05bfc139d9b1551b5780aa171f6
Add --count flag to bound for-each-ref output at the Git command level.
Currently, for-each-ref emits all refs to stdout even though only the first MAX_BRANCHES_TO_PROCESS entries are used in JS. For repositories with large ref sets, this leaves git's upstream work and memory consumption unbounded. Add --count=${MAX_BRANCHES_TO_PROCESS} directly to the command to limit output at the source.
Suggested fix
const { stdout } = await this.spawnGit(
[
"for-each-ref",
"--sort=-committerdate",
+ `--count=${MAX_BRANCHES_TO_PROCESS}`,
"--format=%(refname:short)|%(committerdate:iso)|%(objectname)",
"refs/heads/",
"refs/remotes/origin/",
],
{ timeout: DEFAULT_GIT_TIMEOUT_MS, signal },
);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/services/gitService.ts` around lines 435 - 451, The git for-each-ref
command in the gitService.ts file currently outputs all refs to stdout even
though only MAX_BRANCHES_TO_PROCESS entries are used. Add the
--count=${MAX_BRANCHES_TO_PROCESS} flag to the for-each-ref command array
(alongside the existing flags like --sort=-committerdate and --format) to limit
the output at the Git command level, preventing unnecessary work and memory
consumption for repositories with large ref sets.
| try { | ||
| withBundleAnalyzer = require('@next/bundle-analyzer')({ | ||
| enabled: process.env.ANALYZE === 'true', | ||
| }); | ||
| } catch { | ||
| withBundleAnalyzer = (config) => config; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether next.config.js has a broad catch and no MODULE_NOT_FOUND guard.
# Expected for current code: broad catch present, no explicit module-not-found filter.
fd -a "^next\\.config\\.js$" | xargs -I{} sh -c '
echo "== {} ==";
rg -n "try|catch|`@next/bundle-analyzer`|MODULE_NOT_FOUND" "{}";
'Repository: nisshchayarathi/gitverse-nextjs
Length of output: 198
Narrow the fallback to only missing-module errors.
The broad catch block at line 21 swallows all exceptions, which can hide legitimate analyzer configuration errors and make failures nondeterministic. Catch only MODULE_NOT_FOUND for @next/bundle-analyzer, and rethrow everything else.
Suggested patch
let withBundleAnalyzer;
try {
withBundleAnalyzer = require('`@next/bundle-analyzer`')({
enabled: process.env.ANALYZE === 'true',
});
-} catch {
- withBundleAnalyzer = (config) => config;
+} catch (error) {
+ const isMissingAnalyzer =
+ error &&
+ error.code === 'MODULE_NOT_FOUND' &&
+ typeof error.message === 'string' &&
+ error.message.includes('`@next/bundle-analyzer`');
+
+ if (!isMissingAnalyzer) {
+ throw error;
+ }
+
+ withBundleAnalyzer = (config) => config;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try { | |
| withBundleAnalyzer = require('@next/bundle-analyzer')({ | |
| enabled: process.env.ANALYZE === 'true', | |
| }); | |
| } catch { | |
| withBundleAnalyzer = (config) => config; | |
| } | |
| try { | |
| withBundleAnalyzer = require('`@next/bundle-analyzer`')({ | |
| enabled: process.env.ANALYZE === 'true', | |
| }); | |
| } catch (error) { | |
| const isMissingAnalyzer = | |
| error && | |
| error.code === 'MODULE_NOT_FOUND' && | |
| typeof error.message === 'string' && | |
| error.message.includes('`@next/bundle-analyzer`'); | |
| if (!isMissingAnalyzer) { | |
| throw error; | |
| } | |
| withBundleAnalyzer = (config) => config; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@next.config.js` around lines 17 - 23, The broad catch block in the
withBundleAnalyzer initialization swallows all exceptions, hiding legitimate
configuration errors. Narrow the catch block to specifically check if the caught
error is a MODULE_NOT_FOUND error (by examining the error code property); if it
is, set withBundleAnalyzer to the no-op fallback function, otherwise rethrow the
error to surface any genuine configuration issues.
What does this PR do?
Fixes a severe performance bottleneck and DoS vulnerability in the branch analysis phase.
By default, the application was executing a subprocess for every branch found in a repository to retrieve commit counts. For extraordinarily large repositories, this unbounded loop caused the server to hang or crash due to resource starvation.
This update strictly limits processing to the top 100 most recently updated branches by leveraging
git for-each-ref --sort=-committerdate, ensuring system stability without sacrificing the relevance of the parsed metrics.Related Issues
Checklist
MAX_BRANCHES_TO_PROCESS(100).close #2354
Summary by CodeRabbit
Release Notes
Bug Fixes
Dependencies
Tests