Skip to content

feat: allow to scan secrets without buffering whole lines #6318

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

pieh
Copy link
Contributor

@pieh pieh commented May 16, 2025

🎉 Thanks for submitting a pull request! 🎉

Summary

Possible fix for https://linear.app/netlify/issue/FRB-1778/oom-during-ef-bundling (it's not really confirmed, other than logs pointing to secret scanning being a problem and buffering whole very large lines seems like only thing that I managed to reproduce OOM problems with)

For review, I suggest checking commits 1 by 1 as well as toggling "Ignore whitespace changes" in diff viewer.

Description

This adds a new scanning method that is not relying on buffering whole lines in scanned files (through node:readline) and instead allow to operate on smaller chunks of stream ( max length of secrets to scan + whatever size stream.on('data') would result in)

Because this is new method I want to be careful and introduce feature flag that opt into new search algorithm (feature flag not yet created in devcycle, because I don't really like secret_scanning_no_readline name (inverted boolean making it confusing) but didn't come up with better name yet. Assuming no problems are discovered as feature flag is rolled out, this should be cleaned up from the code.

I did run some benchmarks against netlify-react-ui build dir ( 48459 files to scan there) using https://gist.github.com/pieh/a3eccd67cd0c27017acbbf18a87a1c55 and the results are comparable so it should not result in performance regression:

readline x 0.25 ops/sec ±1.29% (31 runs sampled)
NO readline x 0.28 ops/sec ±0.87% (31 runs sampled)
Fastest is NO readline

Goal of the change was not to improve performance, but rather allow secret scanning to work with less available memory and allow Node's garbage collection to function without causing perf regression.

Tests
  • (first commit) I converted all snapshot tests for secret scanning to explicit assertions (needed to run same tests in multiple variants without causing problems and just make it more explicit on what actually is being asserted)
  • (second commit) Added new test (and fixture) with purposefully limited heap size which does fail with OOM error that does look like same error as on customer builds (that does not guarantee it's same problem, as all OOM errors would look similar)
  • (third commit) together with new implementation I made test setup run all existing tests with both implementations and made test added in second commit only run with new implementation (as it will never pass with current implementation)

For us to review and ship your PR efficiently, please perform the following steps:

  • Open a bug/issue before writing your code 🧑‍💻. This ensures
    we can discuss the changes and get feedback from everyone that should be involved. If you`re fixing a typo or
    something that`s on fire 🔥 (e.g. incident related), you can skip this step.
  • Read the contribution guidelines 📖. This ensures
    your code follows our style guide and passes our tests.
  • Update or add tests (if any source code was changed or added) 🧪
  • Update or add documentation (if features were changed or added) 📝
  • Make sure the status checks below are successful ✅

A picture of a cute animal (not mandatory, but encouraged)

Copy link
Contributor

This pull request adds or modifies JavaScript (.js, .cjs, .mjs) files.
Consider converting them to TypeScript.

@pieh pieh changed the title Mem/smaller chunks to scan feat: allow to scan secrets without buffering whole lines May 16, 2025
@pieh pieh force-pushed the mem/smaller-chunks-to-scan branch from 9b3e60a to fc32c24 Compare May 16, 2025 11:08
Copy link
Contributor

This pull request adds or modifies JavaScript (.js, .cjs, .mjs) files.
Consider converting them to TypeScript.

Copy link
Contributor

This pull request adds or modifies JavaScript (.js, .cjs, .mjs) files.
Consider converting them to TypeScript.

Copy link
Contributor

This pull request adds or modifies JavaScript (.js, .cjs, .mjs) files.
Consider converting them to TypeScript.

@pieh pieh marked this pull request as ready for review May 16, 2025 12:31
@pieh pieh requested a review from a team as a code owner May 16, 2025 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant