-
Notifications
You must be signed in to change notification settings - Fork 447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize vulnerability host counts #24914
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #24914 +/- ##
==========================================
+ Coverage 63.56% 63.80% +0.23%
==========================================
Files 1602 1608 +6
Lines 151820 153090 +1270
Branches 3952 3952
==========================================
+ Hits 96511 97673 +1162
- Misses 47624 47625 +1
- Partials 7685 7792 +107
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
In my opinion, we should not use concurrent batches for this DB access because it will put an extra load on the DB reader. We know that the vulnerability cron uses a lot of CPU/DB resources, so the goal should be to smooth out the performance spikes. One way to do that is to pause for some time (500ms?) between each batch. |
Counterpoint to the above: smaller batches should help DB load massively, and we can expose an override for concurrency as an env var (with a default that we've confirmed as working in load test) so customers can tune how hard the tool hits the DB. My current guess is that this will be significantly lighter on the DB in load test, even concurrently, than the old massive temp table method (and I think we'll get useful info from loadtest here), and with the env var in place we can tune things easily enough once this hits production workloads. |
Also, given that we're talking about 5 concurrent sets of queries, if someone is going over prepared statement maximums based on these changes:
|
Started a Slack thread about which data sets to use to test this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only feedback here is on the routines naming. I also see where the host count is bugged but I'll fix that in a PR stacked on top of this one.
@@ -1257,6 +1258,11 @@ func (man Manager) addConfigs() { | |||
false, | |||
"Don't sync installed Windows updates nor perform Windows OS vulnerability processing.", | |||
) | |||
man.addConfigInt( | |||
"vulnerabilities.max_routines", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"vulnerabilities.max_routines", | |
"vulnerabilities.max_concurrency", |
Seems like "concurrency" is more self-evident here. Guessing you cycled through that as a naming idea here, so it'd be useful to understand why this naming convention won.
#22364
Batching selects aggregating host counts based on CVE and adding concurrency.
changes/
,orbit/changes/
oree/fleetd-chrome/changes
.See Changes files for more information.
SELECT *
is avoided, SQL injection is prevented (using placeholders for values in statements)