Optimize vulnerability host counts #24914

mostlikelee · 2024-12-19T18:10:26Z

Batches the SELECT queries that aggregate host counts by CVE and runs those batches concurrently.
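
As a rough illustration of the batching idea (not the code in this PR), the sketch below splits a CVE list into fixed-size chunks and builds one parameterized `SELECT` per chunk. The table name `host_vulnerabilities`, the column names, and the helper names `chunk` and `buildCountQuery` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// chunk splits items into consecutive slices of at most size elements.
// It never returns an empty batch.
func chunk(items []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var out [][]string
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

// buildCountQuery returns a parameterized statement and its arguments for
// one non-empty batch of CVEs. Table and column names are hypothetical.
func buildCountQuery(cves []string) (string, []any) {
	// One "?" placeholder per CVE, so values are never spliced into the SQL.
	placeholders := strings.TrimSuffix(strings.Repeat("?,", len(cves)), ",")
	query := "SELECT cve, COUNT(DISTINCT host_id) AS host_count " +
		"FROM host_vulnerabilities WHERE cve IN (" + placeholders + ") GROUP BY cve"
	args := make([]any, len(cves))
	for i, c := range cves {
		args[i] = c
	}
	return query, args
}

func main() {
	cves := []string{"CVE-2024-0001", "CVE-2024-0002", "CVE-2024-0003"}
	for _, batch := range chunk(cves, 2) {
		q, args := buildCountQuery(batch)
		fmt.Println(q, args)
	}
}
```

Each batch yields a small statement with a bounded number of placeholders, which is the property the discussion below relies on when comparing this approach to a single large query.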

- Changes file added for user-visible changes in changes/, orbit/changes/ or ee/fleetd-chrome/changes. See Changes files for more information.
- Input data is properly validated, SELECT * is avoided, SQL injection is prevented (using placeholders for values in statements)
- Added/updated tests
- Manual QA for all new/changed functionality

codecov · 2024-12-19T18:30:00Z

Codecov Report

Attention: Patch coverage is 81.16883% with 29 lines in your changes missing coverage. Please review.

Project coverage is 63.80%. Comparing base (e78bf6e) to head (0d43ea6).
Report is 102 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| server/datastore/mysql/vulnerabilities.go | 82.06% | 19 Missing and 7 partials ⚠️ |
| cmd/fleet/cron.go | 0.00% | 3 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #24914      +/-   ##
==========================================
+ Coverage   63.56%   63.80%   +0.23%     
==========================================
  Files        1602     1608       +6     
  Lines      151820   153090    +1270     
  Branches     3952     3952              
==========================================
+ Hits        96511    97673    +1162     
- Misses      47624    47625       +1     
- Partials     7685     7792     +107

| Flag | Coverage Δ |
|---|---|
| backend | `64.61% <81.16%> (+0.24%)` ⬆️ |


getvictor · 2024-12-19T19:42:22Z

In my opinion, we should not use concurrent batches for this DB access because it will put an extra load on the DB reader.

We know that the vulnerability cron uses a lot of CPU/DB resources, so the goal should be to smooth out the performance spikes. One way to do that is to pause for some time (500ms?) between each batch.
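
A minimal sketch of the sequential alternative suggested here, assuming a hypothetical helper `processInBatches` and placeholder CVE batches; the 500ms value is the one floated above, not a tested setting.

```go
package main

import (
	"fmt"
	"time"
)

// processInBatches runs fn on each batch one at a time and sleeps for pause
// between consecutive batches to spread the load over time.
func processInBatches(batches [][]string, pause time.Duration, fn func([]string) error) error {
	for i, b := range batches {
		if err := fn(b); err != nil {
			return fmt.Errorf("batch %d: %w", i, err)
		}
		if i < len(batches)-1 {
			time.Sleep(pause) // no pause is needed after the final batch
		}
	}
	return nil
}

func main() {
	batches := [][]string{{"CVE-2024-0001"}, {"CVE-2024-0002"}}
	err := processInBatches(batches, 500*time.Millisecond, func(b []string) error {
		// A real implementation would run the host count query here.
		fmt.Println("counting hosts for", b)
		return nil
	})
	fmt.Println("done, err =", err)
}
```

The trade-off is that total run time grows by roughly the pause multiplied by the number of batches, in exchange for a flatter load profile.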

iansltx · 2024-12-19T19:47:21Z

Counterpoint to the above: smaller batches should help DB load massively, and we can expose an override for concurrency as an env var (with a default that we've confirmed as working in load test) so customers can tune how hard the tool hits the DB. My current guess is that this will be significantly lighter on the DB in load test, even concurrently, than the old massive temp table method (and I think we'll get useful info from loadtest here), and with the env var in place we can tune things easily enough once this hits production workloads.
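
To make the proposal concrete, here is a hedged sketch of bounded concurrency with an env var override. The variable name `VULN_MAX_CONCURRENCY` and the helpers `maxConcurrency` and `runBounded` are invented for this example and are not Fleet's actual configuration or code; the default of 5 is only a placeholder.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"sync"
)

// maxConcurrency reads a (hypothetical) env var and falls back to def when
// the variable is unset, not an integer, or less than 1.
func maxConcurrency(envVar string, def int) int {
	n, err := strconv.Atoi(os.Getenv(envVar))
	if err != nil || n < 1 {
		return def
	}
	return n
}

// runBounded calls fn for every batch with at most limit goroutines in
// flight. All batches are attempted; the first error observed is returned.
func runBounded(batches [][]string, limit int, fn func([]string) error) error {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // counting semaphore
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for _, b := range batches {
		sem <- struct{}{} // blocks while limit workers are busy
		wg.Add(1)
		go func(batch []string) {
			defer wg.Done()
			defer func() { <-sem }()
			if err := fn(batch); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(b)
	}
	wg.Wait()
	return firstErr
}

func main() {
	limit := maxConcurrency("VULN_MAX_CONCURRENCY", 5)
	batches := [][]string{{"CVE-2024-0001"}, {"CVE-2024-0002"}, {"CVE-2024-0003"}}
	err := runBounded(batches, limit, func(b []string) error {
		// A real implementation would run the host count query here.
		fmt.Println("counting hosts for", b)
		return nil
	})
	fmt.Println("limit:", limit, "err:", err)
}
```

Setting the limit to 1 degrades this to sequential processing, so the same code path can cover both positions in this thread.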

iansltx · 2024-12-19T19:48:55Z

Also, given that we're talking about 5 concurrent sets of queries, if someone is going over prepared statement maximums based on these changes:

- They were running too hot to begin with
- They can adjust concurrency down via the proposed env var

iansltx · 2024-12-20T19:08:22Z

Started a Slack thread about which data sets to use to test this.

iansltx

Only feedback here is on the routines naming. I also see where the host count is bugged but I'll fix that in a PR stacked on top of this one.

iansltx · 2024-12-24T00:30:53Z

server/config/config.go

@@ -1257,6 +1258,11 @@ func (man Manager) addConfigs() {
 		false,
 		"Don't sync installed Windows updates nor perform Windows OS vulnerability processing.",
 	)
+	man.addConfigInt(
+		"vulnerabilities.max_routines",


Suggested change

-		"vulnerabilities.max_routines",
+		"vulnerabilities.max_concurrency",

Seems like "concurrency" is more self-evident here. Guessing you cycled through that as a naming idea here, so it'd be useful to understand why this naming convention won.

mostlikelee added 7 commits December 17, 2024 13:30

batch global counts

0f34086

global count concurrency

4771c0e

concurrent no team counts

0728f11

batch team counts

d2efbb3

remove duration logging

b965701

refactor

9052c4e

changelog

94a349a

mostlikelee requested a review from a team as a code owner December 19, 2024 18:10

mostlikelee temporarily deployed to Docker Hub December 19, 2024 18:10 — with GitHub Actions Inactive

mostlikelee temporarily deployed to Docker Hub December 19, 2024 19:35 — with GitHub Actions Inactive

iansltx mentioned this pull request Dec 19, 2024

Affected hosts on Vulnerability does not match #24319

Closed

lucasmrod assigned iansltx Dec 23, 2024

mostlikelee marked this pull request as draft December 23, 2024 18:11

add max routines config

0d43ea6

mostlikelee temporarily deployed to Docker Hub December 23, 2024 20:24 — with GitHub Actions Inactive

mostlikelee marked this pull request as ready for review December 23, 2024 20:53

mostlikelee mentioned this pull request Dec 23, 2024

Missing software titles in software tab #24087

Closed

iansltx reviewed Dec 24, 2024
