Affected hosts on Vulnerability does not match #24319
Comments
Found this old bug: #21848. This appears to be the same issue. This might happen because the count on the /software/vulnerabilities list view is populated from the It seems the host count on the /hosts/manage page is calculated on page load, so the numbers could be slightly different, depending on whether something has changed on a few of the hosts since the last time the cron job ran. You can test this by finding vulnerable software currently installed on a host on the Software >> Vulnerabilities page, then deleting the software on the host and refetching host vitals on the "My device" page. Once that has finished, navigate back to the Software >> Vulnerabilities page and filter by the CVE. The host count will not have updated on that page yet, but if you click "view affected hosts", the host for which you resolved the vulnerability will no longer be listed and the total count will be off by 1. Tagging @noahtalerman to confirm whether this behavior is expected based on the way the UI retrieves and displays information, or whether further investigation into alternative methods is warranted. |
@jmwatts that's right. It's likely this is confusing UX rather than a bug. We can confirm that it's confusing UX (not a bug) by manually triggering the cron job and making sure the updated count on the Vulnerabilities table matches the count on the Hosts page. |
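To make the stale-count explanation above concrete, here is a minimal sketch using Python's built-in `sqlite3`. It is not Fleet's implementation: the `cached_host_counts` table, the `software_id` column, and the `refresh_cached_counts` function are hypothetical stand-ins for a count that is materialized by a periodic job, compared against a count computed at request time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hosts (id INTEGER PRIMARY KEY);
CREATE TABLE host_software (host_id INTEGER, software_id INTEGER);
-- Hypothetical cache table, refreshed only by the periodic job.
CREATE TABLE cached_host_counts (software_id INTEGER PRIMARY KEY, hosts_count INTEGER);
""")
conn.executemany("INSERT INTO hosts (id) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO host_software VALUES (?, 10)", [(1,), (2,), (3,)])


def refresh_cached_counts(conn):
    """Stand-in for the cron job: recompute the cached per-software host counts."""
    conn.execute("DELETE FROM cached_host_counts")
    conn.execute("""
        INSERT INTO cached_host_counts (software_id, hosts_count)
        SELECT software_id, COUNT(DISTINCT host_id)
        FROM host_software
        GROUP BY software_id
    """)


def cached_count(conn, software_id):
    """Count as a list view would read it: from the cache, possibly stale."""
    row = conn.execute(
        "SELECT hosts_count FROM cached_host_counts WHERE software_id = ?",
        (software_id,),
    ).fetchone()
    return row[0] if row else 0


def live_count(conn, software_id):
    """Count as a filtered hosts page would compute it: at request time."""
    return conn.execute(
        "SELECT COUNT(DISTINCT host_id) FROM host_software WHERE software_id = ?",
        (software_id,),
    ).fetchone()[0]


refresh_cached_counts(conn)

# Host 3 removes the vulnerable software after the last refresh.
conn.execute("DELETE FROM host_software WHERE host_id = 3 AND software_id = 10")
stale = cached_count(conn, 10)  # still reflects the last refresh
live = live_count(conn, 10)     # reflects the current rows

refresh_cached_counts(conn)
fresh = cached_count(conn, 10)  # matches the live count again
```

In this sketch the two numbers disagree only between the host change and the next refresh, which is the behavior the manual cron trigger is meant to confirm.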
Hey team! Please add your planning poker estimate with Zenhub @jacobshandling @RachelElysia |
Looks like running the |
@noahtalerman Maybe it makes sense to add a "host counts updated at" UI element to the individual vuln page...we have this in the API endpoint that already gets pulled for the vuln info...so it's more obvious when problems like this show up because the vulnerabilities cron hasn't run recently? |
Took a look at the numbers myself and there's a significant discrepancy even with a recent vulns cron run. Awaiting a DB dump so I can troubleshoot further, as I figure this'll require query introspection etc. |
For my own reference, the vuln being looked at is CVE-2024-44187, which is cross-platform and has a mix of OSes and software affected. |
So, pulling down this environment does get me differing host counts between the filter and the calculation on the vulnerability page. Catch is, if I rerun the vulns cron using What's weird here is the vulnerabilities cron is showing as completed hourly as expected in the DB snapshot. My gut feeling is that we're dealing with an incomplete vulns run due to DB inconsistencies because this environment has at times run off of Once we have a clean snapshot in these environments...and once we're running off of a build including #24914, we should be able to properly analyze this, but until those two items are closed out I think anything we see here will be a red herring, so putting this ticket on the back burner until those two are merged. |
@iansltx agreed this makes sense. I think not adding an "updated at" timestamp for the Vulnerability details page might have been a Product Design miss. It looks like we're also missing the "updated at" timestamp on the OS version page: And we're missing it on the Software version details page: We only have it on the Software title details page: @eugkuo I passed this bug to you. I think it needs some Product Design help to determine where to put this timestamp. Maybe it shows up on hover over "Hosts"? |
@noahtalerman So, there may be a genuine underlying issue (beyond #24933) in the counts themselves, rather than just lack of visibility on when they're updated. Mind if I split off the UX changes from your comment into a new issue (still assigned to @eugkuo), while keeping this one for troubleshooting vuln counts in the QAWolf environment? I think that split also helps QAWolf keep track of things, vs. switching the scope of this ticket. |
UI improvements split :) |
So, I triggered the vulnerabilities job on the QAWolf environment and I think there's an assumption baked into the total count where it shouldn't be. Specifically, the total count seems to assume that for a given CVE a given host will either have an OS-level vulnerability or a software-level vulnerability, but not both. For this particular vuln both macOS and Safari have the vuln, so there are two hosts that are double-counted. Question is whether we're willing to take the performance hit for correctly counting the intersection of hosts and vulns cross-OS vs. just summing host counts, but I'm guessing the answer is "probably yes because we want these numbers to be accurate." Resolving that discrepancy gets us down to a difference of one host, which I narrowed down to a stale count issue I think (the off-by-one persists when clicking through on a Linux package, 662 hosts vs. 663, so I know what's causing it). Will validate this assumption shortly, at which point all discrepancies will have been accounted for. |
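The double-counting described above can be sketched as follows. This is an illustration under assumed names, not the actual Fleet query: `os_vuln_hosts` and `software_vuln_hosts` are hypothetical tables mapping hosts to a CVE through the OS and through installed software respectively. Summing the two per-source counts counts a host twice when it is affected through both, while a `UNION` of host IDs counts it once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical mapping tables; the real schema may differ.
CREATE TABLE os_vuln_hosts (host_id INTEGER, cve TEXT);
CREATE TABLE software_vuln_hosts (host_id INTEGER, cve TEXT);
""")

cve = "CVE-2024-44187"
# Hosts 1-3 are affected through the OS, hosts 2-4 through software,
# so hosts 2 and 3 are affected through both.
conn.executemany("INSERT INTO os_vuln_hosts VALUES (?, ?)", [(h, cve) for h in (1, 2, 3)])
conn.executemany("INSERT INTO software_vuln_hosts VALUES (?, ?)", [(h, cve) for h in (2, 3, 4)])

# Summing per-source counts: hosts in both sources are counted twice.
summed = conn.execute(
    """
    SELECT (SELECT COUNT(DISTINCT host_id) FROM os_vuln_hosts WHERE cve = :cve)
         + (SELECT COUNT(DISTINCT host_id) FROM software_vuln_hosts WHERE cve = :cve)
    """,
    {"cve": cve},
).fetchone()[0]

# UNION removes duplicate host IDs before counting.
deduplicated = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT host_id FROM os_vuln_hosts WHERE cve = :cve
        UNION
        SELECT host_id FROM software_vuln_hosts WHERE cve = :cve
    )
    """,
    {"cve": cve},
).fetchone()[0]

double_counted = summed - deduplicated
```

The deduplicating query has to materialize the set of host IDs rather than add two precomputed numbers, which is the performance trade-off raised in the comment.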
Looks like the issue is with the |
My previous diagnosis was incorrect here. The host count mismatch being the same as I'd expect from The actual reason for the off-by-a-few is that we have some number of host OS and host software entries that didn't get cleaned up in the QA environment when those hosts were deleted. This was likely a transient issue, as we do have those "manual cascades" in the deletion method in the hosts part of the data store code, so the fix here is to delete the orphaned rows manually:
DELETE FROM host_software WHERE host_id NOT IN (SELECT id FROM hosts);
DELETE FROM host_operating_system WHERE host_id NOT IN (SELECT id FROM hosts);
@rfairburn I'm assigning this ticket to you to run the above queries. Assign back to me when done and I'll trigger the crons and confirm that things match up (see below for context there).
Another contributor to counts being off, at least locally, is the vulnerability host counts check using stale host OS count data. This is more obvious for host OS data because the mapping table there is also materialized by a cron ( With the following steps I got counts to match up across the board, on the vulnerability mentioned above (which includes hosts with both OS- and software-level vulns for the same CVE):
At that point I got 1305 hosts with the vulnerability:
All affected host listings (entire CVE, per-OS, per-software) matched the counts on the vulnerability detail page. I also spot-checked no-team and normal-team counts/filters for the same CVE and they match as well, so I'm confident that the above is the extent of this issue. |
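For anyone wanting to check for orphaned rows before running the DELETE statements from this comment, here is a small sketch against an in-memory SQLite database. The table names `hosts`, `host_software`, and `host_operating_system` and the `host_id` column come from the queries above. The other columns (`software_id`, `os_id`), the sample data, and the `orphan_count` helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hosts (id INTEGER PRIMARY KEY);
-- Columns other than host_id are assumed for this sketch.
CREATE TABLE host_software (host_id INTEGER, software_id INTEGER);
CREATE TABLE host_operating_system (host_id INTEGER, os_id INTEGER);
""")
conn.executemany("INSERT INTO hosts (id) VALUES (?)", [(1,), (2,)])
# Hosts 99 and 98 no longer exist, so their rows are orphans.
conn.executemany("INSERT INTO host_software VALUES (?, ?)", [(1, 10), (2, 10), (99, 10)])
conn.executemany("INSERT INTO host_operating_system VALUES (?, ?)", [(1, 5), (98, 5)])

# Fixed allow-list: table names are interpolated into SQL, so never take them from user input.
TABLES = ("host_software", "host_operating_system")


def orphan_count(conn, table):
    """Count rows in `table` whose host_id has no matching row in hosts."""
    return conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE host_id NOT IN (SELECT id FROM hosts)"
    ).fetchone()[0]


before = {t: orphan_count(conn, t) for t in TABLES}

# Same shape as the DELETE statements in the comment.
for t in TABLES:
    conn.execute(f"DELETE FROM {t} WHERE host_id NOT IN (SELECT id FROM hosts)")

after = {t: orphan_count(conn, t) for t in TABLES}
remaining_software_rows = conn.execute("SELECT COUNT(*) FROM host_software").fetchone()[0]
```

Running the count first gives a rough idea of how many rows the cleanup will remove, and rows belonging to existing hosts are left untouched.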
Just reran Pinging QAWolf to verify on their end and hopefully close (since follow-on work for showing freshness indicators is tracked separately). |
Vulnerabilities, |
Steps to reproduce:
Ran from 11AM - 11:15AM
Expected: The number of hosts on the page matches the number of affected hosts on the vulnerabilities page.
Actual: The number of hosts on the page does not match the number of affected hosts on the vulnerabilities page.
Video:
https://www.loom.com/share/30f1b0a0b2684dadaf8a46b5b054ee45