Skip to content

feat: add IaC security scanning with Checkov#6

Open
moghit-eou wants to merge 5 commits intoMedical-Informatics-Platform:mainfrom
moghit-eou:feat/iac-scanning-checkov
Open

feat: add IaC security scanning with Checkov#6
moghit-eou wants to merge 5 commits intoMedical-Informatics-Platform:mainfrom
moghit-eou:feat/iac-scanning-checkov

Conversation

@moghit-eou
Copy link
Copy Markdown
Contributor

@moghit-eou moghit-eou commented Mar 20, 2026

What this does

Adds a Checkov workflow that scans all Kubernetes, Helm, and Kustomize
files on every PR for security misconfigurations.

Known limitation ( security gap ): skip_check

CKV_K8S_8, CKV_K8S_9, CKV_K8S_35 are globally skipped to avoid false positives on ArgoCD
which are not Kubernetes workloads and have no containers. You can find all the Checkov's rules in this link

I'm still working through the codebase to identify exactly which ArgoCD files need to be modified and I will address this in a follow up PR.

soft_fail: true

mip-infra has never been scanned before that why the first run will surface existing
findings in current files. soft_fail=true makes issues visible without
blocking all open PRs immediately.

  • To be removed once existing findings are resolved.
image

@github-advanced-security
Copy link
Copy Markdown

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

  • The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
  • Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
  • You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

@jdaln
Copy link
Copy Markdown
Contributor

jdaln commented Mar 20, 2026

Nice PR, thank you again @moghit-eou .
If that's ok with you, I will process it next week when I get some time in the middle of Kubecon. I want to look into the rules that were excluded for example. Looks totally good from what I see but I want to do a real review.

@jdaln
Copy link
Copy Markdown
Contributor

jdaln commented Mar 30, 2026

Hi @moghit-eou,
Sorry for taking so much time. The conference was busier than expected.
Most of the failures right now are on a part that is deemed to be removed in the next days, if everything goes well, when we do a version upgrade.
submariner-io/submariner-charts#695

Now for the following, what was the logic behind excluding them fully? Is there an option for exceptions instead maybe?

  • CKV_K8S_35: Prefer using secrets as files over secrets as environment variables
  • CKV_K8S_8: Liveness probes should be configured
  • CKV_K8S_9: Readiness Probe Should be Configured

Finally, Checkov now only scans this repo and not the dependencies it pulls. While this is a simple approach that can be remediated by adding checkov to other MIP repositories, what happens for other dependencies? I have to say that I don't yet have a solution for this but this is something we will need to think about and that is not blocking for this PR>

@moghit-eou
Copy link
Copy Markdown
Contributor Author

Hi @jdaln,

Hope the conference went well.

On the Submariner failures: thanks for the context, good to know those should be cleaned up with the version upgrade.

On the global skips: My initial approach was too broad. I used those three skips as a temporary way to unblock the first scan, but I intended to narrow them down properly.

After a quick reviewing the repo and the relevant Checkov references, here is what I found:

CKV_K8S_8 / CKV_K8S_9 are workload checks for liveness and readiness probes. In this repo, they are not meaningful on ArgoCD control resources such as ApplicationSet and AppProject, since those objects do not define runtime containers. There is also a confirmed Checkov false-positive issue for these checks on init containers, which cannot have probes by Kubernetes design (CKV_K8S_8 & CKV_K8S_9 false positives#4807).

CKV_K8S_35 is also a workload-level check: it evaluates how running containers consume secrets. For the same reason, it is not meaningful on ArgoCD control resources that do not define runtime containers or environment-variable based secret injection. The full scope of what this check targets is documented in the Checkov Kubernetes policy index.

On exceptions instead of global skips: Checkov supports inline suppression at the individual resource level (suppression docs), so my proposed fix is to remove the global skip_check from the workflow and add targeted inline comments only on the ArgoCD resource files where these workload checks do not apply.
( This is just an example , not sure yet exactly how it will be implemented )

# checkov:skip=CKV_K8S_8: ArgoCD control resource - no runtime containers defined
# checkov:skip=CKV_K8S_9: ArgoCD control resource - no runtime containers defined
# checkov:skip=CKV_K8S_35: ArgoCD control resource - no runtime containers defined
apiVersion: argoproj.io/v1alpha1
kind: Application
...

This way, real workload manifests such as common/nginx-ingress/manifests/nginx-public-deployment.yaml remain fully checked without broad exceptions.

On dependency scanning: you are right that Checkov currently only scans this repository, not the external Helm charts it references. I agree that is a separate gap worth tracking, and adding checks upstream in the MIP-related repositories seems like the cleanest direction.

Does that approach make sense?

Sorry, I can't provide more detailed approach I have an university exam this week so I have not had the chance to fully work through the implementation yet but I will take a closer look once that is done and update the PR in the weekend.

@jdaln
Copy link
Copy Markdown
Contributor

jdaln commented Mar 31, 2026

@moghit-eou it did, thank you! And I have some takeaways in terms of runtime security that will be interesting to look at I believe :) .

This approach is excellent. Let's wait for Google to communicate the results for the GSoC at the end of the month before moving further for now I would say. Good luck for your exams!

@moghit-eou
Copy link
Copy Markdown
Contributor Author

Hi @jdaln,

I went back and tested everything properly locally using Checkov.
My previous reasoning was not solid enough. I was working from initial references that I now think may not have fully applied to this specific setu p(some of those docs/articles are outdated), and I drew conclusions. I went back, ran the scans properly, and here is what I actually found.


What I got wrong

  • CKV_K8S_35 was never actually a problem. Every single file passes it.
  • CKV_K8S_8/9 are NOT firing on ArgoCD Application/AppProject resources and Checkov ignores those entirely .
Example: Checkov skipping ArgoCD resources
checkov --file base/argo-project.yaml

output :

[ kubernetes framework ]: 100%|████████████████████|[1/1], Current File Scanned=base/argo-projects.yaml
[ ansible framework ]: 100%|████████████████████|[1/1], Current File Scanned=base/argo-projects.yaml
[ secrets framework ]: 100%|████████████████████|[1/1], Current File Scanned=base/argo-projects.yaml

       _               _
   ___| |__   ___  ___| | _______   __
  / __| '_ \ / _ \/ __| |/ / _ \ \ / /
 | (__| | | |  __/ (__|   < (_) \ V /
  \___|_| |_|\___|\___|_|\_\___/ \_/

By Prisma Cloud | version: 3.2.513


mip-infra on  main [?]

This means my earlier claim that "CKV_K8S_8/9 are not meaningful on ArgoCD control resources" was only partially right, those checks do not fire on pure ArgoCD (Application, AppProject, ApplicationSet), but they do fire on the actual Deployment/StatefulSet manifests that ArgoCD installs to run itself, which live under argo-setup/patches/.

What I ran locally

1. Full repository scan

checkov -d . > all_checkov..txt 
kubernetes scan results:
  Passed checks: 816, Failed checks: 154, Skipped checks: 0

secrets scan results:
  Passed checks: 0, Failed checks: 1, Skipped checks: 0

github_actions scan results:
  Passed checks: 49, Failed checks: 3, Skipped checks: 0

2. argo-setup/ folder only

checkov -d argo-setup > argo-setup_checkov.txt
kubernetes scan results:
  Passed checks: 385, Failed checks: 83, Skipped checks: 0

clear that more than half of the total failing checks (83 out of 154) in the whole repo come from a single folder: argo-setup/patches/.

3.argo-setup/ After rendering through Kustomize

kustomize build argo-setup/patches > argo-setup-rendered.yaml
checkov -f argo-setup-rendered.yaml > kustomize_patches_checkov.txt
kubernetes scan results:
  Passed checks: 796, Failed checks: 69, Skipped checks: 0

When running kustomize build argo-setup/patches, the following warning
appeared:

Warning: 'patchesStrategicMerge' is deprecated. Please use 'patches'
instead. Run 'kustomize edit fix' to update your Kustomization automatically.

The argo-setup folder accounts for 83 of the 154 Kubernetes failures

When I looked at argo-setup_checkov.txt closely, it became clear that more
than half of the total failing checks in the whole repo come from a single
folder
: argo-setup/patches/. Five patch files in particular each
contribute 16 failures on their own:

File Failures
patch-argocd-application-controller-resources.yaml 16
patch-argocd-dex-server-resources.yaml 16
patch-argocd-redis-ha-statefulset.yaml 16
patch-argocd-repo-server-resources.yaml 16
patch-argocd-server-resources.yaml 16
patch-argocd-cmd-params-cm.yaml 1
patch-argocd-cm.yaml 1
patch-argocd-application-controller-clusterrole.yaml 1
Total number 83

Some of the top failing rules relate to probes, image configuration, container security hardening, and RBAC permissions.

These files are Kustomize StrategicMergePatches, listed under
patchesStrategicMerge in argo-setup/patches/kustomization.yaml.

When Checkov scans these standalones files ,I think it may be treating each
file as a complete, standalone Deployment definition and flagging
everything that is "missing" (no probes, no security context, no image
tag, no namespace, etc.). That is likely why these 5 files each produce 16
failures.


The changed after kustomize build on argo-setup folder.

Overall: failures dropped from 83 (argo-setup standalone) → 69
(rendered output).

This could suggest that the probes for these components are defined in
the upstream ArgoCD HA manifest and only become visible once the patches
are merged into it, which would make the original failures on the raw
patch fragments false positives. Though I want to be careful here; I have
not confirmed this by reading the upstream manifest directly, so I may be
wrong about the exact reason.

For CKV_K8S_8 and CKV_K8S_9 specifically:

Before rendering, all five resources from the patch files failed these two
checks. After rendering, three of them now pass:
Deployment.argocd-mip-team.argocd-repo-server,
Deployment.argocd-mip-team.argocd-server, and
Deployment.argocd-mip-team.argocd-redis-ha-haproxy.

For CKV_K8S_35 (prefer secrets as files over environment variables):

this rule passes on every file in the overall scan (
all_checkov.txt — 0 failures, 9 passes total). But after kustomize build and
scanning the rendered output, it fails on 5 upstream ArgoCD resources.


Findings outside argo-setup (from all_checkov.txt)

The remaining 71 failures in the full repository scan come from:


File Failures Notes
common/nginx-ingress/test-ingress.yaml 23 Test deployment, no probes, no resource limits, default namespace and likely genuine findings on a non-production file
deployments/.../exareme2-deployment-patch.yaml 16 Another Kustomize patch file , similar situation to argo-setup patches, may be false positives
common/submariner/broker/copy-secret-hook.yaml 16 A Job resource missing resource limits, security context, etc. could be genuine
common/nginx-ingress/manifests/nginx-public-deployment.yaml 12 Real workload , some findings may be genuinely worth addressing
common/submariner/operator/kustomization.yaml 1 CKV_SECRET_6 , Base64 high entropy string (IPSec PSK)
.github/workflows/lint-yaml.yml 2 CKV2_GHA_1 (missing top-level permissions) and CKV_GHA_2 (shell injection risk on git push)
.github/workflows/check-main-revisions.yml 1 CKV2_GHA_1 (missing top-level permissions)

I have not yet done a full analysis of each of these. Before touching
them, I want to understand which are genuine findings worth addressing and
which may be similar false positives caused by scanning partial Kustomize
documents.


Proposed changes to the workflow

  • Given all of the above,I currently think makes sense
    is Replacing the global skip_check with skip_path: argo-setup/patches, seems like a reasonable next step ,
    though I am not fully certain this is the right approach yet. The idea
    is to scope the exclusion to where the false positives are most likely
    occurring

  • This scopes the exclusion to where the false positives are actually
    occurring the patch fragment files rather than suppressing those rules
    globally across the entire repo.

  • Before touching findings, I'll try to set up
    act to test workflow changes locally
    end-to-end.

Attached scan outputs

I uploaded the three raw scan outputs attached below / linked here for
reference. They contain the full check-by-check results that informed
everything above:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants