Commit c9ed91b

Enhance security guidelines for Cluster API users
Signed-off-by: nayuta-ai <[email protected]>
1 parent eb4e38c commit c9ed91b

1 file changed: +88 -0 lines changed
@@ -0,0 +1,88 @@
# Security Guidelines for Cluster API Users

This document compiles security best practices for using Cluster API. We recommend that organizations adapt these guidelines to their specific infrastructure and security requirements to ensure safe operations.

## Comprehensive auditing

To ensure comprehensive auditing, the following components require audit configuration:

- **Cluster-level Auditing**
  - Auditing on the management cluster
  - API server auditing for all workload clusters

- **Node/VM-level Auditing**
  - Audit access to kubeconfig files located on the node
  - Audit access to, and modification of, CA private keys and certificate files located on the node

- **Cloud Provider Auditing**
  - Cloud API auditing to log all actions performed using cloud credentials

After configuring these audit sources, centralize the logs using aggregation tools and implement real-time monitoring and alerting to detect suspicious activities and security incidents.
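
As one possible starting point for API server auditing, the sketch below builds a Kubernetes audit `Policy` (`audit.k8s.io/v1`) as a plain Python dict and prints it as JSON. The choice of which resources get which audit level is an assumption made for illustration, and `build_audit_policy` is a hypothetical helper name, not part of Cluster API or Kubernetes.

```python
import json


def build_audit_policy() -> dict:
    """Return a Kubernetes audit Policy (audit.k8s.io/v1) as a plain dict.

    The rule selection below is an illustrative assumption; adapt it to
    your own requirements.
    """
    return {
        "apiVersion": "audit.k8s.io/v1",
        "kind": "Policy",
        # Skip the RequestReceived stage to reduce log volume.
        "omitStages": ["RequestReceived"],
        "rules": [
            # Secrets: record who accessed them, but never the payload.
            {
                "level": "Metadata",
                "resources": [{"group": "", "resources": ["secrets"]}],
            },
            # Cluster API objects: record full request and response bodies.
            {
                "level": "RequestResponse",
                "resources": [{"group": "cluster.x-k8s.io"}],
            },
            # Everything else: metadata only.
            {"level": "Metadata"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_audit_policy(), indent=2))
```

Rules are evaluated in order and the first match wins, so the catch-all `Metadata` rule is placed last.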

## Use least privileges

To minimize security risks related to cloud provider access, create dedicated cloud credentials that have only the permissions necessary to manage the lifecycle of a cluster. Avoid using administrative or root accounts for Cluster API operations, and use separate credentials for different purposes, such as the management cluster versus workload clusters.
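
A simple way to catch overly broad credentials is to lint the policy document before it is attached. The sketch below assumes an AWS-style IAM policy in JSON form (`Statement`, `Effect`, `Action`); `find_overbroad_statements` is a hypothetical helper, and the rule it applies (flag any `Allow` statement with a wildcard action) is only one heuristic among many.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM allows a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def find_overbroad_statements(policy: dict) -> list:
    """Return Allow statements whose Action is "*" or "<service>:*"."""
    flagged = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        if any(a == "*" or a.endswith(":*") for a in actions):
            flagged.append(statement)
    return flagged


# Example policy (illustrative only).
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "ec2:*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["ec2:RunInstances"], "Resource": "*"},
    ],
}
```

Here only the first statement would be flagged, because `ec2:*` grants every EC2 action rather than the specific ones needed for the cluster lifecycle.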

## Limit access

Implement access restrictions to protect cluster infrastructure.

### Control Plane Protection

Limit who can create pods on control plane nodes through multiple methods:

- **Taints and Tolerations**: Apply `NoSchedule` taints to control plane nodes to prevent general workload scheduling. See [Kubernetes Taints and Tolerations documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/).
- **RBAC Policies**: Restrict pod creation permissions using Role-Based Access Control. See [Kubernetes RBAC documentation](https://kubernetes.io/docs/reference/access-authn-authz/rbac/).
- **Admission Controllers**: Implement admission webhooks to enforce pod placement policies. See [Dynamic Admission Control](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/).
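
Taints only help if pods cannot freely tolerate them, which is where an admission policy comes in. The sketch below shows the kind of check such a webhook might perform: it decides whether a pod's tolerations would match the control plane taint, following the matching rules described in the Kubernetes taints and tolerations documentation. The function names are hypothetical, and the taint key `node-role.kubernetes.io/control-plane` is assumed to be the one applied to your control plane nodes.

```python
# Assumed control plane taint; adjust if your nodes use a different key.
CONTROL_PLANE_TAINT = {
    "key": "node-role.kubernetes.io/control-plane",
    "effect": "NoSchedule",
}


def toleration_matches(toleration: dict, taint: dict) -> bool:
    """Return True if the toleration matches the taint."""
    operator = toleration.get("operator", "Equal")  # Equal is the default
    key = toleration.get("key", "")
    effect = toleration.get("effect", "")

    # An empty effect matches every effect.
    if effect and effect != taint["effect"]:
        return False
    # An empty key with operator Exists matches every key.
    if not key:
        return operator == "Exists"
    if key != taint["key"]:
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def pod_tolerates_control_plane_taint(pod: dict) -> bool:
    """Return True if any toleration in the pod spec matches the taint."""
    tolerations = pod.get("spec", {}).get("tolerations") or []
    return any(toleration_matches(t, CONTROL_PLANE_TAINT) for t in tolerations)
```

A webhook built on this check could reject pods for which it returns `True` unless the requester is explicitly allowed to schedule onto control plane nodes.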

### SSH Access

Disable or restrict SSH access to nodes in a cluster to prevent unauthorized modifications and access to sensitive files.

## Second pair of eyes

Implement a review process where at least two people must approve privileged actions such as creating, deleting, or updating clusters. GitOps provides an effective way to enforce this requirement through pull request workflows, where changes to cluster configurations must be reviewed and approved by another team member before being merged and applied to the infrastructure.
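
The core of the review rule is that an author's own approval does not count. A minimal sketch of that check follows; `has_independent_approval` is a hypothetical helper, and the default of one independent approver (two people involved in total) is an assumption that you may want to raise.

```python
def has_independent_approval(author: str, approvers: list, min_independent: int = 1) -> bool:
    """Return True if enough distinct people other than the author approved."""
    # A set ignores repeated approvals from the same person.
    independent = {name for name in approvers if name != author}
    return len(independent) >= min_independent
```

In a GitOps workflow this logic is usually enforced by the hosting platform's branch protection settings rather than by custom code; the function only makes the rule explicit.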

## Implement comprehensive alerting

Configure alerts in the centralized audit log system to detect security incidents and resource anomalies.

### Security Event Monitoring

- Alert when Cluster API components are modified, restarted, or experience unexpected state changes
- Monitor and alert on unauthorized changes to sensitive files on machine images
- Alert on unexpected machine restarts or shutdowns
- Monitor for deletion or modification of load balancers, such as AWS Elastic Load Balancers (ELB), that front API servers

### Resource Activity Monitoring

- Alert on all cloud resource creation, update, and deletion activities
- Identify anomalous patterns such as mass resource creation or deletion
- Monitor for resources created outside expected boundaries
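
Mass creation or deletion can be detected with a sliding time window over the audit events. The sketch below is one minimal way to do this; `BurstDetector`, its thresholds, and the assumption that events arrive in non-decreasing timestamp order are all illustrative choices, not something the guidelines prescribe.

```python
from collections import deque


class BurstDetector:
    """Flag when more than max_events occur within window_seconds.

    Assumes timestamps (in seconds) are recorded in non-decreasing order.
    """

    def __init__(self, max_events: int, window_seconds: float):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one event; return True if the window now exceeds max_events."""
        self._timestamps.append(timestamp)
        # Drop events that have fallen out of the window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_events
```

For example, feeding every resource deletion event into `BurstDetector(max_events=3, window_seconds=60)` would raise a flag on the fourth deletion within one minute.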

### Resource Limit Monitoring

- Alert when the number of clusters approaches or exceeds defined soft limits
- Monitor node creation rates and alert when approaching capacity limits
- Track usage against cloud provider quotas and organizational limits
- Alert on excessive API calls or resource creation requests
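
The "approaches or exceeds" rule can be expressed as a two-threshold check. In the sketch below, `limit_status` is a hypothetical helper and the 80% warning ratio is an assumed default, not a value taken from the guidelines.

```python
def limit_status(current: int, soft_limit: int, warn_ratio: float = 0.8) -> str:
    """Classify usage against a soft limit.

    Returns "at_or_over" once the limit is reached, "approaching" once
    usage reaches warn_ratio of the limit, and "ok" otherwise.
    """
    if soft_limit <= 0:
        raise ValueError("soft_limit must be positive")
    if current >= soft_limit:
        return "at_or_over"
    if current >= warn_ratio * soft_limit:
        return "approaching"
    return "ok"
```

The same function can be applied to cluster counts, node counts, or cloud provider quotas by passing the relevant current value and limit.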

## Cluster isolation and segregation

Implement multiple layers of isolation to prevent privilege escalation from workload clusters to the management cluster.

### Account/Subscription Separation

Separate workload clusters into different AWS accounts or Azure subscriptions, and use dedicated accounts for the management cluster and for production workloads. This approach provides a strong security boundary at the cloud provider level.

### Network Boundaries

Separate workload and management clusters at the network level. Use a dedicated VPC or VNet for each cluster type to prevent lateral movement between clusters.

### Certificate Authority Isolation

Do not chain cluster CAs to one another or to a shared root. Each cluster must have its own independent CA so that the compromise of a workload cluster CA does not provide access to the management cluster. See [Kubernetes PKI certificates and requirements](https://kubernetes.io/docs/setup/best-practices/certificates/) for best practices.
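
A basic sanity check for CA isolation is to confirm that no two clusters present the same CA certificate. The sketch below fingerprints each PEM-encoded CA certificate with SHA-256 and reports clusters that share one. It assumes one certificate per PEM string, and it only detects identical certificates: spotting CAs that chain to a common root would require parsing the X.509 issuer fields, which is out of scope here. Both function names are hypothetical.

```python
import base64
import hashlib
from collections import defaultdict


def ca_fingerprint(pem: str) -> str:
    """Return the SHA-256 hex digest of the DER bytes in a single-cert PEM."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    # Keep only the base64 body, dropping the BEGIN/END marker lines.
    body = "".join(l for l in lines if l and not l.startswith("-----"))
    return hashlib.sha256(base64.b64decode(body)).hexdigest()


def clusters_sharing_ca(ca_by_cluster: dict) -> list:
    """Return groups of cluster names whose CA certificates are identical."""
    groups = defaultdict(list)
    for cluster, pem in ca_by_cluster.items():
        groups[ca_fingerprint(pem)].append(cluster)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

An empty result means every cluster in the input has a distinct CA certificate.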

## Prevent runtime updates

Implement controls to prevent tampering with machine images at runtime. Disable or restrict updates to machine images at runtime, and prevent unauthorized modifications by restricting SSH access. Following [immutable infrastructure](https://glossary.cncf.io/immutable-infrastructure/) practices ensures that any change requires deploying a new image rather than modifying a running system.