forked from cilium/cilium
-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.13: Upgrade to v1.13.14 #552
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Based on https://github.com/cilium/cilium/actions/runs/7893881735. I had to copy these digests manually because image-digest-output.txt-v1.13.12 was empty. ## Docker Manifests ### cilium `docker.io/cilium/cilium:v1.13.12@sha256:d99204aa7b3b7bd2c9ab47fd398cc9f40290799bc0c7a4386c8dc5c1780cd3d3` `quay.io/cilium/cilium:v1.13.12@sha256:d99204aa7b3b7bd2c9ab47fd398cc9f40290799bc0c7a4386c8dc5c1780cd3d3` ### clustermesh-apiserver `docker.io/cilium/clustermesh-apiserver:v1.13.12@sha256:f1b1d0a85bab65e7d6adc90d000513a56ac58bdb071aa391a8580d73d20e6b6a` `quay.io/cilium/clustermesh-apiserver:v1.13.12@sha256:f1b1d0a85bab65e7d6adc90d000513a56ac58bdb071aa391a8580d73d20e6b6a` ### docker-plugin `docker.io/cilium/docker-plugin:v1.13.12@sha256:142822829c4bccd315e6b21262b9de9084c4913420a28bedb1209fa1f3e6bdb9` `quay.io/cilium/docker-plugin:v1.13.12@sha256:142822829c4bccd315e6b21262b9de9084c4913420a28bedb1209fa1f3e6bdb9` ### hubble-relay `docker.io/cilium/hubble-relay:v1.13.12@sha256:01b23ea40bcd81145dde6bfcbfc4d542749d08c2a1c6348954c85123a8d2b1fc` `quay.io/cilium/hubble-relay:v1.13.12@sha256:01b23ea40bcd81145dde6bfcbfc4d542749d08c2a1c6348954c85123a8d2b1fc` ### operator-alibabacloud `docker.io/cilium/operator-alibabacloud:v1.13.12@sha256:56fbfd0fb9ba239191cff2daf60b09e7f6b296622d0bdd5247ee535b9fe526c5` `quay.io/cilium/operator-alibabacloud:v1.13.12@sha256:56fbfd0fb9ba239191cff2daf60b09e7f6b296622d0bdd5247ee535b9fe526c5` ### operator-aws `docker.io/cilium/operator-aws:v1.13.12@sha256:d6c9c830ac558624568af0de46562f059f1085ece47323da108d7b13a686ca4f` `quay.io/cilium/operator-aws:v1.13.12@sha256:d6c9c830ac558624568af0de46562f059f1085ece47323da108d7b13a686ca4f` ### operator-azure `docker.io/cilium/operator-azure:v1.13.12@sha256:67a574ee88fb720cab3531722d95e90a0f98a17ca5c134a56d82840ad29fbf31` `quay.io/cilium/operator-azure:v1.13.12@sha256:67a574ee88fb720cab3531722d95e90a0f98a17ca5c134a56d82840ad29fbf31` ### operator-generic `docker.io/cilium/operator-generic:v1.13.12@sha256:f83734bbe270f961d545c7929152785507ce04a05d818ebc9776941723736d02` `quay.io/cilium/operator-generic:v1.13.12@sha256:f83734bbe270f961d545c7929152785507ce04a05d818ebc9776941723736d02` ### operator `docker.io/cilium/operator:v1.13.12@sha256:cfc2c4f2adebd878a5a5d2964139fd3afbd56fba219a68000e71162c2e946cc8` `quay.io/cilium/operator:v1.13.12@sha256:cfc2c4f2adebd878a5a5d2964139fd3afbd56fba219a68000e71162c2e946cc8` Signed-off-by: Michi Mutsuzaki <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
…9569c2 Signed-off-by: renovate[bot] <[email protected]>
Signed-off-by: Cilium Imagebot <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
We currently don't update bpf-next kernels, since we don't want to cause additional regressions. This is a problem since lvh updates may break existing images. Instead of relying on bpf-next, use the current LTS kernel in workflows. Signed-off-by: Lorenz Bauer <[email protected]>
LVH images haven't been updated since April, probably because of missing renovate annotations. Update them manually and add the annotations. Signed-off-by: Lorenz Bauer <[email protected]>
6.0 is EOL and therefore not built anymore. Switch to the next newer LTS which is 6.1. Signed-off-by: Lorenz Bauer <[email protected]>
[ upstream commit c19a84e ] This error can happen if a state is being destroyed while packets are in flight. It should be rare as the window in the kernel where it can happen is very short. Signed-off-by: Paul Chaignon <[email protected]> Signed-off-by: Tam Mach <[email protected]>
[ upstream commit 3c479d4 ] The test output are riddled with logs such as: Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init) This gets particularly noisy when waiting for the key rotation to complete, during which time we run kubectl exec repeatedly. This commit fixes it. Signed-off-by: Paul Chaignon <[email protected]> Signed-off-by: Tam Mach <[email protected]>
[ upstream commit dc6cf34 ] While fixing one of the review comments in PR that introduced this test, I changed datapath mode to be explicitly set from matrix.mode. Unfortunately, setting `native` makes it actually use `tunneling` mode. Switching to `gke` mode resolves this issue. Fixes cilium#30247 Signed-off-by: Marcel Zieba <[email protected]> Signed-off-by: Tobias Klauser <[email protected]>
[ upstream commit d7f5e58 ] In the AKS release cycle, a gap exists between the introduction of new supported Kubernetes versions and the removal of older versions, leading to failures in scheduled tests. This PR introduces the capability to disable older Kubernetes versions, mitigating test failures. Signed-off-by: Birol Bilgin <[email protected]> Signed-off-by: Tobias Klauser <[email protected]>
[ upstream commit 14d68f2 ] This commit revises the Kubernetes versions tested for compatibility across all supported cloud providers. Additionally, it adjusts the default Kubernetes version to match the default version provided by each cloud provider Signed-off-by: Birol Bilgin <[email protected]> Signed-off-by: Tobias Klauser <[email protected]>
[ upstream commit bb81c06 ] The current process delegates the review of ariane-config.yaml changes to the contributing group. With this commit reviewing responsibilities be transferred to the github-sec and ci-structure groups. Signed-off-by: Birol Bilgin <[email protected]> Signed-off-by: Tobias Klauser <[email protected]>
[ upstream commit b19321e ] This commit updates the Ariane configuration to include the GitHub organization team 'organization-members' in the list of allowed teams. Consequently, only members of this specific team will have the authorization to initiate test runs via issue comments. Signed-off-by: Birol Bilgin <[email protected]> Signed-off-by: Tobias Klauser <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
Signed-off-by: Cilium Imagebot <[email protected]>
…21.111541 Signed-off-by: renovate[bot] <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
Signed-off-by: Cilium Imagebot <[email protected]>
[ upstream commit 6fee46f ] [ backporter's note: - test-e2e-upgrade doesn't exit on this branch. Remove it. - Minor conflict in tests-clustermesh-upgrade.yaml ++<<<<<<< HEAD + if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then + SHA="${{ inputs.SHA }}" + else + SHA="${{ github.sha }}" + fi ++======= + CILIUM_DOWNGRADE_VERSION=$(contrib/scripts/print-downgrade-version.sh stable) + echo "downgrade_version=${CILIUM_DOWNGRADE_VERSION}" >> $GITHUB_OUTPUT ++>>>>>>> 20a5826c31 (ci/ipsec: Fix downgrade version retrieval) ] Figuring out the right "previous patch release version number" to downgrade to in print-downgrade-version.sh turns out to be more complex than expected [0][1][2][3]. This commit is an attempt to 1) fix issues with the current script and 2) overall make the script clearer, so we can avoid repeating these mistakes. As for the fixes, there are two things that are not correct with the current version. First, we're trying to validate the existence of the tag to downgrade to, in case the script runs on top of a release preparation commit for which file VERSION has been updated to a value that does not yet contains a corresponding tag. This part of the script is actually OK, but not the way we call it in the IPsec workflow: we use "fetch-tags: true" but "fetch-depth: 0" (the default), and the two are not compatible, a shallow clone results in no tags being fetched. To address this, we retrieve the tag differently: instead of relying on "fetch-tags" from the workflow, we call "git fetch" from the script itself, provided the preconditions are met (we only run it from a Git repository, if the "origin" remote is defined). If the tag exists, either locally or remotely, then we can use it. Otherwise, the script considers that it runs from a release preparation Pull Request, and decrements the patch release number. The second issue is that we would return no value from the script if the patch release is zero. This is to avoid any attempt to find a previous patch release when working on a development branch. However, this logics is incorrect (it comes from a previous version of the script where we would always decrement the patch number). After the first release of a new minor version, it's fine to have a patch number at 0. What we should check instead is whether the version ends with "-dev". This commit brings additional changes for clarity: more comments, and a better separation between the "get latest patch release" and "get previous stable branch" cases, moving the relevant code to independent functions, plus better argument handling. We also edit the IPsec workflow to add some logs about the version retrieved. The logs should also display the script's error messages, if any, that are printed to stderr. Sample output from the script: VERSION Tag exists Prevous minor Previous patch release 1.14.3 Y v1.13 v1.14.3 1.14.1 Y v1.13 v1.14.1 1.14.0 Y v1.13 v1.14.0 1.14.1-dev N v1.13 <error> 1.15.0-dev N v1.14 <error> 1.13.90 N v1.12 v1.13.89 <- decremented 2.0.0 N <error> <error> 2.0.1 N <error> v2.0.0 <- decremented 2.1.1 N v2.0 v2.1.0 <- decremented [0] 56dfec2 ("contrib/scripts: Support patch releases in print-downgrade-version.sh") [1] 4d7902f ("contrib/scripts: Remove special handling for patch release number 90") [2] 5581963 ("ci/ipsec: Fix version retrieval for downgrades to closest patch release") [3] 3803f53 ("ci/ipsec: Fix downgrade version for release preparation commits") Fixes: 3803f53 ("ci/ipsec: Fix downgrade version for release preparation commits") Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Yutaro Hayakawa <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
Signed-off-by: Cilium Imagebot <[email protected]>
[ upstream commit 99786be ] [ backporter's notes: needed to resolve complexity issues in subsequent patches ] We first look up the destination endpoint to check for tunnel redirection, and then look it up a second time to access its sec_label and IPSec key. Make the first look-up unconditional, so that we can remove the second. Signed-off-by: Julian Wiedmann <[email protected]>
[ upstream commit ac63856 ] This is an alternative approach to fix cilium#21954, so that we can re-introduce the 2005 from-proxy routing rule in following patches to fix L7 proxy issues. This commit simply allows packets to WORLD as long as they are from ingress proxy. This was one of the solution suggested by Martynas, as recorded in commit message cilium/cilium@c534bb7: One fix was to extend the troublesome check https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626 by allowing proxy replies to `WORLD_ID`. To tell if an skb is originated from ingress proxy, the commit extends the semantic of existing flags `TC_INDEX_F_SKIP_{INGRESS,EGRESS}_PROXY`, renames flags to clarify the changed meaning. Fixes: cilium#21954 (Reply from pod to outside is dropped when L7 ingress policy is used) Signed-off-by: Zhichuan Liang <[email protected]> Signed-off-by: Julian Wiedmann <[email protected]>
[ upstream commit 217ae4f ] [ backporter's notes: this is a custom backport to init.sh ] This commit re-introduced the 2005 routes that were removed by cilium@9dd6cfc (datapath: remove 2005 route table for ipv6 only) and cilium@c1a0dba (datapath: remove 2005 route table for ipv4 only). Signed-off-by: Robin Gögge <[email protected]> Signed-off-by: Zhichuan Liang <[email protected]> Signed-off-by: Julian Wiedmann <[email protected]>
[ upstream commit e8bed8d ] [ backporter notes: conflicts in unit tests. Had to change introduce the `NoError` check instead of expecting a specific error ] This commit is to add the same namespace while listing generated LB service, the main reason is to avoid wrong status update for gateways having the same name, but belonged to different namespace. Testing was done locally as per below: Before the fix: ``` $ kg gateway -A NAMESPACE NAME CLASS ADDRESS PROGRAMMED AGE another my-gateway cilium 10.110.222.237 True 4s default my-gateway cilium 10.110.222.237 True 56s ``` After the fix: ``` $ kg gateway -A NAMESPACE NAME CLASS ADDRESS PROGRAMMED AGE another my-gateway cilium 10.102.170.180 True 14m default my-gateway cilium 10.110.222.237 True 14m $ kg services -A NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE another cilium-gateway-my-gateway LoadBalancer 10.102.170.180 10.102.170.180 80:31424/TCP 15m default cilium-gateway-my-gateway LoadBalancer 10.110.222.237 10.110.222.237 80:31889/TCP 15m ... ``` Fixes: cilium#31270 Signed-off-by: Tam Mach <[email protected]> Signed-off-by: Sebastian Wicki <[email protected]>
[ upstream commit 0fb203e ] So that the failure of one matrix entry (e.g., caused by a flake) doesn't cancel the other ongoing tests, if any. Signed-off-by: Marco Iorio <[email protected]> Signed-off-by: Sebastian Wicki <[email protected]>
[ upstream commit b639eab ] In general, it is not recommended to carry several admin. operations on the cluster at the same time, as it can make troubleshooting in case of issues a lot more complicated. Mixing operations is also less likely to be covered in CI so more likely to hit corner cases. Performing IPsec key rotations during Cilium up/downgrades is one such case. Let's document it explicitly to discourage users from doing that. Signed-off-by: Paul Chaignon <[email protected]> Signed-off-by: Sebastian Wicki <[email protected]>
[ upstream commit 9939fa2 ] [ backporter notes: No SRV6 support on v1.13, I removed those test cases ] Before this patch, Hubble would wrongly report known traffic direction and reply status when IPSec was enabled. Signed-off-by: Alexandre Perrin <[email protected]> Signed-off-by: Sebastian Wicki <[email protected]>
Signed-off-by: renovate[bot] <[email protected]>
[ upstream commit fbe78c4 ] The default service CIDR of AKS clusters is 10.0.0.0/16 [1]. Unfortunately, we don't set a pod cidr for clusterpool IPAM, and hence use cilium's default of 10.0.0.0/8, which overlaps. This can lead to "fun" situations in which e.g. the kube-dns service ClusterIP is the same as the hubble-relay pod IP, or similar shenanigans. This usually breaks the cluster utterly. The fix is relatively straight-forward: set a pod CIDR for cilium which does not overlap with defaults of AKS. We chose 192.168.0.0/16 as this is what is recommended in [2]. [1]: https://learn.microsoft.com/en-us/azure/aks/configure-kubenet#create-an-aks-cluster-with-system-assigned-managed-identities [2]: https://learn.microsoft.com/en-us/azure/aks/azure-cni-powered-by-cilium#option-1-assign-ip-addresses-from-an-overlay-network Fixes: fbf3d38 (ci: add AKS workflow) Co-authored-by: Fabian Fischer <[email protected]> Signed-off-by: David Bimmler <[email protected]>
… port [ upstream commit d3b19d6 ] Currently, listing the load-balancing configuration doesn't display the L7LB Proxy Port for services of type `l7-load-balancer`. ``` cilium-dbg bpf lb list SERVICE ADDRESS BACKEND ADDRESS (REVNAT_ID) (SLOT) ... 10.96.193.7:443 0.0.0.0:0 (30) (0) [ClusterIP, non-routable, l7-load-balancer] ``` The only way of retrieving the L7LB proxy port is to list the frontends (`cilium-dbg bpf lb list --frontends`) and manually convert the backend id (union type) to the L7LB proxy port. Therefore, this commit addsd the L7LB proxy port to the output of `cilium-dbg bpf lb list` if the service is of type L7 LoadBalancer. The `--frontends` subcommand still displays the unmapped backend id. ``` cilium-dbg bpf lb list SERVICE ADDRESS BACKEND ADDRESS (REVNAT_ID) (SLOT) 10.96.0.1:443 172.18.0.3:6443 (1) (1) 0.0.0.0:0 (1) (0) [ClusterIP, non-routable] 10.96.252.10:443 172.18.0.2:4244 (22) (1) 0.0.0.0:0 (22) (0) [ClusterIP, InternalLocal, non-routable] 10.96.155.44:80 0.0.0.0:0 (14) (0) [ClusterIP, non-routable] 10.244.1.211:80 (14) (1) 172.18.0.2:32646 0.0.0.0:0 (33) (0) [NodePort, l7-load-balancer] (L7LB Proxy Port: 15735) 10.96.193.7:443 0.0.0.0:0 (30) (0) [ClusterIP, non-routable, l7-load-balancer] (L7LB Proxy Port: 15735) 10.96.122.45:80 10.244.1.250:80 (26) (1) 0.0.0.0:0 (26) (0) [ClusterIP, non-routable] 10.96.102.137:80 0.0.0.0:0 (23) (0) [ClusterIP, non-routable] 10.244.1.126:4245 (23) (1) 10.96.108.180:443 0.0.0.0:0 (17) (0) [ClusterIP, non-routable, l7-load-balancer] (L7LB Proxy Port: 17731) 172.18.255.1:80 0.0.0.0:0 (25) (0) [LoadBalancer, l7-load-balancer] (L7LB Proxy Port: 17731) 0.0.0.0:32646 0.0.0.0:0 (34) (0) [NodePort, non-routable, l7-load-balancer] (L7LB Proxy Port: 15735) 0.0.0.0:31012 0.0.0.0:0 (21) (0) [NodePort, non-routable, l7-load-balancer] (L7LB Proxy Port: 17731) ``` Signed-off-by: Marco Hofstetter <[email protected]> Signed-off-by: Tam Mach <[email protected]>
[ backporter's notes: Minor conflict due to structural changes in the GitHub Action (switched how we execute into LVH). ] The IPsec fixes will introduce a few XfrmInNoStates packet drops on up/downgrades due to non-atomic Linux APIs (can't replace XFRM states atomically). Those are limited to a very short time (time between two netlink syscalls). We however need to allowlist them in the CI. Since we're using the conn-disrupt GitHub action from main, we need to allowlist in main for the pull request's CI to pass. Note that despite the expected-xfrm-errors flag, the tests will still fail if we get 10 or more such drops. We don't expect so many XfrmInNoStates drops so we still want to fail in that case. Signed-off-by: Paul Chaignon <[email protected]>
Signed-off-by: Paul Chaignon <[email protected]>
[ backporter's notes: Small conflicts due to the new error handling in main. ] This small bit of refactoring will ease a subsequent commit. Signed-off-by: Paul Chaignon <[email protected]>
[ backporter's comments: Trivial conflict in conformance-ipsec-e2e.yaml for the number of expected keys. ] We need to ensure the (key used, sequence number) tuple for each encrypted packet is always unique on the network. Today that's not the case because the key is the same for all nodes and the sequence number starts at 0 on node reboot. To enable this, we will derive one key per node pair from a global key shared across all nodes. We need it per node pair and not per node because the first message emitted from A to B shouldn't be using the same key as the first message emitted from B to A, to satisfy the above requirement. To that end, for each node pair (A, B), we compute a key as follows: key = sha256(global_key + ip_of_a + ip_of_b) The sha256 sum is then truncated to the expected length. Once computed, we install the derived keys such that the key used for encryption on node A is the same as the key used for decryption on node B: Node A <> Node B XFRM IN: key(b+a) XFRM IN: key(a+b) XFRM OUT: key(a+b) XFRM OUT: key(b+a) Signed-off-by: Paul Chaignon <[email protected]>
In the previous commit, we changed the way we compute the IPsec keys. We therefore need to replace the XFRM states to use the new keys. Our current update logic however doesn't take this case into account. It compares states based on IPs, marks, and SPIs, but it doesn't compare the keys. It would therefore assume that the correct states are already installed. This commit extends that logic to detect a difference in encryption keys and, if such a difference exist, remove the old states. Signed-off-by: Paul Chaignon <[email protected]>
[ backporter's comments: Many conflicts. Some trivial due to new files with logs in sources_name_to_ids.h. Some minor in ipsec_linux.go due to the new error handling in main. Remove unnecessary include of lib/ipcache.h in BPF unit tests; that file doesn't exist before v1.15. ] It turns out that two XFRM IN states can't have the same mark and destination IP, even if they have different source IPs. That's an issue in our case because each node1->node2 pair will have its own IPsec key. Therefore, we need one XFRM state per origin node on input. Since we can't differentiate those XFRM states by their source IPs, we will have to differentiate using the marks. To do so, we need to convert the source IP into a packet mark before matching against XFRM states. We can write these packet marks in bpf_network, before going up the stack for decryption. And conveniently, we've just introduce a way to convert each cluster node into an ID, the node ID, which fits in the packet mark. This commit therefore performs an node ID map lookup to retrieve the node ID using the outer source IP address when packets are first processed in bpf_network. We clear the node ID from the packet mark after decryption using XFRM (output-mark). If no node ID is found for the outer source IP, we drop the packet. It seems preferable to drop it from BPF with all the contextual information rather than let it proceed to the XFRM layer where it will be dropped with only an error code incrementing. Signed-off-by: Paul Chaignon <[email protected]>
[ backporter's comments: Many conflicts due to the new error handling in main. ] We want to have one IPsec key per node1->node2 (not including node2->node1 which will get a different key). We therefore need per-node XFRM states on the receive/decrypt side to carry each node's key. This commit implements that change. Thus, instead of creating a unique XFRM IN state when we receive the local node, we will create an XFRM IN state everytime we receive a remote node. Signed-off-by: Paul Chaignon <[email protected]>
[ backporter's comments: Minor conflict due to new error handling in main. ] This commit extends the logic from commit c0d9b8c ("ipsec: Allow old and new XFRM OUT states to coexist for upgrade") to have both the old and the new XFRM IN states in place. This is necessary to avoid packet drops during the upgrade. As with the XFRM OUT states, we can't add the new IN state while the old one is in place. We therefore need to first remove the old state, to then add the new one. See c0d9b8c ("ipsec: Allow old and new XFRM OUT states to coexist for upgrade") for details. Note this commit also removes the comparison of output-marks. Output-marks aren't actually used by the kernel to decide if two states conflict. And in the case of XFRM IN states, the output-marks changed a bit as well. Despite being different, the states still conflict. Signed-off-by: Paul Chaignon <[email protected]>
[ backporter's comments: Non-trivial conflicts due to refactoring around local node initialization logic. ] Read and export the local bootid via CiliumNode. We'll need it in a subsequent commit to generate new IPsec keys when a node reboots. This commit also collects the boot_id file as part of the bugtool report. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Paul Chaignon <[email protected]> Co-authored-by: Paul Chaignon <[email protected]>
[ backporter's notes: Minor conflicts due to new error handling in main. ] We need to ensure we never have two packets encrypted with the same key and sequence number. To that end, in previous commits, we introduced per-node-pair keys. That is however not sufficient. Since the sequence numbers can start from zero on node boot, if a node reboot, it will start sending traffic encrypted again with the same key and sequence number as it did before. To fix that, we need to ensure the per-node-pair keys change on node reboots. We achieve that by using the boot ID in the per-node-pair key calculation. For a pair of nodes A and B with IP addresses a and b and boot IDs x and y, we will therefore install two different keys: Node A <> Node B XFRM IN: key(b+a+y+x) XFRM IN: key(a+b+x+y) XFRM OUT: key(a+b+x+y) XFRM OUT: key(b+a+y+x) This is done such that, for each pair of nodes A, B, the key used for decryption on A (XFRM IN) is the same key used for encryption on B (XFRM OUT), and vice versa. Since we are now retrieving the local node's boot ID as part of the IPsec code, we need to initialize the mocked local node store in the unit tests. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Paul Chaignon <[email protected]> Co-authored-by: Paul Chaignon <[email protected]>
[ backporter's notes: Many conflicts due to the new error handling in main. ] When we detect that a node's bootid has changed, we need to update the IPsec states. Unfortunately this is not as straightforward as it should be, because we may receive the new boot ID before a CiliumInternalIP is assign to the node. In such a case, we can't install the XFRM states yet because we don't have the CiliumInternalIP, but we need to remember that the boot ID changed and states should be replaced. We therefore record that information in a map, ipsecUpdateNeeded, which is later read to see if the boot ID changed. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Paul Chaignon <[email protected]> Co-authored-by: Paul Chaignon <[email protected]>
[ backporter's notes: Minor conflict due to a new error being listed in text that is now removed. ] When a node reboots the key used to communicate with it is expected to change due to the new boot id generated. While the new key is being installed we may need to do it non-atomically (delete + insert), so packets to/from that node might be dropped which would cause increases in the XfrmNoStatesIn/Out. Add a note about it in the docs so users are not surprised. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Paul Chaignon <[email protected]> Co-authored-by: Paul Chaignon <[email protected]>
Now we can enable ESN anti-replay with window size of 1024. If a node reboots then everyone updates the related keys with the new one due to the different bootid, the node itself is already generating the keys with the new bootid. The window is used to allow for out-of-order packets, anti-replay still doesn't allow to replay any packet but keeps a bitmap and can accept out-of-order packets within window size range. For more information check section ""A2. Anti-Replay Window" of RFC 4303. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Paul Chaignon <[email protected]> Co-authored-by: Paul Chaignon <[email protected]>
[ backporter's notes: Minor conflicts in conformance-ipsec-e2e.yaml due to the extension of CI to cover non-AEAD keys. Minor conflict in the documentation for how to retrieve the SPI from the secret. The testing logic for the number of IPsec keys to expect also needed to be adjusted. On v1.13, we have one case where IPv6 is disabled. In that case, the number of expected keys per remote node drops from two to one (i.e., one per IP family). ] The ESN bit in the IPsec secret will be used to indicate whether per-node-pair keys should be used or if the global key should remain in use. Specifically, it consist in a '+' sign after the SPI number in the secret. This ESN bit will be used to transition from a global key system to a per-node-pair system at runtime. We would typically rely on an agent flag for such a configuration. However, in this case, we need to perform a key rotation at the same time as we change the key system. Encoding the key system in the IPsec secret achieves that. By transition from the global to the per-node-pair keys via a key rotation, we ensure that the two can coexist during the transition. The old, global key will have XFRM rules with SPI n, whereas the new, per-node-pair keys will have XFRM rules with SPI n+1. Using a bit in the IPsec secret is also easier to test because we already have all the logic to test key rotation (whereas we would need new logic to test a flag change). The users therefore need to perform a key rotation from e.g.: 3 rfc4106(gcm(aes)) [...] 128 to: 4+ rfc4106(gcm(aes)) [...] 128 The key rotation test in CI is updated to cover a rotation from 3 to 4+ (meaning a rotation into the new per-node-pair key system). Signed-off-by: Paul Chaignon <[email protected]>
[ backporter's notes: major conflict because we have a special case on v1.13 to handle the case where IPv6 is disabled. We therefore need to account for both IPv6 and the key-system when computing the expected number of keys. ] Since commit 4cf468b ("ipsec: Control use of per-node-pair keys from secret bit"), IPsec key rotations can be used to switch from the single-key system to the per-tunnel key system (also referred to as per-node-pair key system). Our key rotation test in CI was updated to cover such a switch. This commit extends it to also cover traditional key rotations, with both the new and old key systems. The switch back into a single-key system is also covered. These special key rotations are controlled with a single + sign. Adding it after the SPI in the IPsec Kubernetes secret is enough to switch to a per-tunnel key system. We thus simply need to cover all 4 cases of having or not having the + sign in the old and new secrets. Signed-off-by: Paul Chaignon <[email protected]>
When adding the BootID field to the CiliumNode CRD we forgot to bump the version, which is an issue when after an cilium upgrade the operator tries to update the CiliumNode objects to include the BootID field. Signed-off-by: Robin Gögge <[email protected]>
[ notes: before v1.15, enableIPsecIPv4 and enableIPsecIPv6 didn't return an error. This needed to be changed from the original commit ] A node update that doesn't contain a BootID will cause the creation of non-matching XFRM IN and OUT states across the cluster as the BootID is used to generate per-node key pairs. Non-matching XFRM states will result in XfrmInStateProtoError, causing packet drops. An empty BootID should thus be treated as an error, and Cilium should not attempt to derive per-node keys from it. Signed-off-by: Robin Gögge <[email protected]>
This commit ensures that - each time we compute a per-node-pair-key we create an empty slice with the correct length first, and then append all the input data instead of appending to one of the input slices (`globalKey`) directly. - the IPs that are used as arguments in `computeNodeIPsecKey` are canonical, meaning IPv4 IPs consist of 4 bytes and IPv6 IPs consist of 16 bytes. This is necessary to always have the same inputs on all nodes when computing the per-node-pair-key. Without this IPs might not match on the byte level, e.g on one node the input is a v6 mapped v4 address (IPv4 address in 16 bytes) and on the other it isn't when used as input to the hash function. This will generate non-matching keys. Co-authored-by: Zhichuan Liang <[email protected]> Signed-off-by: Robin Gögge <[email protected]>
We have very little logging of the boot IDs. Really fixing that will require a bit of work to not be too verbose, but in the meantime, we should at least log the local boot ID. Signed-off-by: Paul Chaignon <[email protected]>
Signed-off-by: Tim Horner <[email protected]>
Release v1.13.14
jaredledvina
had a problem deploying
to
release-base-images
March 27, 2024 18:30 — with
GitHub Actions
Failure
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Merges in the upstream v1.13.14 tag into our fork
1 conflict as expected with
images/cilium/Dockerfile
due to incompatible changes with the CILIUM_ENVOY image ref change