Close per-namespace pod informers on agent termination #2821
Open
vquemener wants to merge 1 commit into jenkinsci:master from
registerPodInformer() creates a SharedIndexInformer per namespace but never closes or removes them. With ephemeral namespaces (one per build), each build leaks an informer that retries indefinitely after the namespace is deleted, causing thread leaks and log floods (403 Forbidden).

Add unregisterPodInformer(namespace) in KubernetesCloud and call it from KubernetesSlave._terminate() when no other pod from the same cloud remains in the namespace.

Tests added:
- unregisterPodInformer closes and removes the informer
- no-op on unknown namespace
- informer kept while other pods share the namespace
- informer closed when last pod in namespace terminates
- informer not affected by other clouds

Co-authored-by: Claude Opus 4.6 (Anthropic) <[email protected]>
Summary
registerPodInformer() creates a SharedIndexInformer<Pod> per namespace and stores it in a ConcurrentHashMap, but nothing ever closes or removes these informers. When pods run in ephemeral namespaces (one per build), each build leaks an informer that retries indefinitely after the namespace is deleted, causing thread leaks, log floods (403 Forbidden), and CPU waste.

This PR adds unregisterPodInformer(namespace) in KubernetesCloud and calls it from KubernetesSlave._terminate() when no other pod from the same cloud remains in the namespace.

Fixes #2820
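For readers who want the shape of the fix without opening the diff, here is a minimal sketch of the cleanup logic described above. It is not the plugin's actual code. PodInformer, RecordingInformer, Agent, and InformerRegistry are hypothetical stand-ins for SharedIndexInformer<Pod>, KubernetesSlave, and KubernetesCloud, and the sketch assumes that the only informer operation that matters here is close().

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the real informer type; only close() matters here. */
interface PodInformer extends AutoCloseable {
    @Override
    void close(); // narrowed so callers need not handle a checked exception
}

/** Hypothetical informer that just records whether it was closed. */
class RecordingInformer implements PodInformer {
    boolean closed;

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical stand-in for an agent node: the cloud it belongs to and its namespace. */
class Agent {
    final String cloudName;
    final String namespace;

    Agent(String cloudName, String namespace) {
        this.cloudName = cloudName;
        this.namespace = namespace;
    }
}

/** Hypothetical registry mirroring the per-namespace informer map of one cloud. */
class InformerRegistry {
    private final String cloudName;
    private final Map<String, PodInformer> informers = new ConcurrentHashMap<>();

    InformerRegistry(String cloudName) {
        this.cloudName = cloudName;
    }

    /** Keeps at most one informer per namespace. */
    void register(String namespace, PodInformer informer) {
        informers.putIfAbsent(namespace, informer);
    }

    /** Removes and closes the informer for the namespace; no-op if none is registered. */
    void unregister(String namespace) {
        PodInformer informer = informers.remove(namespace);
        if (informer != null) {
            informer.close();
        }
    }

    boolean isRegistered(String namespace) {
        return informers.containsKey(namespace);
    }

    /**
     * Called after an agent has terminated. The remaining list must not include
     * the terminated agent. The informer is closed only when no remaining agent
     * of this cloud still uses the same namespace.
     */
    void onAgentTerminated(Agent terminated, List<Agent> remaining) {
        boolean stillUsed = remaining.stream().anyMatch(a ->
                a.cloudName.equals(cloudName) && a.namespace.equals(terminated.namespace));
        if (!stillUsed) {
            unregister(terminated.namespace);
        }
    }
}
```

Removing the map entry before closing means a concurrent second call for the same namespace finds nothing and returns without closing twice. Agents of another cloud in the same namespace are ignored by the check, in line with the "not affected by other clouds" test case.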
AI disclosure
This patch was developed with the assistance of Claude Opus 4.6 (Anthropic).
The analysis, fix, and tests were produced collaboratively between a human operator and the AI. I am not a fluent Java developer: I maintain the Jenkins instance where this bug was causing real production issues, and this was the best way I could contribute a concrete fix proposal.
I completely understand if this PR is rejected on that basis, or if the approach needs rework by someone more familiar with the codebase. I wanted to at least document the problem and push a starting point for discussion.
Changes
- KubernetesCloud.java
  - unregisterPodInformer(String namespace): removes the informer from the map and calls informer.close().
- KubernetesSlave.java
  - _terminate(), after pod deletion: checks whether any other KubernetesSlave node from the same cloud still uses the namespace. If not, calls cloud.unregisterPodInformer(ns).
- KubernetesCloudTest.java
  - unregisterPodInformer_closesAndRemoves: verifies the informer is closed and removed from the map.
  - unregisterPodInformer_noopOnUnknownNamespace: verifies no side effects when called with an unknown namespace.
  - informerKeptWhileOtherPodsShareNamespace: two pods in the same namespace => terminating the first must not close the shared informer.
  - informerClosedWhenLastPodInNamespaceTerminates: last pod in namespace terminates => informer must be closed.
  - informerNotAffectedByOtherCloud: pods on different clouds sharing a namespace => only the relevant cloud's informer is closed.
Testing done
- KubernetesCloudTest, 0 failures
- Registered informer on launch and Closed informer on agent termination
Submitter checklist