Draft
39 commits
622b392
start with documentation
wind57 Oct 15, 2025
26d6b77
wip
wind57 Oct 15, 2025
6a31cac
Merge branch 'main' into fabric8-native-leader-election
wind57 Oct 15, 2025
7345421
started work
wind57 Oct 16, 2025
49b4499
before tests in fabric8
wind57 Oct 21, 2025
bcb53d9
Merge branch 'main' into fabric8-native-leader-election
wind57 Oct 21, 2025
61e6fa2
Merge branch 'main' into fabric8-native-leader-election
wind57 Oct 22, 2025
d55c4b6
wip
wind57 Oct 22, 2025
172f953
Merge branch 'main' into fabric8-native-leader-election
wind57 Oct 22, 2025
723a1f7
wip
wind57 Oct 22, 2025
cd1be17
wip
wind57 Oct 23, 2025
f3ee5d8
Merge branch 'main' into fabric8-native-leader-election
wind57 Oct 23, 2025
6c8db4c
wip
wind57 Oct 23, 2025
933b7d2
wip
wind57 Oct 29, 2025
780573e
merge main
wind57 Oct 30, 2025
dc56d26
wip
wind57 Oct 30, 2025
5a14d6b
Merge branch 'main' into fabric8-native-leader-election
wind57 Nov 1, 2025
7dbaf02
Merge branch 'fix-2087' into fabric8-native-leader-election
wind57 Nov 1, 2025
ed5321f
fix tests
wind57 Nov 1, 2025
fdccea5
minor refactor
wind57 Nov 1, 2025
597ac5c
simplify
wind57 Nov 1, 2025
3d4066d
checkstyle
wind57 Nov 2, 2025
4822f60
fix tests
wind57 Nov 2, 2025
43d4d3f
Merge branch 'main' into fabric8-native-leader-election
wind57 Nov 11, 2025
5b51bbd
more changes
wind57 Nov 12, 2025
1744873
Merge branch 'main' into fabric8-native-leader-election
wind57 Nov 12, 2025
6c06fe8
wip
wind57 Nov 18, 2025
2060f0a
Merge branch 'main' into fabric8-native-leader-election
wind57 Nov 18, 2025
b8e7d18
added tests
wind57 Dec 9, 2025
945b17d
Merge branch 'main' into fabric8-native-leader-election
wind57 Dec 9, 2025
c856aa4
added tests
wind57 Dec 9, 2025
4bec47a
trigger
wind57 Dec 9, 2025
f559097
fix tests
wind57 Dec 9, 2025
9e6988f
fix tests
wind57 Dec 9, 2025
212b080
trigger
wind57 Dec 10, 2025
e41997e
fix tests
wind57 Dec 10, 2025
587c9ae
fix tests
wind57 Dec 10, 2025
621339d
fix tests
wind57 Dec 10, 2025
b971468
fix tests
wind57 Dec 11, 2025
84 changes: 84 additions & 0 deletions docs/modules/ROOT/pages/leader-election.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -50,4 +50,88 @@ to `false` in `application.[properties | yaml]`:
[source,properties]
----
management.info.leader.enabled=false

'''

There is a second way to configure leader election, built on the native leader election support in the fabric8 library (support for the Kubernetes native client is not yet implemented). In the long run, this will become the default way to configure leader election, and the previous one will be dropped. You can treat it much like a JDK "preview" feature.

To use it, set the property:

[source]
----
spring.cloud.kubernetes.leader.election.enabled=true
----

Unlike the old implementation, this one uses either a `Lease` _or_ a `ConfigMap` as the lock, depending on your cluster version. You can still force the use of a `ConfigMap`, even if leases are supported, via:

[source]
----
spring.cloud.kubernetes.leader.election.use-config-map-as-lock=true
----

The name of that `Lease` or `ConfigMap` can be set with the following property (the default value is `spring-k8s-leader-election-lock`):

[source]
----
spring.cloud.kubernetes.leader.election.lockName=other-name
----

The namespace where the lock is created can also be set (`default` is used if no explicit one exists):

[source]
----
spring.cloud.kubernetes.leader.election.lockNamespace=other-namespace
----

Before the leader election process starts, the application can wait until the pod is ready (via the readiness check). This is enabled by default, but you can disable it if needed:

[source]
----
spring.cloud.kubernetes.leader.election.waitForPodReady=false
----

As with the old implementation, events are published by default, but this can be disabled:

[source]
----
spring.cloud.kubernetes.leader.election.publishEvents=false
----

A few parameters control how the leader election process runs. To explain them, we need to look at the high-level implementation of this process. All candidate pods try to become the leader, that is, they try to _acquire_ the lock. If the lock is already taken, they keep retrying every `spring.cloud.kubernetes.leader.election.retryPeriod` (specified as a `java.time.Duration`; the default is 2 seconds).

If the lock is not taken, the current pod becomes the leader. It does so by inserting a so-called "record" into the lock (`Lease` or `ConfigMap`). Among other things, the record contains the `leaseDuration` (set via `spring.cloud.kubernetes.leader.election.leaseDuration`; it is a `java.time.Duration` and defaults to 15 seconds). This acts like a TTL on the lock: no other candidate can acquire the lock until this period has elapsed since the last renewal.

Once a pod has established itself as the leader (by acquiring the lock), it continuously (every `spring.cloud.kubernetes.leader.election.retryPeriod`) tries to renew its lease, in other words, to extend its leadership. When a renewal happens, the record stored inside the lock is updated. For example, `renewTime` is updated to denote when the last renewal happened. (You can inspect these fields with `kubectl describe lease...`, for example.)
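
The acquire and renew steps described above can be sketched in plain Java. This is only an illustrative in-memory model, not the fabric8 or Spring Cloud Kubernetes implementation: the `LockRecordSketch` and `LockRecord` names are made up for this example, and the sketch ignores the Kubernetes API entirely.

```java
import java.time.Duration;
import java.time.Instant;

public class LockRecordSketch {

    // Hypothetical stand-in for the "record" kept inside the Lease or ConfigMap.
    static final class LockRecord {

        final String holder;

        final Instant renewTime;

        final Duration leaseDuration;

        LockRecord(String holder, Instant renewTime, Duration leaseDuration) {
            this.holder = holder;
            this.renewTime = renewTime;
            this.leaseDuration = leaseDuration;
        }

    }

    // null models a lock that has never been taken (or was cleared).
    private LockRecord current;

    /**
     * One attempt by a candidate. Returns true if the candidate holds the lock
     * afterwards, either because it acquired it or because it renewed it.
     */
    public boolean tryAcquireOrRenew(String candidate, Instant now, Duration leaseDuration) {
        boolean free = current == null;
        boolean heldByCandidate = !free && current.holder.equals(candidate);
        boolean expired = !free && now.isAfter(current.renewTime.plus(current.leaseDuration));
        if (free || heldByCandidate || expired) {
            // acquiring and renewing both write a fresh record with an updated renewTime
            current = new LockRecord(candidate, now, leaseDuration);
            return true;
        }
        return false;
    }

    public String holder() {
        return current == null ? null : current.holder;
    }

    public static void main(String[] args) {
        Duration lease = Duration.ofSeconds(15);
        Instant start = Instant.parse("2025-01-01T12:00:00Z");
        LockRecordSketch lock = new LockRecordSketch();
        System.out.println(lock.tryAcquireOrRenew("podA", start, lease)); // true: lock was free
        System.out.println(lock.tryAcquireOrRenew("podB", start.plusSeconds(2), lease)); // false: held by podA
        System.out.println(lock.tryAcquireOrRenew("podA", start.plusSeconds(2), lease)); // true: renewal
    }

}
```

Acquiring and renewing are deliberately the same write in this model: in both cases the candidate stores a record that names itself and carries the current time as `renewTime`.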

Renewal must happen within a certain interval, specified by `spring.cloud.kubernetes.leader.election.renewDeadline`. By default it is 10 seconds, which means that the leader pod has at most 10 seconds to renew its leadership. If it fails to do so, the pod loses its leadership and leader election starts again. Because other pods try to become the leader every 2 seconds (by default), the pod that just lost leadership might become the leader again. If you want other pods to have a higher chance of becoming the leader, you can set the following property (specified in seconds; the default is 0):

[source]
----
spring.cloud.kubernetes.leader.election.wait-after-renewal-failure=3
----

This means that a pod that could not renew its lease, and therefore lost leadership, waits this many seconds before trying to become the leader again.

Let's explain these settings with an example: two pods participate in leader election. For simplicity, let's call them `podA` and `podB`. They both start at the same time, `12:00:00`, but `podA` establishes itself as the leader. This means that every two seconds (`retryPeriod`), `podB` tries to become the new leader. So at `12:00:02`, then at `12:00:04` and so on, it basically asks: "Can I become the leader?". In our simplified example, the answer depends on what `podA` is doing.

After `podA` has become the leader, it tries to "extend" or _renew_ its leadership every 2 seconds. So at `12:00:02`, then at `12:00:04` and so on, `podA` goes to the lock and updates its record to reflect that it is still the leader. Counting from the last successful renewal, it has at most 10 seconds (`renewDeadline`) to complete the next one. If it fails to renew within those 10 seconds (because of a connection problem, a long GC pause, etc.), it stops leading and `podB` can then acquire the leadership. When `podA` stops being the leader gracefully, the lock record is "cleared", which means that `podB` can acquire leadership immediately.

Things are different when `podA` dies abruptly, for example with an `OutOfMemoryError`, without being able to update the lock record gracefully. This is when the `leaseDuration` setting matters. The easiest way to explain it is with an example:

`podA` renewed its leadership at `12:00:04`, but at `12:00:05` it was killed by the OOM killer. At `12:00:06`, `podB` tries to become the leader. It checks whether "now" (`12:00:06`) is _after_ the last renewal plus the lease duration (15 seconds by default); essentially, it checks:

[source]
----
12:00:06 > (12:00:04 + 00:00:15)
----

The condition is not fulfilled, so `podB` cannot become the leader. The result is the same at `12:00:08`, `12:00:10` and so on, until `12:00:20`: by then the TTL (`leaseDuration`) of the lock has expired (at `12:00:19`) and `podB` can acquire it. As such, a lower value of `leaseDuration` means that other pods acquire leadership faster.
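
The expiry check above is plain time arithmetic, so it can be replayed in a few lines of Java. The sketch below is only an illustration: the class and method names are made up for this example and are not part of the library. It walks `podB`'s attempts forward and reports the first one that falls strictly after the last renewal plus the lease duration.

```java
import java.time.Duration;
import java.time.LocalTime;

public class LeaseExpiryTimeline {

    /**
     * Returns the first attempt time that is strictly after lastRenewal + leaseDuration.
     * Attempts start at firstAttempt and repeat every retryPeriod. LocalTime is used
     * for readability, so the sketch does not handle timelines that cross midnight.
     */
    public static LocalTime firstSuccessfulAttempt(LocalTime lastRenewal, Duration leaseDuration,
            LocalTime firstAttempt, Duration retryPeriod) {
        LocalTime expiry = lastRenewal.plus(leaseDuration);
        LocalTime attempt = firstAttempt;
        while (!attempt.isAfter(expiry)) {
            attempt = attempt.plus(retryPeriod);
        }
        return attempt;
    }

    public static void main(String[] args) {
        LocalTime lastRenewal = LocalTime.of(12, 0, 4);
        LocalTime firstAttempt = LocalTime.of(12, 0, 6);
        Duration retryPeriod = Duration.ofSeconds(2);

        // 15-second lease: the lock expires at 12:00:19
        System.out.println(
                firstSuccessfulAttempt(lastRenewal, Duration.ofSeconds(15), firstAttempt, retryPeriod)); // 12:00:20

        // 10-second lease: the lock expires at 12:00:14
        System.out.println(
                firstSuccessfulAttempt(lastRenewal, Duration.ofSeconds(10), firstAttempt, retryPeriod)); // 12:00:16
    }

}
```

With a 15-second lease the first successful attempt is at `12:00:20`; with a 10-second lease it would be `12:00:16`, which shows how a shorter lease shortens the takeover time.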

You might need to grant the proper RBAC permissions to use this functionality, for example:

[source]
----
- apiGroups: [ "coordination.k8s.io" ]
  resources: [ "leases" ]
  verbs: [ "get", "update", "create", "patch" ]
- apiGroups: [ "" ]
  resources: [ "configmaps" ]
  verbs: [ "get", "update", "create", "patch" ]
----
Expand Up @@ -16,25 +16,84 @@

package org.springframework.cloud.kubernetes.commons.leader;

import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.concurrent.locks.ReentrantLock;

import org.springframework.cloud.kubernetes.commons.EnvReader;
import org.springframework.core.log.LogAccessor;
import org.springframework.util.StringUtils;

import static org.springframework.cloud.kubernetes.commons.KubernetesClientProperties.SERVICE_ACCOUNT_NAMESPACE_PATH;

/**
 * @author wind57
 */
public final class LeaderUtils {

    /**
     * Coordination group for leader election.
     */
    public static final String COORDINATION_GROUP = "coordination.k8s.io";

    /**
     * Coordination version for leader election.
     */
    public static final String COORDINATION_VERSION = "v1";

    /**
     * Lease constant.
     */
    public static final String LEASE = "Lease";

    /**
     * Prefix for all properties related to leader election.
     */
    public static final String LEADER_ELECTION_PROPERTY_PREFIX = "spring.cloud.kubernetes.leader.election";

    /**
     * Property that controls whether leader election is enabled.
     */
    public static final String LEADER_ELECTION_ENABLED_PROPERTY = LEADER_ELECTION_PROPERTY_PREFIX + ".enabled";

    private static final String POD_NAMESPACE = "POD_NAMESPACE";

    private static final LogAccessor LOG = new LogAccessor(LeaderUtils.class);

    // k8s environment variable responsible for host name
    private static final String HOSTNAME = "HOSTNAME";

    private LeaderUtils() {

    }

    /**
     * ideally, should always be present. If not, downward api must enable this one.
     */
    public static Optional<String> podNamespace() {
        Path serviceAccountPath = new File(SERVICE_ACCOUNT_NAMESPACE_PATH).toPath();
        boolean serviceAccountNamespaceExists = Files.isRegularFile(serviceAccountPath);
        if (serviceAccountNamespaceExists) {
            try {
                String namespace = new String(Files.readAllBytes(serviceAccountPath)).replace(System.lineSeparator(),
                        "");
                LOG.info(() -> "read namespace : " + namespace + " from service account " + serviceAccountPath);
                return Optional.of(namespace);
            }
            catch (IOException e) {
                throw new UncheckedIOException(e);
            }

        }
        return Optional.ofNullable(EnvReader.getEnv(POD_NAMESPACE));
    }

    public static String hostName() throws UnknownHostException {
        String hostName = EnvReader.getEnv(HOSTNAME);
        if (StringUtils.hasText(hostName)) {
Expand Down
@@ -0,0 +1,119 @@
/*
* Copyright 2013-present the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.cloud.kubernetes.commons.leader.election;

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

import jakarta.annotation.Nonnull;
import org.apache.commons.logging.LogFactory;

import org.springframework.core.log.LogAccessor;

/**
 * This is taken from fabric8 with some changes (we need it, so it could be placed in the
 * common package). A single thread scheduler that will shutdown itself when there are no
 * more jobs running inside it. When all ScheduledFuture::cancel are called, the queue of
 * tasks will be empty and there is an internal runnable that checks that.
 *
 * @author wind57
 */
public final class CachedSingleThreadScheduler {

    private static final LogAccessor LOG = new LogAccessor(LogFactory.getLog(CachedSingleThreadScheduler.class));

    private final ReentrantLock lock = new ReentrantLock();

    private final long ttlMillis;

    private final String name;

    private ScheduledThreadPoolExecutor executor;

    public CachedSingleThreadScheduler(String name, long ttlMillis) {
        this.ttlMillis = ttlMillis;
        this.name = name;
    }

    public ScheduledFuture<?> scheduleWithFixedDelay(Runnable command, long initialDelay, long delay, TimeUnit unit) {
        // acquire the lock before entering 'try', so 'finally' only unlocks a held lock
        lock.lock();
        try {
            this.startExecutor();
            LOG.debug(() -> "Scheduling command to run in : " + name);
            return this.executor.scheduleWithFixedDelay(command, initialDelay, delay, unit);
        }
        finally {
            lock.unlock();
        }
    }

    public ScheduledFuture<?> schedule(Runnable command, long delay, TimeUnit unit) {
        lock.lock();
        try {
            this.startExecutor();
            LOG.debug(() -> "Scheduling command to run in : " + name);
            return this.executor.schedule(command, delay, unit);
        }
        finally {
            lock.unlock();
        }
    }

    // must be called while holding the lock
    private void startExecutor() {
        if (this.executor == null) {
            this.executor = new ScheduledThreadPoolExecutor(1, threadFactory());
            // cancelled tasks leave the queue immediately, so an empty queue means "no jobs"
            this.executor.setRemoveOnCancelPolicy(true);
            this.executor.scheduleWithFixedDelay(this::shutdownCheck, this.ttlMillis, this.ttlMillis,
                    TimeUnit.MILLISECONDS);
        }

    }

    private void shutdownCheck() {
        lock.lock();
        try {
            if (this.executor.getQueue().isEmpty()) {
                LOG.debug(() -> "Shutting down executor : " + name);
                this.executor.shutdownNow();
                this.executor = null;
            }
        }
        finally {
            lock.unlock();
        }

    }

    private ThreadFactory threadFactory() {
        return new ThreadFactory() {
            final ThreadFactory threadFactory = Executors.defaultThreadFactory();

            @Override
            public Thread newThread(@Nonnull Runnable runnable) {
                Thread thread = threadFactory.newThread(runnable);
                thread.setName("cached-single-thread-scheduler" + "-" + thread.getName());
                thread.setDaemon(true);
                return thread;
            }
        };
    }

}
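
The self-shutdown check in the scheduler above depends on cancelled tasks leaving the executor's queue right away, which is what `setRemoveOnCancelPolicy(true)` provides. A minimal JDK-only sketch of that behaviour follows; the class and method names are made up for this illustration and are not part of the pull request.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RemoveOnCancelSketch {

    /**
     * Schedules one task far in the future, cancels it, and returns how many
     * tasks are still sitting in the executor's queue afterwards.
     */
    public static int queuedAfterCancel(boolean removeOnCancel) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        executor.setRemoveOnCancelPolicy(removeOnCancel);
        try {
            ScheduledFuture<?> future = executor.schedule(() -> {
            }, 1, TimeUnit.HOURS);
            future.cancel(false);
            return executor.getQueue().size();
        }
        finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(queuedAfterCancel(true)); // 0: the cancelled task was removed at once
        System.out.println(queuedAfterCancel(false)); // 1: the cancelled task stays until its delay elapses
    }

}
```

Without the policy, a cancelled task would keep the queue non-empty until its delay elapsed, so an emptiness check like the one in `shutdownCheck` would keep the executor alive longer than needed.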
@@ -0,0 +1,52 @@
/*
* Copyright 2013-present the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.cloud.kubernetes.commons.leader.election;

import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.springframework.boot.autoconfigure.condition.NoneNestedConditions;
import org.springframework.context.annotation.Conditional;

/**
 * Matches when leader election is not enabled, that is, when
 * {@link ConditionalOnLeaderElectionEnabled} does not match.
 *
 * @author wind57
 */
@Target({ ElementType.TYPE, ElementType.METHOD })
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@Conditional(ConditionalOnLeaderElectionDisabled.OnLeaderElectionDisabled.class)
public @interface ConditionalOnLeaderElectionDisabled {

    class OnLeaderElectionDisabled extends NoneNestedConditions {

        OnLeaderElectionDisabled() {
            super(ConfigurationPhase.REGISTER_BEAN);
        }

        @ConditionalOnLeaderElectionEnabled
        static class OnLeaderElectionDisabledClass {

        }

    }

}
@@ -0,0 +1,43 @@
/*
* Copyright 2013-present the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.cloud.kubernetes.commons.leader.election;

import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;

import static org.springframework.cloud.kubernetes.commons.leader.LeaderUtils.LEADER_ELECTION_ENABLED_PROPERTY;

/**
 * Provides a more succinct conditional for:
 * <code>spring.cloud.kubernetes.leader.election.enabled</code>.
 *
 * @author wind57
 */
@Target({ ElementType.TYPE, ElementType.METHOD })
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@ConditionalOnProperty(value = LEADER_ELECTION_ENABLED_PROPERTY, havingValue = "true", matchIfMissing = false)
public @interface ConditionalOnLeaderElectionEnabled {

}