Fix overdue state doesn't account for timeperiods in HA cluster #10562
base: master
Conversation
So basically, the DB will only be updated in these cases:
- In the non-forced External Command Pipe commands
- Unconditionally via the RescheduleCheck ApiAction
- When scheduling a check that is not forced, is active, and `enable_service_checks` is true

In all other cases we just trigger internal state changes.
I'm not sure `SetNextCheck()` is the right place to make the decision about which event to persist to the DB, but I get that this is a low-impact change that fixes the issue, and so I don't want to restart the bikeshedding from #10082.
lib/checker/checkercomponent.cpp
Outdated
```diff
-			if (host && !service && (!checkable->GetEnableActiveChecks() || !icingaApp->GetEnableHostChecks())) {
+			activeChecksDisabled = !checkable->GetEnableActiveChecks() || !icingaApp->GetEnableServiceChecks();
+			if (activeChecksDisabled && host && !service) {
 				Log(LogNotice, "CheckerComponent")
 					<< "Skipping check for host '" << host->GetName() << "': active host checks are disabled";
 				check = false;
 			}
-			if (host && service && (!checkable->GetEnableActiveChecks() || !icingaApp->GetEnableServiceChecks())) {
+			} else if (activeChecksDisabled) {
```
Can you please explain this diff a bit more?
Why are we now checking icingaApp->GetEnableServiceChecks() in the host case? Or is that a mistake?
Is this change meaningful for what this PR does, or is it just a small refactoring because we need the information for the `activeChecksDisabled` flag?
Also, since we're touching this anyway, maybe it can be simplified to reduce duplication, since all that differs is the word "host"/"service" in the log message.
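For illustration, one way the two branches could collapse into one (a sketch reusing the names from the hunk above; the surrounding control flow is assumed, not taken from this PR):

```cpp
// Sketch: one branch for both object types; only the log wording differs.
bool activeChecksDisabled = !checkable->GetEnableActiveChecks()
	|| (service ? !icingaApp->GetEnableServiceChecks() : !icingaApp->GetEnableHostChecks());

if (activeChecksDisabled) {
	Log(LogNotice, "CheckerComponent")
		<< "Skipping check for " << (service ? "service" : "host")
		<< " '" << checkable->GetName() << "': active " << (service ? "service" : "host")
		<< " checks are disabled";
	check = false;
}
```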
Force-pushed from d8c1d07 to f8a4acb
This commit changes the ordering of CheckableScheduleInfo in the multi-index container to ensure that checkables with running checks are pushed to the end of the ordering. This prevents them from being prioritized for scheduling ahead of others, which could lead to unnecessary CPU load due to repeated scheduling attempts. By using a very large value for the index of checkables with running checks, they are effectively deprioritized until their current check is completed and they can be reinserted with their actual next check time.
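The core trick can be sketched like this (stand-in types only; the real `CheckableScheduleInfo` and the multi-index container live in `checkercomponent.hpp`):

```cpp
#include <limits>

// Stand-in for the real CheckableScheduleInfo, which holds a Checkable::Ptr.
struct CheckableScheduleInfo
{
	bool CheckRunning = false; // mirrors the checkable's running-check flag
	double NextCheck = 0;      // the actual next check timestamp
};

// Key-extractor idea for the scheduling index: entries with a running check
// sort behind everything else, so the scheduler won't pick them up again
// until the result arrives and they are re-inserted with their real time.
inline double EffectiveNextCheck(const CheckableScheduleInfo& csi)
{
	return csi.CheckRunning ? std::numeric_limits<double>::max() : csi.NextCheck;
}
```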
Since the scheduler accounts for already running checks, we only need to update the `next_check` timestamp in `Checkable::ExecuteCheck()` where it actually makes sense to do so, and for local checks it doesn't make sense at all. There are only two cases where we need to update the next check beforehand:

1) The execute command event is sent to a connected remote endpoint, so we need to set the next check to a time in the future until we actually receive the check result back from the remote endpoint. However, it must not be too far in the future, so that the check is still re-run in a reasonable time in case the remote endpoint never responds.

2) The check is a remote check, but either the endpoint is currently syncing replay logs or not connected at all, and we are within the magical 5min cold startup window. In these cases the check is effectively skipped and no check result will come in for it, so we manually update the next check as if the check had been executed.

In all other cases, either the check is executed locally, which means the `m_RunningCheck` flag already prevents the scheduler from re-running the check, or it is a remote check and the endpoint is not connected but we are outside the cold startup window, in which case we also don't do anything, as we've already called `Checkable::ProcessCheckResult()` with an appropriate error state, which in turn calls `Checkable::UpdateNextCheck()`.
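Condensed into a free-standing sketch of that decision (names and parameters are stand-ins, not the actual implementation):

```cpp
// Decide whether next_check must be bumped *before* executing the check.
// The booleans stand in for the real command-endpoint / API-listener checks.
void UpdateNextCheckBeforeExecution(bool remoteCheck, bool endpointConnected,
	bool withinColdStartupWindow, double now, double boundedDelay,
	double& nextCheck, bool& advanceAsIfExecuted)
{
	if (remoteCheck && endpointConnected) {
		// Case 1: wait for the remote result, but only for a bounded time,
		// so the check is retried if the endpoint never answers.
		nextCheck = now + boundedDelay;
	} else if (remoteCheck && withinColdStartupWindow) {
		// Case 2: the check is skipped entirely and no result will arrive,
		// so advance next_check as if the check had been executed.
		advanceAsIfExecuted = true;
	}
	// All other cases: do nothing here. Local checks are guarded by the
	// running-check flag, and a disconnected endpoint outside the window has
	// already gone through ProcessCheckResult() -> UpdateNextCheck().
}
```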
Force-pushed from f8a4acb to c7775eb
I've changed the implementation according to our offline discussion with @jschmidt-icinga yesterday, and with a bunch of comments added the change set is a bit bigger than before, but it's way better, I think. So please have a look! PS: I've also updated the PR description!
Would it be possible to add test cases for all permutations of state changes? The checker component doesn't have any dependencies of its own, and you could make a simple mock object connecting to the same events the databases receive. Then reschedule checks on checkables and verify everything is updated and consistent.

Also note that unit tests act as a kind of documentation too. Well-commented tests sometimes describe what the code does better than the code itself. (Speaking as someone who still doesn't grasp the entirety of check scheduling in its complexity.)

Edit: To clarify: I don't mean a full unit-test suite for the entire checker component, though that might be useful in the future, but for now maybe just a few cases for the specific issue fixed in this PR.
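A rough sketch of the suggested mock, meant to sit in the existing Boost.Test suite (include paths and the signal signature are taken from this repo and the diff above, and are assumed to still hold):

```cpp
#include "icinga/checkable.hpp"
#include <BoostTestTargetConfig.h>
#include <vector>

using namespace icinga;

BOOST_AUTO_TEST_CASE(reschedule_updates_next_check)
{
	std::vector<double> observed;

	// Mock "database": record every next-check change the DBs would receive.
	auto conn = Checkable::OnNextCheckChanged.connect(
		[&observed](const Checkable::Ptr& checkable, const Value&) {
			observed.push_back(checkable->GetNextCheck());
		});

	// ... build a host/service fixture here and reschedule its check ...

	BOOST_CHECK(!observed.empty()); // would fail as-is; the fixture above is elided
	conn.disconnect();
}
```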
This commit introduces a new, somewhat special `OnRescheduleCheck` signal that is emitted whenever we want to tell the checker to reschedule the checkable at a specific timestamp without actually changing the next check time. Previously, we called `SetNextCheck` with some arbitrary timestamp just to force the checker to pick the checkable up either immediately or at a specific time. At some point, subscribing to the `OnNextCheckChanged` signal thus became effectively unusable for any purpose other than informing the checker about a new next check time, which led to the introduction of a new signal that is solely responsible for informing Icinga DB and the IDO about a new next check time in the places where calling `SetNextCheck` did make sense.

This commit does quite the opposite: it replaces all calls to `SetNextCheck` that were only used to inform the checker about a new next check time with `OnRescheduleCheck` calls. Only the places where we actually want to change the next check time still call `SetNextCheck` and thus inform the checker and all other listeners about the new next check time. As a bonus, we also got rid of the two simultaneous object locks for child and parent.
It also removes the extra `SendNextUpdate()` call from the `NewCheckResultHandler` in Icinga DB: since Icinga DB is subscribed to the `NextCheckChanged` event anyway, and that event is always emitted before the `NewCheckResult` event is triggered, the call became redundant.
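To make the signal split concrete, here is a tiny self-contained illustration with a stand-in `Checkable` (the real signal signatures and call sites are those of this PR, not this sketch):

```cpp
#include <boost/signals2.hpp>
#include <iostream>

// Stand-in Checkable: OnRescheduleCheck only nudges the checker, while
// OnNextCheckChanged reports an actual change of the stored next check time.
struct Checkable
{
	double NextCheck = 0;

	// DB/IDO-style listeners subscribe here.
	static boost::signals2::signal<void (Checkable&, double)> OnNextCheckChanged;
	// The checker subscribes here; the persisted next check is untouched.
	static boost::signals2::signal<void (Checkable&, double)> OnRescheduleCheck;

	void SetNextCheck(double value)
	{
		NextCheck = value;
		OnNextCheckChanged(*this, value);
	}
};

boost::signals2::signal<void (Checkable&, double)> Checkable::OnNextCheckChanged;
boost::signals2::signal<void (Checkable&, double)> Checkable::OnRescheduleCheck;

int main()
{
	Checkable::OnNextCheckChanged.connect([](Checkable&, double v) {
		std::cout << "persist next_check=" << v << " to the DB\n";
	});
	Checkable::OnRescheduleCheck.connect([](Checkable&, double v) {
		std::cout << "checker: re-queue at " << v << " (no DB update)\n";
	});

	Checkable c;
	c.SetNextCheck(42);                 // informs checker *and* DB listeners
	Checkable::OnRescheduleCheck(c, 7); // informs only the checker
}
```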
Force-pushed from c7775eb to 0bcee67
I've added basic unit tests for the checker now as requested, so the PR size has drastically increased :)! But I hope the tests give you a little overview of how the checker works, or is actually supposed to work.
First of all, amazing work adding these tests with all that infrastructure in what must only have been a few hours of time.
I'm not at all concerned with the increase in diff size due to the tests, as it is pretty clear that the production code actually gets a reduction in complexity from this PR.
For now I'm only going to comment on some of the test methods used, but I'm also continuing to look at what they do and the rest of the PR, which looks very promising.
If the logger is started with `Activate()` before `SetActive()`, it won't log anything: the "min severity" value is only updated from loggers when they are started, and loggers that aren't active at that point are simply ignored, so the min severity remains at info.
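If I read this right, the fix on the test side is just a matter of ordering, roughly like below (a sketch only; the exact setter and `Activate()` signatures in the test fixture are assumed):

```cpp
// Sketch: mark the logger active *before* starting it, otherwise it is
// skipped when the minimum log severity is recalculated on activation and
// the effective min severity stays at "information".
logger->SetSeverity("debug");
logger->SetActive(true);   // must come first, per the comment above
logger->Activate();        // now the logger's severity is taken into account
```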
Force-pushed from 0bcee67 to 5f3b7d2
I think I've addressed all your requested changes. Plus, our checker should now also have roughly 70-90% code coverage. I've added some more unit tests to cover remotely executed checks aka
Test: redis-cli MONITOR with one check.
c2c9b6b
1759312891.029505 [0 192.168.65.1:60842] "HSET" "icinga:host:state" "d7e93a933920b2a35baf2479691400a961d92311" "{\"affects_children\":false,\"check_attempt\":1,\"check_commandline\":\"'dummy'\",\"check_source\":\"ws-aklimov7777777.local\",\"check_timeout\":60000,\"environment_id\":\"824bc16396c1df319c0a1e50a335de0df3d2ce9d\",\"execution_time\":0,\"hard_state\":0,\"host_id\":\"d7e93a933920b2a35baf2479691400a961d92311\",\"id\":\"d7e93a933920b2a35baf2479691400a961d92311\",\"in_downtime\":false,\"is_acknowledged\":false,\"is_active\":true,\"is_flapping\":false,\"is_handled\":false,\"is_problem\":false,\"is_reachable\":true,\"is_sticky_acknowledgement\":false,\"last_state_change\":1759312863888,\"last_update\":1759312891027,\"latency\":7,\"next_check\":1759312901019,\"next_update\":1759312911035,\"output\":\"Check was successful.\",\"previous_hard_state\":99,\"previous_soft_state\":99,\"scheduling_source\":\"ws-aklimov7777777.local\",\"severity\":0,\"soft_state\":0,\"state_type\":\"hard\"}"
1759312891.029533 [0 192.168.65.1:60842] "HSET" "icinga:checksum:host:state" "d7e93a933920b2a35baf2479691400a961d92311" "{\"checksum\":\"acb9e73a6beec90b4ed53726d693d85f30085e77\"}"
1759312891.029545 [0 192.168.65.1:60842] "ZADD" "icinga:nextupdate:host" "1759312911.035350" "d7e93a933920b2a35baf2479691400a961d92311"
5f3b7d2
1759313097.299299 [0 192.168.65.1:32736] "HSET" "icinga:host:state" "d7e93a933920b2a35baf2479691400a961d92311" "{\"affects_children\":false,\"check_attempt\":1,\"check_commandline\":\"'dummy'\",\"check_source\":\"ws-aklimov7777777.local\",\"check_timeout\":60000,\"environment_id\":\"824bc16396c1df319c0a1e50a335de0df3d2ce9d\",\"execution_time\":0,\"hard_state\":0,\"host_id\":\"d7e93a933920b2a35baf2479691400a961d92311\",\"id\":\"d7e93a933920b2a35baf2479691400a961d92311\",\"in_downtime\":false,\"is_acknowledged\":false,\"is_active\":true,\"is_flapping\":false,\"is_handled\":false,\"is_problem\":false,\"is_reachable\":true,\"is_sticky_acknowledgement\":false,\"last_state_change\":1759312863888,\"last_update\":1759313097297,\"latency\":6,\"next_check\":1759313106638,\"next_update\":1759313116651,\"output\":\"Check was successful.\",\"previous_hard_state\":99,\"previous_soft_state\":99,\"scheduling_source\":\"ws-aklimov7777777.local\",\"severity\":0,\"soft_state\":0,\"state_type\":\"hard\"}"
1759313097.299331 [0 192.168.65.1:32736] "HSET" "icinga:checksum:host:state" "d7e93a933920b2a35baf2479691400a961d92311" "{\"checksum\":\"6a8714aa442b2d076193cfa3baf4603a35cf5fec\"}"
1759313097.299346 [0 192.168.65.1:32736] "ZADD" "icinga:nextupdate:host" "1759313116.651438" "d7e93a933920b2a35baf2479691400a961d92311"
1759313097.300295 [0 192.168.65.1:32736] "HSET" "icinga:host:state" "d7e93a933920b2a35baf2479691400a961d92311" "{\"affects_children\":false,\"check_attempt\":1,\"check_commandline\":\"'dummy'\",\"check_source\":\"ws-aklimov7777777.local\",\"check_timeout\":60000,\"environment_id\":\"824bc16396c1df319c0a1e50a335de0df3d2ce9d\",\"execution_time\":0,\"hard_state\":0,\"host_id\":\"d7e93a933920b2a35baf2479691400a961d92311\",\"id\":\"d7e93a933920b2a35baf2479691400a961d92311\",\"in_downtime\":false,\"is_acknowledged\":false,\"is_active\":true,\"is_flapping\":false,\"is_handled\":false,\"is_problem\":false,\"is_reachable\":true,\"is_sticky_acknowledgement\":false,\"last_state_change\":1759312863888,\"last_update\":1759313097297,\"latency\":6,\"next_check\":1759313106638,\"next_update\":1759313116651,\"output\":\"Check was successful.\",\"previous_hard_state\":99,\"previous_soft_state\":99,\"scheduling_source\":\"ws-aklimov7777777.local\",\"severity\":0,\"soft_state\":0,\"state_type\":\"hard\"}"
1759313097.300320 [0 192.168.65.1:32736] "HSET" "icinga:checksum:host:state" "d7e93a933920b2a35baf2479691400a961d92311" "{\"checksum\":\"6a8714aa442b2d076193cfa3baf4603a35cf5fec\"}"
Conclusion
Smells like #9501, I didn't even test state changes yet.
Consider a dead simple additional OnNextCheckUpdated cluster event instead.
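For reference, such an event could follow the pattern of the existing cluster event handlers in `lib/icinga/clusterevents.cpp`; a rough sketch (handler name, message fields, and the reschedule call are assumptions, not code from this PR):

```cpp
// Hypothetical extra cluster event that only nudges the local scheduler.
REGISTER_APIFUNCTION(NextCheckUpdated, event, &ClusterEvents::NextCheckUpdatedAPIHandler);

Value ClusterEvents::NextCheckUpdatedAPIHandler(const MessageOrigin::Ptr& origin, const Dictionary::Ptr& params)
{
	Host::Ptr host = Host::GetByName(params->Get("host"));
	if (!host)
		return Empty;

	Checkable::Ptr checkable;
	if (params->Contains("service"))
		checkable = host->GetService(params->Get("service"));
	else
		checkable = host;

	if (!checkable)
		return Empty;

	// Only re-queue the checkable at the transmitted timestamp; the
	// persisted next_check is not touched here (signal per this PR, assumed).
	Checkable::OnRescheduleCheck(checkable, params->Get("next_check"));
	return Empty;
}
```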
```cpp
	 * the magical 5min cold startup window. In both cases, we just don't do anything and wait for
	 * the next check interval to re-try the check again. So, this check is effectively skipped.
	 */
	UpdateNextCheck();
```
Too bad if my check runs once a day. (The else if case at least schedules the next check just one retry interval in the future.)
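A sketch of the alternative this hints at, using the existing accessors (whether this is the right trade-off is exactly the open question here):

```cpp
// Sketch: when the remote check is skipped during the cold startup window,
// retry after the retry interval instead of waiting a full check interval
// (which may be a whole day for such checks).
checkable->SetNextCheck(Utility::GetTime() + checkable->GetRetryInterval());
```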
```cpp
	if (child->GetProblem() && child->GetEnableActiveChecks()) {
		auto nextCheck (now + Utility::Random() % 60);

		ObjectLock oLock (child);
```
Sure the lock is no longer needed? Below you still have a get and kind of a set operation.
```cpp
		continue;

	if (parent->GetNextCheck() >= now + parent->GetRetryInterval()) {
		ObjectLock olock(parent);
```
Same here. While we're at it, I guess it makes sense for the lock to cover both operations, get and set, just as in the child case above.
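In other words, something along these lines (a sketch only; whether the update is `SetNextCheck` or the new reschedule signal is left as in the PR, and the signal's exact signature is assumed):

```cpp
// Sketch: let the ObjectLock span both the read and the subsequent update,
// mirroring the child case above.
{
	ObjectLock olock(parent);

	if (parent->GetNextCheck() >= now + parent->GetRetryInterval()) {
		// illustrative jitter, as in the child case
		Checkable::OnRescheduleCheck(parent, now + Utility::Random() % 60);
	}
}
```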
```diff
 	delta *= (double)std::rand() / RAND_MAX;
-	SetNextCheck(now + delta);
+	// We only want to jitter the next check a bit, and inform the scheduler about it, so not setting it directly.
+	SetNextCheck(now + delta, true);
```
To "inform the scheduler about it" you suppress the signal?
```diff
 	});
 
-	Checkable::OnNextCheckUpdated.connect([](const Checkable::Ptr& checkable) { NextCheckUpdatedHandler(checkable); });
+	Checkable::OnNextCheckChanged.connect([](const Checkable::Ptr& checkable, const Value&) { NextCheckUpdatedHandler(checkable); });
```
Seems you missed renaming `NextCheckUpdatedHandler`.
```cpp
	checkable->SetNextCheck(params->Get("next_check"), false, origin);

	bool isOriginClientOld = params->Get("origin_client_old").ToBool();
	if (isOriginClientOld || endpoint->GetIcingaVersion() < 21600) { // TODO: Exact required version??
```
You don't have to worry about exact versions if you introduce a new capability.
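For illustration, the capability-based check could look roughly like this (the enum value and its bit are hypothetical; the pattern follows the existing `ApiCapabilities` handling in the API layer):

```cpp
// Hypothetical new capability bit, announced by peers that understand the
// new reschedule/next-check handling (value chosen arbitrarily here).
enum class ApiCapabilities : uint64_t
{
	// ... existing capabilities ...
	NotifiesNextCheckChanges = 1u << 5,
};

// Sketch: gate on the announced capability instead of a version number.
bool peerUnderstandsReschedule =
	endpoint->GetCapabilities() & (uint64_t)ApiCapabilities::NotifiesNextCheckChanges;

if (isOriginClientOld || !peerUnderstandsReschedule) {
	// fall back to the old behaviour for peers without the new capability
}
```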
```cpp
	 * queues and ensures that checks are not fired multiple times. ProcessCheckResult()
	 * is called too late. See #6421.
	 */
	UpdateNextCheck();
```
Why is this no longer necessary, especially in the local check case? Sure, now you return early if !!m_CheckRunning. But where is this checkable re-indexed inside the scheduler queue?
So, after the long discussions in #10082 that didn't lead to a final conclusion, here's a minimal implementation without too many refactorings and changes to existing logic. This PR changes the checker queueing behaviour a bit to never reschedule checkables with ongoing checks. Thus, we avoid piling up checkables with running checks in the queue that will never be executed anyway. Though, since this change affects a critical part of Icinga 2, we should be very careful not to introduce any regressions or performance degradations. Thus far, I've let it run some checks with 500ms intervals and I didn't notice any issues (see the attached graph), though the graph shows a small hiccup at "10:35" caused by some API requests triggered by me.

Another behaviour change is that previously the checker didn't trigger the now-dropped `Checkable::OnNextCheckUpdated` signal when active checks were disabled. However, since the Icinga DB `SendNextUpdate` function already has logic to send a `ZREM` Redis command when active checks are disabled, the checker now also triggers the `Checkable::OnNextCheckChanged` signal unconditionally.

Now, to the individual changes: each commit explains what it does and why it's needed, and contains some more details about the reasoning behind it. So please have a look at the individual commits.
Tests
Just like before, a simple `/v1/actions/reschedule-check` request triggers two `icinga:nextupdate:host` events... while a `/v1/actions/process-check-result` only triggers a single `icinga:nextupdate:` event.

1758547833.559175 [12 172.18.0.1:57800] "HSET" "icinga:host:state" "4c80e4ad47034480b3c3bb28885970b6bb50983e" "{\"affects_children\":false,\"check_attempt\":1,\"check_source\":\"mbp-yhabteab.fritz.box\",\"check_timeout\":60000,\"environment_id\":\"30e34a7a6c432baf29e051ed428aee3825f25e67\",\"execution_time\":0,\"hard_state\":0,\"host_id\":\"4c80e4ad47034480b3c3bb28885970b6bb50983e\",\"id\":\"4c80e4ad47034480b3c3bb28885970b6bb50983e\",\"in_downtime\":false,\"is_acknowledged\":false,\"is_active\":true,\"is_flapping\":false,\"is_handled\":false,\"is_problem\":true,\"is_reachable\":true,\"is_sticky_acknowledgement\":false,\"last_state_change\":1758547833522,\"last_update\":1758547833522,\"latency\":0,\"next_check\":1758548133522,\"next_update\":1758548193522,\"normalized_performance_data\":\"wrong=85;0;0;0 ok=1;0;0;0 max=215.600000;0;0;0\",\"output\":\" /dev/dm-13: 1.0MiB/s read1, 4.3KiB/s write1, 1.0MiB/s total, 215.6MiB/s max\",\"performance_data\":\"wrong=85;0;0;0; ok=1;0;0;0; max=215.6;0;0;0;\",\"previous_hard_state\":0,\"previous_soft_state\":0,\"scheduling_source\":\"mbp-yhabteab.fritz.box\",\"severity\":2080,\"soft_state\":1,\"state_type\":\"soft\"}"
1758547833.559289 [12 172.18.0.1:57800] "HSET" "icinga:checksum:host:state" "4c80e4ad47034480b3c3bb28885970b6bb50983e" "{\"checksum\":\"0ed63c1311be890a7ad5ffae340a3489db7d57ca\"}"
1758547833.559305 [12 172.18.0.1:57800] "ZADD" "icinga:nextupdate:host" "1758548193.522986" "4c80e4ad47034480b3c3bb28885970b6bb50983e"
1758547833.559430 [12 172.18.0.1:57800] "XADD" "icinga:runtime:state" "MAXLEN" "~" "1000000" "*" "runtime_type" "upsert" "redis_key" "icinga:host:state" "checksum" "0ed63c1311be890a7ad5ffae340a3489db7d57ca" "affects_children" "0" "check_attempt" "1" "check_source" "mbp-yhabteab.fritz.box" "check_timeout" "60000" "environment_id" "30e34a7a6c432baf29e051ed428aee3825f25e67" "execution_time" "0" "hard_state" "0" "host_id" "4c80e4ad47034480b3c3bb28885970b6bb50983e" "id" "4c80e4ad47034480b3c3bb28885970b6bb50983e" "in_downtime" "0" "is_acknowledged" "0" "is_active" "1" "is_flapping" "0" "is_handled" "0" "is_problem" "1" "is_reachable" "1" "is_sticky_acknowledgement" "0" "last_state_change" "1758547833522" "last_update" "1758547833522" "latency" "0" "next_check" "1758548133522" "next_update" "1758548193522" "normalized_performance_data" "wrong=85;0;0;0 ok=1;0;0;0 max=215.600000;0;0;0" "output" " /dev/dm-13: 1.0MiB/s read1, 4.3KiB/s write1, 1.0MiB/s total, 215.6MiB/s max" "performance_data" "wrong=85;0;0;0; ok=1;0;0;0; max=215.6;0;0;0;" "previous_hard_state" "0" "previous_soft_state" "0" "scheduling_source" "mbp-yhabteab.fritz.box" "severity" "2080" "soft_state" "1" "state_type" "soft"

fixes #10082