Commit b5883c4

Author: Eidmantas Ivanauskas

chore(release): v100.0.6a - continuous tick ~1s, docs updates

1 parent 7c13d33 commit b5883c4

5 files changed, +134 -84 lines changed

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -6,6 +6,18 @@ handling on your side.
 
 ## Unreleased
 
+## 2025-09-19 - Release v100.0.6a
+
+### Changes
+
+- scheduler(continuous): increase tick cadence from ~100ms to ~1s to reduce log noise and API/list pressure; no change to per-app `--interval` gating
+- docs(readme): remove Mermaid diagram; add ASCII architecture; add rate limiting/backpressure section; add phase comparison table (stock vs tuned)
+
+### Notes
+
+- Behavior impact: only the scheduler’s discovery cadence changes; application dispatch still respects `--interval`, in-flight guards, fairness (LRU/fail-first, cooldown, per-repo-cap), and concurrency caps.
+- Recommended: if startup delay is undesirable, run with `--warmup-cache=false`.
+
 ### Upgrade notes (no really, you MUST read this)
 
 * **Attention**: By default, `argocd-image-updater` now uses the K8s API to retrieve applications, instead of the Argo CD API. Also, it is now recommended to install in the same namespace as Argo CD is running in (`argocd` by default). For existing installations, which are running in a dedicated namespace.

README.md

Lines changed: 114 additions & 76 deletions
@@ -194,81 +194,119 @@ Notes
 - Combined effects (scheduler + limits)
   - Higher `--max-concurrency` with `--per-repo-cap` and `--cooldown` improves fleet throughput and fairness while avoiding monorepo starvation.
 
-## Enhanced runtime architecture (fork-specific)
-
-This diagram shows how our improvements compose when running with the recommended flags (e.g., `--mode=continuous`, `--max-concurrency=0`, `--schedule=lru|fail-first`, `--cooldown`, `--per-repo-cap`, with `REGISTRY_JWT_*` retries enabled).
-
-```mermaid
-flowchart LR
-  subgraph Scheduler[Scheduler]
-    S1[--mode=continuous] --> S2[Per-app timers (interval)]
-    S1 --> S3[Auto concurrency (--max-concurrency=0)]
-    S1 --> S4[Priorities (--schedule=lru|fail-first)]
-    S1 --> S5[Cooldown (--cooldown)]
-    S1 --> S6[Fairness cap (--per-repo-cap)]
-  end
-
-  S2 --> WQ[Worker Pool]
-  S3 --> WQ
-  S4 --> WQ
-  S5 --> WQ
-  S6 --> WQ
-
-  subgraph Worker[Worker (per app)]
-    W1[Compute image candidates] --> W2[Registry operations]
-    W1 --> W3[Spec patch in-memory]
-    W3 -->|WriteBackGit| W4[enqueue writeIntent(repo)]
-    W3 -->|WriteBack Argo CD API| W5[Update via ArgoCD API]
-  end
-
-  WQ --> W1
-
-  %% Registry client path
-  subgraph RegistryClient[Registry client (per registry endpoint)]
-    RC1[Dedicated http.Transport cache\nkeep-alive + sane timeouts\nMaxConnsPerHost cap] --> RC2[rateLimit transport\nper-registry in-flight cap]
-    RC2 --> RC3[singleflight(tags, manifests)]
-    RC3 --> RC4[JWT auth interceptor\n singleflight + retries\n REGISTRY_JWT_*]
-    RC4 --> RC5[HTTPS (HTTP/2 when available)]
-  end
-
-  W2 --> RegistryClient
-
-  %% Batched Git writer path
-  subgraph GitWriter[Per-repo batched writer]
-    GW1[intent queue (repo)] --> GW2[group by branch]
-    GW2 --> GW3[commitBatch]
-    GW3 --> GW4[Git client (fetch/checkout/commit/push)\nretries + backoff]
-  end
-
-  W4 --> GitWriter
-
-  %% Metrics and logging
-  subgraph Metrics[Prometheus metrics]
-    M1[App timings\n(last attempt/success, duration)]
-    M2[Cycle timings]
-    M3[Registry health\n(in-flight, duration, status, retries, errors)]
-    M4[JWT auth\n(requests, errors, duration, TTL)]
-    M5[Singleflight effectiveness]
-  end
-
-  Worker --> M1
-  Scheduler --> M2
-  RegistryClient --> M3
-  RegistryClient --> M4
-  RegistryClient --> M5
-  GitWriter --> M1
+## Rate limiting and backpressure
+
+This fork adds layered controls to protect upstreams and the process under load:
+
+- Global worker pool
+  - Controlled by `--max-concurrency` (or auto with `0`). Limits total concurrent app updates.
+
+- Per-registry request rate (token bucket)
+  - Configured via `registries.conf` `limit` per registry (requests/second).
+  - Requests beyond the budget are delayed locally to smooth spikes; reduces 429/timeouts.
+
+- Per-registry in-flight cap
+  - Socket-level caps via HTTP transport (`MaxConnsPerHost`) plus internal semaphores where applicable.
+  - Prevents connection storms and ephemeral port exhaustion.
+
+- Singleflight de-duplication
+  - Tags/manifests and JWT auth are de-duplicated. One leader performs the call, followers wait for the result.
+  - Cuts redundant upstream traffic during bursts.
+
+- Jittered exponential backoff retries
+  - Applied to tags/manifests and JWT auth. Short, bounded retries with jitter to avoid synchronization.
+
+- Git backpressure (batched writer)
+  - Per-repo queue serializes commit/push; multiple app intents per branch coalesce into one commit.
+  - Retries with backoff for transient fetch/push errors.
+
+- Fair scheduling
+  - `--per-repo-cap` limits apps from one repo per cycle; `--cooldown` deprioritizes recently updated apps.
+
Observability:
227+
- Metrics expose queue lengths, in-flight counts, retry counts, singleflight leader/follower, and durations to tune the above without guesswork.
228+
+## ASCII architecture (fork-specific)
+
+The same runtime, depicted in ASCII for environments without Mermaid rendering.
+
233+
```
234+
+-----------------------------------------+
235+
| Scheduler |
236+
|-----------------------------------------|
237+
flags: --mode=continuous | per-app timers (interval) |
238+
--max-concurrency=0 | auto concurrency sizing |
239+
--schedule=lru|fail | LRU / Fail-first prioritization |
240+
--cooldown=30s | cooldown to dampen hot apps |
241+
--per-repo-cap=20 | fairness cap per Git repo per pass |
242+
+--------------------+--------------------+
243+
|
244+
v
245+
+--------------+--------------+
246+
| Worker Pool |
247+
+--------------+--------------+
248+
|
249+
v
250+
+-----------+-----------+
251+
| Worker (per app) |
252+
|----------------------|
253+
| 1) Compute images |
254+
| 2) Registry ops |
255+
| 3) Patch spec in mem |
256+
| 4a) WriteBack=Git -> |----+
257+
| enqueue intent | |
258+
| 4b) WriteBack=ArgoCD | |
259+
| Update via API | |
260+
+----------------------+ |
261+
|
262+
v
263+
+-----------------------------------+ |
264+
| Registry Client (per endpoint) | |
265+
|-----------------------------------| |
266+
| Transport cache (keep-alive) | |
267+
| Sane timeouts, MaxConnsPerHost | |
268+
| Per-reg in-flight cap (queue) | |
269+
| Singleflight: tags/manifests | |
270+
| JWT auth: singleflight + retries | |
271+
| HTTP/2 over TLS when available | |
272+
+------------------+----------------+ |
273+
| |
274+
v |
275+
+------------+-----------+ |
276+
| Remote registry/API | |
277+
+------------------------+ |
278+
|
279+
v
280+
+-------------------------------------------------------------+
281+
| Per-repo Batched Git Writer |
282+
|-------------------------------------------------------------|
283+
| intent queue (repo) -> group by branch -> commitBatch |
284+
| fetch/checkout/commit/push (retries/backoff) |
285+
+----------------------------+--------------------------------+
286+
|
287+
v
288+
+------+------+
289+
| Remote |
290+
| Git |
291+
+-------------+
292+
293+
Observability:
294+
- Metrics: app timings (last attempt/success, durations), cycle duration, registry in-flight/duration/status/retries/errors,
295+
JWT auth (requests/errors/duration/TTL), singleflight leader/follower counts.
296+
- Logs: startup settings; per-app "continuous: start/finished"; queued write-backs; Git and registry error details.
260297
```
261298

-Key behaviors added vs upstream:
-
-- Continuous scheduling: per-app timers remove whole-cycle stalls; auto concurrency scales workers.
-- Fairness: `--per-repo-cap` and `--cooldown` prevent hot monorepos from starving others.
-- Registry resiliency and efficiency:
-  - Reused transports with tuned timeouts limit dials and hangs.
-  - Per-registry in-flight cap stops connection storms/port exhaustion.
-  - Singleflight for tags/manifests and JWT auth dedupes bursts.
-  - JWT auth retries with backoff (config via `REGISTRY_JWT_*`).
-  - HTTP/2 when using HTTPS reduces socket pressure under load.
-- Git throughput: per-repo batched writer coalesces multiple app changes per branch into 1 commit+push; Git ops have retries.
-- Clear logs: startup settings, continuous start/finish, queued write-backs.
-- Expanded metrics: deep visibility for scheduling, registry/JWT, and Git paths.
+### Phase comparison: stock vs our tuned configuration
+
+| Phase | Stock defaults (cycle mode, basic concurrency) | Tuned configuration (continuous, auto concurrency, LRU, cooldown, per-repo-cap, singleflight, retries) |
+| --- | --- | --- |
+| Startup | Minimal logging; default transports; limited tuning | Logs full settings; shared transports with timeouts; metrics/health; optional warmup |
+| Scheduling | Global pass every `--interval`; fixed concurrency | Lightweight pass ~1s; per-app due check against `--interval`; auto concurrency sizing |
+| Discovery/filter | List apps every pass; warn on unsupported each pass | Same listing; will throttle/dedupe repeated unsupported warnings; same filters |
+| Prioritization | Default order | LRU or Fail-first; cooldown deprioritizes recent successes; per-repo-cap fairness |
+| Dispatch | Semaphore up to `--max-concurrency` | Same guard; plus per-app in-flight guard to avoid double dispatch in continuous |
+| Registry IO | Direct calls; limited retry semantics | Per-reg RPS limiter and in-flight cap; singleflight for tags/manifests and JWT; jittered backoff retries; shared transports; HTTP/2 |
+| Update decision | Compare live vs candidate; may skip | Same logic, but less flap due to fairness/cooldown |
+| Write-back | Immediate Git per app (can thrash in monorepos) | Per-repo batched writer; group by branch; one commit/push per batch; retries |
+| Non-Git write-back | ArgoCD `UpdateSpec` | Same, with conflict-retry backoff |
+| Observability | Basic metrics/logs | Expanded metrics (JWT, singleflight, durations); per-app continuous start/finish logs; queue and retry metrics |
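Editor's sketch for the "Write-back" row above: the idea of grouping queued intents by branch so that each branch gets one commit/push per batch. The types `writeIntent` and `batch` and the function `coalesce` are hypothetical and exist only for this illustration; the fork's writer also fetches, commits, pushes, and retries, which is left out here.

```go
package main

import (
	"fmt"
	"sort"
)

// writeIntent is a hypothetical record of one app wanting a change written
// to a branch of a single Git repository.
type writeIntent struct {
	App    string
	Branch string
}

// batch is the set of apps whose changes would share one commit and push.
type batch struct {
	Branch string
	Apps   []string
}

// coalesce groups the queued intents of one repository by branch. Branches
// are returned in sorted order so the result is deterministic; apps keep
// their queue order within a branch.
func coalesce(intents []writeIntent) []batch {
	byBranch := map[string][]string{}
	for _, in := range intents {
		byBranch[in.Branch] = append(byBranch[in.Branch], in.App)
	}
	branches := make([]string, 0, len(byBranch))
	for b := range byBranch {
		branches = append(branches, b)
	}
	sort.Strings(branches)
	out := make([]batch, 0, len(branches))
	for _, b := range branches {
		out = append(out, batch{Branch: b, Apps: byBranch[b]})
	}
	return out
}

func main() {
	queue := []writeIntent{
		{App: "checkout", Branch: "main"},
		{App: "search", Branch: "staging"},
		{App: "payments", Branch: "main"},
	}
	for _, b := range coalesce(queue) {
		fmt.Printf("one commit on %q for apps %v\n", b.Branch, b.Apps)
	}
}
```

With three intents across two branches, the writer would push twice instead of three times; in a monorepo with many apps on one branch the saving grows accordingly.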

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-99.9.9
+100.0.6a

cmd/run.go

Lines changed: 6 additions & 6 deletions
@@ -337,11 +337,11 @@ func newRunCommand() *cobra.Command {
                 log.Errorf("Webhook server exited with error: %v", err)
                 return nil
             default:
-                if cfg.Mode == "continuous" {
-                    runContinuousOnce(cfg)
-                    // continuous scheduler loops internally; we just sleep briefly
-                    time.Sleep(100 * time.Millisecond)
-                } else {
+                if cfg.Mode == "continuous" {
+                    runContinuousOnce(cfg)
+                    // continuous scheduler loops internally; tick at ~1s
+                    time.Sleep(1 * time.Second)
+                } else {
                     if lastRun.IsZero() || time.Since(lastRun) > cfg.CheckInterval {
                         result, err := runImageUpdater(cfg, false)
                         if err != nil {
@@ -361,7 +361,7 @@ func newRunCommand() *cobra.Command {
                 if cfg.CheckInterval == 0 {
                     break
                 }
-                time.Sleep(100 * time.Millisecond)
+                time.Sleep(1 * time.Second)
             }
             log.Infof("Finished.")
             return nil

pkg/version/version.go

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ import (
 )
 
 var (
-    version = "9.9.99"
+    version = "100.0.6a"
     buildDate = "1970-01-01T00:00:00Z"
     gitCommit = "unknown"
     binaryName = "argocd-image-updater"
