chore(release): v100.0.6a - continuous tick ~1s, docs updates

Eidmantas Ivanauskas · Eidmantas Ivanauskas · commit b5883c46b186 · 2025-09-19T09:42:44.000+03:00
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -6,6 +6,18 @@ handling on your side.
 
 ## Unreleased
 
+## 2025-09-19 - Release v100.0.6a
+
+### Changes
+
+- scheduler(continuous): lengthen the tick interval from ~100ms to ~1s to reduce log noise and API/list pressure; no change to per-app `--interval` gating
+- docs(readme): remove Mermaid diagram; add ASCII architecture; add rate limiting/backpressure section; add phase comparison table (stock vs tuned)
+
+### Notes
+
+- Behavior impact: only the scheduler’s discovery cadence changes; application dispatch still respects `--interval`, in-flight guards, fairness (LRU/fail-first, cooldown, per-repo-cap), and concurrency caps.
+- Recommended: if startup delay is undesirable, run with `--warmup-cache=false`.
+
 ### Upgrade notes (no really, you MUST read this)
 
 * **Attention**: By default, `argocd-image-updater` now uses the K8s API to retrieve applications, instead of the Argo CD API. Also, it is now recommended to install in the same namespace as Argo CD is running in (`argocd` by default). For existing installations, which are running in a dedicated namespace.
diff --git a/README.md b/README.md
@@ -194,81 +194,119 @@ Notes
 - Combined effects (scheduler + limits)
   - Higher `--max-concurrency` with `--per-repo-cap` and `--cooldown` improves fleet throughput and fairness while avoiding monorepo starvation.
 
-## Enhanced runtime architecture (fork-specific)
-
-This diagram shows how our improvements compose when running with the recommended flags (e.g., `--mode=continuous`, `--max-concurrency=0`, `--schedule=lru|fail-first`, `--cooldown`, `--per-repo-cap`, with `REGISTRY_JWT_*` retries enabled).
-
-```mermaid
-flowchart LR
-  subgraph Scheduler[Scheduler]
-    S1[--mode=continuous] --> S2[Per-app timers (interval)]
-    S1 --> S3[Auto concurrency (--max-concurrency=0)]
-    S1 --> S4[Priorities (--schedule=lru|fail-first)]
-    S1 --> S5[Cooldown (--cooldown)]
-    S1 --> S6[Fairness cap (--per-repo-cap)]
-  end
-
-  S2 --> WQ[Worker Pool]
-  S3 --> WQ
-  S4 --> WQ
-  S5 --> WQ
-  S6 --> WQ
-
-  subgraph Worker[Worker (per app)]
-    W1[Compute image candidates] --> W2[Registry operations]
-    W1 --> W3[Spec patch in-memory]
-    W3 -->|WriteBackGit| W4[enqueue writeIntent(repo)]
-    W3 -->|WriteBack Argo CD API| W5[Update via ArgoCD API]
-  end
-
-  WQ --> W1
-
-  %% Registry client path
-  subgraph RegistryClient[Registry client (per registry endpoint)]
-    RC1[Dedicated http.Transport cache\nkeep-alive + sane timeouts\nMaxConnsPerHost cap] --> RC2[rateLimit transport\nper-registry in-flight cap]
-    RC2 --> RC3[singleflight(tags, manifests)]
-    RC3 --> RC4[JWT auth interceptor\n singleflight + retries\n REGISTRY_JWT_*]
-    RC4 --> RC5[HTTPS (HTTP/2 when available)]
-  end
-
-  W2 --> RegistryClient
-
-  %% Batched Git writer path
-  subgraph GitWriter[Per-repo batched writer]
-    GW1[intent queue (repo)] --> GW2[group by branch]
-    GW2 --> GW3[commitBatch]
-    GW3 --> GW4[Git client (fetch/checkout/commit/push)\nretries + backoff]
-  end
-
-  W4 --> GitWriter
-
-  %% Metrics and logging
-  subgraph Metrics[Prometheus metrics]
-    M1[App timings\n(last attempt/success, duration)]
-    M2[Cycle timings]
-    M3[Registry health\n(in-flight, duration, status, retries, errors)]
-    M4[JWT auth\n(requests, errors, duration, TTL)]
-    M5[Singleflight effectiveness]
-  end
-
-  Worker --> M1
-  Scheduler --> M2
-  RegistryClient --> M3
-  RegistryClient --> M4
-  RegistryClient --> M5
-  GitWriter --> M1
+## Rate limiting and backpressure
+
+This fork adds layered controls to protect upstreams and the process under load:
+
+- Global worker pool
+  - Controlled by `--max-concurrency` (or auto with `0`). Limits total concurrent app updates.
+
+- Per-registry request rate (token bucket)
+  - Configured via `registries.conf` `limit` per registry (requests/second).
+  - Requests beyond the budget are delayed locally to smooth spikes; reduces 429/timeouts.
+
+- Per-registry in-flight cap
+  - Socket-level caps via HTTP transport (`MaxConnsPerHost`) plus internal semaphores where applicable.
+  - Prevents connection storms and ephemeral port exhaustion.
+
+- Singleflight de-duplication
+  - Tags/manifests and JWT auth are de-duplicated. One leader performs the call, followers wait for the result.
+  - Cuts redundant upstream traffic during bursts.
+
+- Jittered exponential backoff retries
+  - Applied to tags/manifests and JWT auth. Short, bounded retries with jitter to avoid synchronization.
+
+- Git backpressure (batched writer)
+  - Per-repo queue serializes commit/push; multiple app intents per branch coalesce into one commit.
+  - Retries with backoff for transient fetch/push errors.
+
+- Fair scheduling
+  - `--per-repo-cap` limits apps from one repo per cycle; `--cooldown` deprioritizes recently updated apps.
+
+Observability:
+- Metrics expose queue lengths, in-flight counts, retry counts, singleflight leader/follower, and durations to tune the above without guesswork.
+
+## ASCII architecture (fork-specific)
+
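+The runtime when running with the recommended flags (e.g., `--mode=continuous`, `--max-concurrency=0`, `--schedule=lru|fail-first`, `--cooldown`, `--per-repo-cap`), depicted in ASCII so it renders without Mermaid support.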
+
+```
+                               +-----------------------------------------+
+                               |               Scheduler                 |
+                               |-----------------------------------------|
+  flags: --mode=continuous     |  per-app timers (interval)              |
+         --max-concurrency=0   |  auto concurrency sizing                |
+         --schedule=lru|fail   |  LRU / Fail-first prioritization        |
+         --cooldown=30s        |  cooldown to dampen hot apps            |
+         --per-repo-cap=20     |  fairness cap per Git repo per pass     |
+                               +--------------------+--------------------+
+                                                     |
+                                                     v
+                                      +--------------+--------------+
+                                      |           Worker Pool       |
+                                      +--------------+--------------+
+                                                     |
+                                                     v
+                                         +-----------+-----------+
+                                         |   Worker (per app)   |
+                                         |----------------------|
+                                         | 1) Compute images    |
+                                         | 2) Registry ops      |
+                                         | 3) Patch spec in mem |
+                                         | 4a) WriteBack=Git -> |----+
+                                         |     enqueue intent   |    |
+                                         | 4b) WriteBack=ArgoCD |    |
+                                         |     Update via API   |    |
+                                         +----------------------+    |
+                                                                       |
+                                                                       v
+                             +-----------------------------------+     |
+                             |    Registry Client (per endpoint) |     |
+                             |-----------------------------------|     |
+                             | Transport cache (keep-alive)      |     |
+                             | Sane timeouts, MaxConnsPerHost    |     |
+                             | Per-reg in-flight cap (queue)     |     |
+                             | Singleflight: tags/manifests      |     |
+                             | JWT auth: singleflight + retries  |     |
+                             | HTTP/2 over TLS when available    |     |
+                             +------------------+----------------+     |
+                                                |                      |
+                                                v                      |
+                                   +------------+-----------+          |
+                                   |  Remote registry/API   |          |
+                                   +------------------------+          |
+                                                                       |
+                                                                       v
+                 +-------------------------------------------------------------+
+                 |           Per-repo Batched Git Writer                      |
+                 |-------------------------------------------------------------|
+                 | intent queue (repo)  ->  group by branch  ->  commitBatch  |
+                 | fetch/checkout/commit/push (retries/backoff)                |
+                 +----------------------------+--------------------------------+
+                                              |
+                                              v
+                                       +------+------+
+                                       |    Remote   |
+                                       |     Git     |
+                                       +-------------+
+```
+
+Observability:
+- Metrics: app timings (last attempt/success, durations), cycle duration, registry in-flight/duration/status/retries/errors,
+  JWT auth (requests/errors/duration/TTL), singleflight leader/follower counts.
+- Logs: startup settings; per-app "continuous: start/finished"; queued write-backs; Git and registry error details.
-```
 
-Key behaviors added vs upstream:
-
-- Continuous scheduling: per-app timers remove whole-cycle stalls; auto concurrency scales workers.
-- Fairness: `--per-repo-cap` and `--cooldown` prevent hot monorepos from starving others.
-- Registry resiliency and efficiency:
-  - Reused transports with tuned timeouts limit dials and hangs.
-  - Per-registry in-flight cap stops connection storms/port exhaustion.
-  - Singleflight for tags/manifests and JWT auth dedupes bursts.
-  - JWT auth retries with backoff (config via `REGISTRY_JWT_*`).
-  - HTTP/2 when using HTTPS reduces socket pressure under load.
-- Git throughput: per-repo batched writer coalesces multiple app changes per branch into 1 commit+push; Git ops have retries.
-- Clear logs: startup settings, continuous start/finish, queued write-backs.
-- Expanded metrics: deep visibility for scheduling, registry/JWT, and Git paths.
+### Phase comparison: stock vs our tuned configuration
+
+| Phase | Stock defaults (cycle mode, basic concurrency) | Tuned configuration (continuous, auto concurrency, LRU, cooldown, per-repo-cap, singleflight, retries) |
+| --- | --- | --- |
+| Startup | Minimal logging; default transports; limited tuning | Logs full settings; shared transports with timeouts; metrics/health; optional warmup |
+| Scheduling | Global pass every `--interval`; fixed concurrency | Lightweight pass ~1s; per-app due check against `--interval`; auto concurrency sizing |
+| Discovery/filter | List apps every pass; warn on unsupported each pass | Same listing; will throttle/dedupe repeated unsupported warnings; same filters |
+| Prioritization | Default order | LRU or Fail-first; cooldown deprioritizes recent successes; per-repo-cap fairness |
+| Dispatch | Semaphore up to `--max-concurrency` | Same guard; plus per-app in-flight guard to avoid double dispatch in continuous |
+| Registry IO | Direct calls; limited retry semantics | Per-reg RPS limiter and in-flight cap; singleflight for tags/manifests and JWT; jittered backoff retries; shared transports; HTTP/2 |
+| Update decision | Compare live vs candidate; may skip | Same logic, but less flap due to fairness/cooldown |
+| Write-back | Immediate Git per app (can thrash in monorepos) | Per-repo batched writer; group by branch; one commit/push per batch; retries |
+| Non-Git write-back | ArgoCD `UpdateSpec` | Same, with conflict-retry backoff |
+| Observability | Basic metrics/logs | Expanded metrics (JWT, singleflight, durations); per-app continuous start/finish logs; queue and retry metrics |
diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-99.9.9
+100.0.6a
diff --git a/cmd/run.go b/cmd/run.go
@@ -337,11 +337,11 @@ func newRunCommand() *cobra.Command {
 					log.Errorf("Webhook server exited with error: %v", err)
 					return nil
 				default:
-					if cfg.Mode == "continuous" {
-						runContinuousOnce(cfg)
-						// continuous scheduler loops internally; we just sleep briefly
-						time.Sleep(100 * time.Millisecond)
-					} else {
+					if cfg.Mode == "continuous" {
+						runContinuousOnce(cfg)
+						// continuous scheduler loops internally; tick at ~1s
+						time.Sleep(1 * time.Second)
+					} else {
 						if lastRun.IsZero() || time.Since(lastRun) > cfg.CheckInterval {
                             result, err := runImageUpdater(cfg, false)
 							if err != nil {
@@ -361,7 +361,7 @@ func newRunCommand() *cobra.Command {
 				if cfg.CheckInterval == 0 {
 					break
 				}
-				time.Sleep(100 * time.Millisecond)
+				time.Sleep(1 * time.Second)
 			}
 			log.Infof("Finished.")
 			return nil
diff --git a/pkg/version/version.go b/pkg/version/version.go
@@ -6,7 +6,7 @@ import (
 )
 
 var (
-	version    = "9.9.99"
+	version    = "100.0.6a"
 	buildDate  = "1970-01-01T00:00:00Z"
 	gitCommit  = "unknown"
 	binaryName = "argocd-image-updater"
