fix(flows): flow-16 unsatisfiable Ready gate; flow-04 first-inference window on local models #621
Closed
bussyjd wants to merge 1 commit into
…ference probe
Both release-smoke failures on the rc14 wopus run reduce to these two
flow bugs — no stack defect (reproduced live, full chain diagnosed):
- flow-16 §2.2 polled Ready=True for an offer created WITH registration
enabled and no `obol sell register` submitted, which the controller
keeps Ready=False / AwaitingExternalRegistration by design ('offer
already serves paid traffic') — the gate could never pass as written,
and only ever matched historically because 'Ready=True' substring-
matched 'PaymentGateReady=True' when the ladder converged in time.
Gate now polls the serving condition set (UpstreamHealthy +
PaymentGateReady + RoutePublished, anchored greps) over 300s, which is
exactly what §3's 402 probe exercises.
- flow-04 step 12 used `curl -sf --max-time 120`: too tight for the
FIRST inference ever routed through the Hermes agent pipeline on a
local Ollama model (the multi-thousand-token system prompt pays full
prompt processing before the KV cache warms; ~150s observed for a 27B
on an M-series host), and -f swallowed every diagnostic so the fail
message was empty. Now 300s, no -f, and the fail message carries the
HTTP status + body snippet.
Verified against a live cluster in the failing state: the new flow-16
gate passes where the old one cannot; the flow-04 call returns 200 with
correct content once warm.
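
The flow-16 change above can be illustrated with a minimal sketch. It assumes the offer's conditions are available as one `Type=Status` pair per line; the sample text and the helper names (`loose_ready`, `has_condition`, `serving_ready`, `poll_serving`) are hypothetical and are not taken from the flow scripts. It shows why an unanchored `Ready=True` grep reports a false positive, and how an anchored check on the serving set avoids it.

```bash
#!/usr/bin/env bash
# Hypothetical sample: one Type=Status pair per line, using the values
# reported for the failing state in this PR.
sample_status='UpstreamHealthy=True
PaymentGateReady=True
RoutePublished=True
Registered=False
Ready=False'

# Old-style check: unanchored, so "Ready=True" also matches inside
# "PaymentGateReady=True". Reads the status text from stdin.
loose_ready() {
  grep -q 'Ready=True'
}

# Anchored check for one condition: -x requires the whole line to match.
# $1 = condition type, stdin = status text.
has_condition() {
  grep -qx "$1=True"
}

# Serving set named in the PR: all three conditions must be True.
# $1 = status text.
serving_ready() {
  local status="$1" cond
  for cond in UpstreamHealthy PaymentGateReady RoutePublished; do
    printf '%s\n' "$status" | has_condition "$cond" || return 1
  done
}

# Poll a caller-supplied command until serving_ready passes or the
# deadline expires. $1 = timeout in seconds, $2 = interval in seconds,
# remaining arguments = command that prints the status text.
poll_serving() {
  local timeout="$1" interval="$2"; shift 2
  local deadline=$(( SECONDS + timeout ))
  while :; do
    serving_ready "$("$@" 2>/dev/null)" && return 0
    (( SECONDS >= deadline )) && return 1
    sleep "$interval"
  done
}
```

In a flow, the command passed to `poll_serving` would be whatever prints the offer's conditions (the PR uses `obol kubectl` with a jsonpath query; the exact resource and query are not reproduced here), with a 300-second timeout.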
b62dfe8 to
8f5c6bb
Summary
Both release-gating FAILs from the rc14 release smoke on a local Ollama model (`qwopus3.6-27b-v2-mtp:q5_k_m`, 27B) reduce to flow bugs; the stack itself is clean. Diagnosed live against a reproducing cluster; full chain below.

flow-16 §2.2: the Ready gate could never pass
The flow creates its offer with registration enabled and never runs `obol sell register`, so the controller keeps `Ready=False / Registered=AwaitingExternalRegistration` by design ("offer already serves paid traffic"). A `Ready=True` poll is unsatisfiable as written; flow-11 only polls Ready after actually registering.

It historically "passed" by accident: the gate grepped `Ready=True` against `sell status` output, which substring-matches `PaymentGateReady=True` whenever the condition ladder converged inside the window.

Fix: poll the serving condition set (`UpstreamHealthy` + `PaymentGateReady` + `RoutePublished`, anchored greps via `obol kubectl` jsonpath) over 300s. That set is exactly what §3's 402 probe then exercises.

flow-04 step 12: window too tight, diagnostics swallowed
`curl -sf --max-time 120` failed reproducibly (twice, warm model) on the first inference ever routed through the Hermes agent pipeline: Hermes prepends a multi-thousand-token system prompt, and a local Ollama model pays full prompt processing before the KV cache warms (~150s observed; Hermes' internal client retries re-pay it until one attempt survives). The very next call answers in ~20s, which is why step 13 passed seconds later in every run. GPU-class endpoints converge in seconds and never see this.

`-f` also swallowed the response entirely, so the fail line was the empty `Agent inference failed —` (which initially sent the investigation toward cold model loads and auth). Fix: 300s window, no `-f`, and the fail message carries HTTP status + body snippet.

Verification
The new flow-16 gate passes against a live cluster in the failing state (`UpstreamHealthy=True PaymentGateReady=True RoutePublished=True Registered=False Ready=False`); the exact flow-04 request returns 200 with correct content once the prompt cache is warm.

`bash -n` is clean on both flows; with these fixes the rc14 wopus smoke is 12 PASS / 2 SKIP (the SKIPs are waived registration-receipt sub-checks; the registrations themselves succeeded on-chain).
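
The flow-04 fix (longer window, no `-f`, status and body in the fail message) could look roughly like the sketch below. The function name `probe_inference`, its arguments, the 200-byte snippet length, and the exact wording of the fail line are assumptions for illustration; the actual flow script is not reproduced here.

```bash
#!/usr/bin/env bash
# probe_inference URL JSON_BODY [MAX_SECONDS]
# Hypothetical helper: POSTs JSON_BODY to URL and prints the response body
# on success; on failure prints the HTTP status and a body snippet to stderr.
probe_inference() {
  local url="$1" payload="$2" max_time="${3:-300}"
  local body_file status
  body_file="$(mktemp)"
  # -s silences progress output. Without -f, curl still writes 4xx/5xx
  # response bodies to the -o file instead of discarding them.
  # -w '%{http_code}' prints only the status code on stdout.
  status="$(curl -s --max-time "$max_time" -o "$body_file" -w '%{http_code}' \
    -H 'Content-Type: application/json' -d "$payload" "$url")" \
    || status="curl-exit-$?"   # transport failure: record curl's exit code
  if [ "$status" = "200" ]; then
    cat "$body_file"
    rm -f "$body_file"
    return 0
  fi
  # Assumed message format: status plus the first 200 bytes of the body.
  echo "Agent inference failed: HTTP $status, body: $(head -c 200 "$body_file")" >&2
  rm -f "$body_file"
  return 1
}
```

Writing the body to a temporary file keeps it separate from the status code, so a timeout, a 5xx from the agent, and an auth error each produce a distinguishable fail line instead of an empty one.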