fix(connectivity): allow agentgateway data-plane egress to klaus-gateway:8080 #193

Merged: teemow merged 1 commit into main from fix/dataplane-to-klausgateway-egress on Jun 25, 2026

teemow commented Jun 25, 2026

Summary

End-to-end gazelle validation of the Slack OBO gateway surfaced a real routing bug: the bot never replied to DMs.

Root cause (from the agentgateway data-plane log):

route=agentic-platform/klausgateway endpoint=klaus-gateway...svc:8080
http.path=/channels/slack http.status=503
error="upstream call failed: Connect: deadline has elapsed" reason=UpstreamFailure

The channel paths (/v1, /web, /cli/v1, /channels/slack) are served on the agentgateway data-plane Gateway and forwarded to the klaus-gateway Service on :8080. The data-plane runs under Cilium default-deny egress (the -dataplane policy), whose cluster allowance covers only ports 80 and 443, and there was no per-backend egress allowance for klaus-gateway (unlike -dataplane-to-kagent). The forward was therefore dropped, and every channel request failed with a 503, including inbound Slack events delivered to the public hostname and routed through the data-plane. That is why the Slack bot never replied. The OBO route was unaffected because it goes envoy → klaus-gateway Service directly (a different, allowed hop).

Change

  • Add a *-dataplane-to-klausgateway egress policy (cilium + kubernetes flavors, rendered only when klausGateway.agentgatewayRoute.enabled is set) that lets the data-plane reach klaus-gateway:8080. It mirrors the existing -dataplane-to-kagent allowance.
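
For readers unfamiliar with the two policy flavors, the allowance described above could look roughly like the sketch below. This is an illustration, not the chart's rendered output: the metadata names and both `app.kubernetes.io/name` label values are hypothetical placeholders, the selectors assume the data-plane and klaus-gateway pods share a namespace, and the port assumes the klaus-gateway pods listen on 8080 (policies match the pod port, which may differ from the Service port).

```yaml
# Cilium flavor (sketch). Name and labels are assumptions, not taken from the chart.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: example-dataplane-to-klausgateway   # hypothetical name
spec:
  # Selects the agentgateway data-plane pods (assumed label).
  endpointSelector:
    matchLabels:
      app.kubernetes.io/name: agentgateway-dataplane
  egress:
    # Allow TCP 8080 to the klaus-gateway pods only (assumed label).
    - toEndpoints:
        - matchLabels:
            app.kubernetes.io/name: klaus-gateway
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
---
# Kubernetes flavor (sketch) expressing the same allowance.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-dataplane-to-klausgateway   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: agentgateway-dataplane   # assumed label
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: klaus-gateway    # assumed label
      ports:
        - protocol: TCP
          port: 8080
```

Because both policy types are additive allow-lists, a narrow per-backend rule like this opens only the data-plane → klaus-gateway:8080 hop and leaves the default-deny posture for every other destination unchanged.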

Test plan

  • CI green (test-ingress-modes, build-connectivity-chart, execute-chart-tests).
  • gazelle: after release + reconcile, POST /channels/slack/events no longer 503s; a Slack DM to the swarmgeist bot gets a reply.

Made with Cursor

…way:8080

The channel paths (/v1, /web, /cli/v1, /channels/slack) are served on the
agentgateway data-plane Gateway and forwarded to the klaus-gateway Service on
:8080. The data-plane runs in Cilium default-deny egress (the -dataplane policy),
whose cluster allowance only covers 80/443, and there was no per-backend egress
allowance for klaus-gateway (unlike -dataplane-to-kagent). So the forward was
dropped and every channel request -- including inbound Slack events delivered to
the public hostname and routed through the data-plane -- failed with
503 UpstreamFailure ("Connect: deadline has elapsed"), so the Slack bot never
replied.

Add a -dataplane-to-klausgateway egress policy (cilium + kubernetes flavors,
rendered when klausGateway.agentgatewayRoute.enabled) mirroring the existing
-dataplane-to-kagent allowance.

Co-authored-by: Cursor <cursoragent@cursor.com>
@teemow teemow requested a review from a team as a code owner June 25, 2026 18:13
@teemow teemow merged commit 4147eda into main Jun 25, 2026
8 checks passed
@teemow teemow deleted the fix/dataplane-to-klausgateway-egress branch June 25, 2026 18:16