Initial prompt time increase #1528
Signed-off-by: red-hat-konflux <[email protected]>
Creates kustomize overlay for deploying to hcmais01ue1 via app-interface:
- Uses Konflux images from redhat-services-prod/hcm-eng-prod-tenant
- Scales down in-cluster databases (using external RDS from app-interface Phase 2)
- Scales down MinIO (using external S3 from app-interface Phase 2)
- Includes CRDs, RBAC, routes, and all application components
- Patches operator to use Konflux runner image
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Convert kustomize overlay to OpenShift Template format for app-interface SaaS deployment. Split into two templates:
1. template-operator.yaml (CRDs, ClusterRoles, operator deployment)
   - Operator and ambient-runner images
   - Cluster-scoped resources (CRDs, RBAC)
   - Operator deployment and its ConfigMaps
2. template-services.yaml (application services)
   - Backend, frontend, public-api, ambient-api-server images
   - All deployments, services, routes, configmaps
   - Scales in-cluster services to 0 (minio, postgresql, unleash)
Both templates use an IMAGE_TAG parameter (auto-generated from the git commit SHA) and support Konflux image gating through app-interface. This allows app-interface to use provider: openshift-template with proper parameter substitution instead of the directory provider, which doesn't run kustomize build.
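The "scales in-cluster services to 0" step can be illustrated with a small strategic-merge patch; the deployment name comes from the message above, while the patch file layout itself is an assumed sketch:

```yaml
# Illustrative kustomize patch: keep the in-cluster minio Deployment
# defined but never scheduled, since external S3 is used instead.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
spec:
  replicas: 0
```

Listed under `patches:` in the overlay's kustomization.yaml, kustomize merges this over the base Deployment.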
The objects field must be a YAML array with proper list indicators. The previous version was missing the '-' prefix on array items, causing:
'unable to decode STDIN: json: cannot unmarshal object into Go struct field Template.objects of type []runtime.RawExtension'
Changes:
- Rebuild templates using the Python yaml library for correct formatting
- Objects now properly formatted as a YAML array with '- apiVersion:'
- Add validate.sh script for testing with oc process
- Both templates validated successfully
Generated from kustomize overlay output with proper YAML structure.
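The required shape is `objects` as a YAML sequence. A minimal well-formed Template, with illustrative resource content, looks like:

```yaml
apiVersion: template.openshift.io/v1
kind: Template
metadata:
  name: ambient-services
parameters:
  - name: IMAGE_TAG
    required: true
objects:
  # Each object must be a sequence item ('- apiVersion: ...');
  # a bare mapping here fails to unmarshal into []runtime.RawExtension.
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: example-config
    data:
      tag: ${IMAGE_TAG}
```

`oc process -f template.yaml -p IMAGE_TAG=abc123 --local -o yaml` is a quick way to confirm the structure parses.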
Remove minio, postgresql, unleash, and ambient-api-server-db. Using external RDS and S3 from app-interface.
Removed 12 resources (4 Deployments, 4 Services, 3 PVCs, 1 Secret).
Remaining: ambient-api-server, backend-api, frontend, public-api
Disables OTEL metrics export by commenting out the OTEL_EXPORTER_OTLP_ENDPOINT environment variable in the operator deployment manifests.
The operator was configured to send metrics to otel-collector.ambient-code.svc:4317, but this service does not exist in the cluster, causing repeated gRPC connection failures every 30 seconds with the error:
"failed to upload metrics: context deadline exceeded: rpc error: code = Unavailable desc = name resolver error: produced zero addresses"
With OTEL_EXPORTER_OTLP_ENDPOINT unset, InitMetrics() skips metrics export and logs "metrics export disabled" instead of throwing connection errors.
Changes:
- Comment out OTEL_EXPORTER_OTLP_ENDPOINT in the base operator deployment
- Comment out OTEL_EXPORTER_OTLP_ENDPOINT in the OpenShift template
- Add a clarifying comment about re-enabling when the collector is deployed
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Changes:
- Add oauth-proxy component to frontend deployment (dashboard-ui port on 8443)
- Enable SSL for the ambient-api-server RDS connection (db-sslmode=require)
- Set AMBIENT_ENV to 'stage' for ambient-api-server
- Enable OpenShift service-ca for ambient-api-server TLS cert provisioning
- Regenerate templates with new oauth-proxy and api-server patches
This enables:
- Authenticated access to the frontend via OpenShift OAuth
- Secure connections to the external RDS database
- Automatic TLS certificate rotation for ambient-api-server
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Remove postgresql, minio, unleash, and ambient-api-server-db resources from the services template. These services are scaled to 0 via kustomize patches because we use external RDS and S3 instead. Including them in the template causes app-interface to try deploying them, which fails imagePattern validation and wastes resources.
Excluded resources:
- Deployment/postgresql, Service/postgresql
- Deployment/minio, Service/minio, PVC/minio-data
- Deployment/unleash, Service/unleash
- Deployment/ambient-api-server-db, Service/ambient-api-server-db
The template now has 21 service resources (down from 30).
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Switch from custom vault secrets to OpenShift service account-based OAuth:
- Use Red Hat's official ose-oauth-proxy-rhel9 image
- Use the service account token for the cookie secret (no vault needed)
- Enable HTTPS on the OAuth proxy with OpenShift service-ca auto-generated certs
- Add a system:auth-delegator ClusterRoleBinding for OAuth delegation
- Add an OAuth redirect reference annotation to the frontend ServiceAccount
- Fix the service account reference from 'nginx' to 'frontend'
- Add missing NAMESPACE and UPSTREAM_TIMEOUT parameters
Benefits:
- No manual vault secret management
- Automatic TLS cert rotation via service-ca
- Standard OpenShift OAuth integration pattern
- Follows app-interface team recommendations
Files changed:
- frontend-rbac.yaml: added OAuth annotations and auth-delegator binding
- oauth-proxy component patches: updated to the new configuration
- Templates: regenerated with OAuth fixes (27 operator, 21 service resources)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
The RDS credentials secret should not be in the OpenShift template - it's
provided by the external resource provider (terraform) in app-interface.
The namespace's externalResources section already defines:
- provider: rds
  output_resource_name: ambient-code-rds
This automatically creates the secret with the correct RDS credentials.
Including the secret in the template with VAULT_INJECTED placeholders
caused deployment failures.
Changes:
- Excluded ambient-code-rds secret from template generation
- Template now has 20 service resources (down from 21)
- Deployment still references the secret via volumeMount (correct)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Signed-off-by: Chris Mitchell <[email protected]>
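A sketch of the two halves described above: the app-interface external resource that provisions the secret, and the pod spec that consumes it (the volume name is an assumed example; the provider fields are quoted from the message):

```yaml
# app-interface namespace file (excerpt)
externalResources:
  - provider: rds
    output_resource_name: ambient-code-rds   # becomes the Secret name
---
# deployment pod spec (illustrative): mount the provisioned Secret
# instead of templating it with VAULT_INJECTED placeholders
volumes:
  - name: rds-credentials
    secret:
      secretName: ambient-code-rds
```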
Changes GCP service account configuration to align with app-interface deployment, where credentials are provided via Vault.
Changes:
- template-services.yaml: update the backend vertex-credentials secret name from 'ambient-vertex' to 'stage-gcp-creds' (matches the Vault secret)
- template-operator.yaml: update the GOOGLE_APPLICATION_CREDENTIALS path to match the Vault secret key name 'itpc-gcp-hcm-pe-eng.json'
The secret is provided by app-interface via:
path: engineering-productivity/ambient-code/stage-gcp-creds
This allows the backend and operator to use Vertex AI for Claude and Gemini API calls with the service account configured with roles/aiplatform.user permissions.
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Signed-off-by: Chris Mitchell <[email protected]>
Configure the OAuth proxy sidecar to inject the authentication token into forwarded requests, fixing 401 errors on /api/projects endpoints.
Changes:
- Add the --pass-access-token=true flag to inject the X-Forwarded-Access-Token header
- Change the upstream from frontend-service:3000 to localhost:3000 (correct sidecar pattern)
- Remove --request-logging to reduce log noise
Backend logs showed: tokenSource=none hasAuthHeader=false hasFwdToken=false
The backend expects the X-Forwarded-Access-Token header, which is now injected by the OAuth proxy for all authenticated requests.
Flow:
1. User authenticates via OpenShift OAuth ✓
2. OAuth proxy injects the token header ✓ (new)
3. Frontend forwards the token to the backend API ✓ (fixed)
This resolves the 401 authentication errors while maintaining the working OpenShift OAuth integration.
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
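Condensed into the sidecar spec, the change looks roughly like this (only the flags named above are shown; anything beyond them is assumed):

```yaml
containers:
  - name: oauth-proxy
    args:
      - --upstream=http://localhost:3000   # sidecar pattern: same-pod frontend
      - --pass-access-token=true           # injects X-Forwarded-Access-Token
```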
Removed the '--set-authorization-header=true' option from the configuration.
Changes:
- Use a proper 32-byte cookie secret from Vault instead of the service account token
- Add --pass-access-token to forward the user's OAuth token to the upstream
- Add --scope=user:full to request full user permissions
- Mount stage-cookie-secret at /etc/oauth-cookie
Problem: the OAuth proxy was authenticating users but not forwarding tokens to the Next.js frontend. When the frontend made backend API calls, it had no token to forward, resulting in 401 errors.
Root cause: the service account token (1618 bytes) is too large for the AES cipher used when --pass-access-token is enabled, which requires 16/24/32-byte secrets.
Solution: use a proper 32-byte cookie secret from Vault (matching the UAT config), enabling --pass-access-token to forward the authenticated user's token through the chain: OAuth proxy → Next.js → Backend.
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Signed-off-by: Chris Mitchell <[email protected]>
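The Vault-backed cookie secret wiring can be sketched like this; the mount path and secret name come from the message, while the flag name and the key inside the secret are assumptions:

```yaml
spec:
  containers:
    - name: oauth-proxy
      args:
        - --cookie-secret-file=/etc/oauth-cookie/cookie-secret   # key name assumed
        - --pass-access-token
      volumeMounts:
        - name: oauth-cookie
          mountPath: /etc/oauth-cookie
          readOnly: true
  volumes:
    - name: oauth-cookie
      secret:
        secretName: stage-cookie-secret
```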
Removed the '--scope=user:full' option from the configuration.
Signed-off-by: Chris Mitchell <[email protected]>
chore: Update konflux deps
Switch the OAuth proxy from service account authentication to explicit SSO client credentials to enable the user:full scope.
Changes:
- Replace --openshift-service-account with --client-id=ambient-code
- Mount client_secret from the stage-sso-client Kubernetes secret
- Add --scope=user:full to grant full user permissions
- Mount the /etc/oauth-client volume for the client secret file
This allows users to create resources (AgenticSessions, ConfigMaps) in their project namespaces by providing the necessary OAuth scope.
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
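A sketch of the resulting sidecar configuration; the client-secret flag name and volume name are assumptions, the rest follows the bullets above:

```yaml
spec:
  containers:
    - name: oauth-proxy
      args:
        - --client-id=ambient-code
        - --client-secret-file=/etc/oauth-client/client_secret   # flag name assumed
        - --scope=user:full
      volumeMounts:
        - name: oauth-client
          mountPath: /etc/oauth-client
          readOnly: true
  volumes:
    - name: oauth-client
      secret:
        secretName: stage-sso-client
```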
Remove the ambient-frontend-oauth-delegator ClusterRoleBinding from the operator template, as it is now deployed via app-interface openshiftResources for better separation of concerns. Cluster-scoped resources should be managed outside of saas file deployments because they have impact on the whole cluster.
This ClusterRoleBinding grants the frontend service account the system:auth-delegator role needed for OAuth proxy token delegation. It is now defined in app-interface at:
resources/services/ambient-code-platform/ambient-frontend-oauth-delegator.clusterrolebinding.yaml
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
OAuth client updates
The pathChanged() CEL function was using incorrect glob syntax that prevented pipelines from triggering on component changes:
- Changed `./components/*/***` to `components/*/**` (removed the leading `./` and fixed the triple asterisk to a double asterisk for recursive matching)
- Removed the invalid root `Dockerfile` check (Dockerfiles are in component subdirectories, already covered by the component globs)
PipelinesAsCode pathChanged() expects standard glob patterns relative to the repository root, with `**` for recursive directory matching.
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
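Applied to a PipelinesAsCode annotation, the corrected pattern would look like this (event and branch are illustrative; only the glob shape is taken from the fix):

```yaml
metadata:
  annotations:
    pipelinesascode.tekton.dev/on-cel-expression: >-
      event == "pull_request" &&
      target_branch == "main" &&
      "components/*/**".pathChanged()
```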
fix(ci): correct Tekton pathChanged glob patterns
When OTEL_EXPORTER_OTLP_ENDPOINT is unset, InitMetrics() was returning early without initializing metric instruments, leaving them as nil. This caused nil pointer panics when reconciliation code called metric recording functions like RecordSessionCreatedByUser(). The panic occurred at otel_metrics.go:424 when sessionsByUser.Add() was called on a nil counter during the reconcilePending phase.
Fix:
- When the OTEL endpoint is unset, initialize a no-op meter from the global provider
- Create all metric instruments as no-ops (silently ignore all calls)
- Prevents nil pointer panics while maintaining the same API contract
- No-op instruments have all the same methods but do nothing
OpenTelemetry provides a built-in no-op MeterProvider as the global default, which creates no-op instruments that safely ignore all metric recording calls without panicking.
Error before the fix:
panic: runtime error: invalid memory address or nil pointer dereference
at RecordSessionCreatedByUser (/app/internal/controller/otel_metrics.go:424)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
fix: initialize no-op metrics instruments when OTEL is disabled
Add permissions for mlflow.kubeflow.org Experiments and Runs CRDs to
the agentic-operator ClusterRole. The operator unconditionally grants
these permissions to session runner service accounts via Roles, but
cannot grant permissions it doesn't hold itself.
Without these ClusterRole permissions, session creation fails with:
user "system:serviceaccount:ambient-code:agentic-operator" is attempting
to grant RBAC permissions not currently held:
{APIGroups:["mlflow.kubeflow.org"], Resources:["experiments"], Verbs:[...]}
These are namespace-scoped CRDs from the Kubeflow MLflow Operator, used
for ML experiment tracking with Kubernetes-native RBAC authentication.
Sessions use these to log ML training runs, parameters, and metrics to
the MLflow tracking server.
Note: MLflow tracing is optional (MLFLOW_TRACING_ENABLED env var), but
the operator code unconditionally includes these permissions in session
Roles regardless of whether tracing is enabled.
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
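The missing grant, sketched as a ClusterRole rule; the API group and resources come from the error above, while the exact verb list is an assumption:

```yaml
rules:
  - apiGroups: ["mlflow.kubeflow.org"]
    resources: ["experiments", "runs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```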
fix: add MLflow CRD permissions to operator ClusterRole
Add mlflow.kubeflow.org CRD permissions to the agentic-operator ClusterRole. The operator creates Roles in user namespaces that include MLflow permissions, but due to Kubernetes RBAC privilege escalation protection, it can only grant permissions it holds itself.
The previous commit 2af8216 added MLflow permissions to the backend-api ClusterRole but missed adding them to agentic-operator. This causes session creation to fail with:
user "system:serviceaccount:ambient-code:agentic-operator" is attempting to grant RBAC permissions not currently held: {APIGroups:["mlflow.kubeflow.org"], Resources:["experiments"], Verbs:[...]}
The agentic-operator service account needs these permissions to create session runner Roles that include MLflow access.
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
fix: add MLflow permissions to agentic-operator ClusterRole
The operator needs to create NetworkPolicies in user namespaces to isolate runner pods. Without this permission, session creation fails with:
networkpolicies.networking.k8s.io is forbidden: User "system:serviceaccount:ambient-code:agentic-operator" cannot create resource "networkpolicies" in API group "networking.k8s.io" in the namespace "mknop-ws"
This adds create/delete/get/list permissions for NetworkPolicies to the agentic-operator ClusterRole.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
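As a ClusterRole rule, the added permission looks like this (verbs as listed in the message):

```yaml
rules:
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["create", "delete", "get", "list"]
```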
Configure oauth-proxy to route /api/* requests to backend-service instead of the Next.js frontend. Without this routing, all requests including /api/* go to localhost:3000, causing 503 errors because Next.js doesn't handle backend API routes.
Changes:
- Add --upstream=http://backend-service:8080/api/ before the default upstream
- Requests to /api/* now route to backend-service:8080
- All other requests continue to the Next.js frontend at localhost:3000
OAuth2-proxy processes upstreams in order and uses the path portion as a matching key. The /api/ path in the upstream URL matches any request starting with /api/, and the full request path is forwarded to the backend.
Request flow example:
Browser: GET https://ambient.corp.stage.redhat.com/api/projects/foo/sessions/bar
→ OAuth-proxy checks auth via --openshift-delegate-urls
→ Matches --upstream=http://backend-service:8080/api/ (longest match)
→ Forwards to: http://backend-service:8080/api/projects/foo/sessions/bar
Fixes browser console errors:
GET /api/projects/.../git/status [503 Service Unavailable]
AG-UI stream error: Connection error
The connection to .../agui/events was interrupted
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Signed-off-by: Chris Mitchell <[email protected]>
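The two-upstream ordering described above, as a minimal args sketch:

```yaml
args:
  - --upstream=http://backend-service:8080/api/   # matched for /api/* paths
  - --upstream=http://localhost:3000              # default: Next.js frontend
```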
fix: add backend API routing to oauth-proxy upstream
Remove --openshift-delegate-urls parameter from oauth-proxy that was
blocking /api/* requests with "no resource mapped path" errors.
Issue:
- openshift-delegate-urls={"/api":{"resource":"projects","verb":"list"}}
only matches /api exactly, not /api/* subpaths
- All /api/* requests were returning 503 even though backend received
and processed them successfully (200 OK in backend logs)
- oauth-proxy logs showed: "no resource mapped path"
Solution:
OAuth-proxy still provides authentication (OAuth login required for all
requests) and passes the access token to the backend via --pass-access-token.
The backend handles its own fine-grained authorization based on the token,
so the blanket openshift-delegate-urls check is redundant and overly
restrictive.
Authorization flow after this change:
1. User authenticates via OAuth (enforced by oauth-proxy)
2. oauth-proxy passes access token to backend
3. Backend validates token and checks user permissions per endpoint
4. Backend returns appropriate response (200, 403, 404, etc.)
This matches the backend's existing authorization model where different
API endpoints have different permission requirements that can't be
expressed in a single openshift-delegate-urls pattern.
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
fix: remove overly restrictive openshift-delegate-urls check
✅ Deploy Preview for cheerful-kitten-f556a0 ready!
📝 Walkthrough
This PR adds Tekton-based CI/CD pipeline definitions for building container images across five components and renames the API server database secret.
Changes
Tekton CI/CD Pipelines
Ambient Code Deployment Configuration
🚥 Pre-merge checks: 6 passed, 2 failed (warnings)
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
components/manifests/base/platform/ambient-api-server-secrets.yml (1)
10-15: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Remove plaintext DB credentials from versioned manifests

`stringData.db.password` on line 15 is a hardcoded secret in Git. This is a direct credential exposure risk and should be replaced with external secret injection (e.g., ExternalSecret/SealedSecret or a CI-provisioned Secret) plus credential rotation.

Suggested direction:

```diff
 stringData:
   db.host: ambient-api-server-db
   db.port: "5432"
   db.name: ambient_api_server
   db.user: ambient
-  db.password: TheBlurstOfTimes
+  # db.password intentionally omitted; populate from external secret manager / sealed secret in env overlays
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/manifests/base/platform/ambient-api-server-secrets.yml` around lines 10 - 15, Remove the hardcoded plaintext value in stringData.db.password and replace it with an injected secret reference or placeholder; specifically, delete the literal "TheBlurstOfTimes" under stringData -> db.password and wire this manifest to consume a runtime-provisioned Secret (e.g., use ExternalSecret/SealedSecret or CI-created Kubernetes Secret) so the deployment reads db.password from the secret key (not checked into Git); update any related keys (db.user, db.host, db.port, db.name) to either reference non-sensitive placeholders or the same secret mechanism, and document creating the external secret (with rotation policy) that provides the db.password key expected by this manifest.

components/manifests/components/oauth-proxy/frontend-oauth-deployment-patch.yaml (1)
12-73: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not leave the frontend pod on the default container security context.

This patch still leaves the `frontend` and `oauth-proxy` containers without an explicit `securityContext`, so they inherit platform defaults for root/privilege behavior. Checkov/Trivy are already flagging this; please set container hardening explicitly (`allowPrivilegeEscalation: false`, drop `ALL`, `runAsNonRoot`, and `seccompProfile`, plus `readOnlyRootFilesystem` where the image supports it).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/manifests/components/oauth-proxy/frontend-oauth-deployment-patch.yaml` around lines 12 - 73, The frontend and oauth-proxy containers lack explicit container securityContext so they inherit platform defaults; add a securityContext block to both container specs (containers named "frontend" and "oauth-proxy") with allowPrivilegeEscalation: false, runAsNonRoot: true, seccompProfile (type: RuntimeDefault), securityContext.capabilities.drop: ["ALL"], and readOnlyRootFilesystem: true where the image supports it (omit or set false for oauth-proxy if it requires writes); ensure these fields are under each container (not podSecurityContext) and adjust any volume mounts/paths that require write access or set runAsUser if needed to satisfy runAsNonRoot.

.tekton/ambient-code-operator-main-push.yaml (1)
1-583: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Scope drift from the stated PR objective — please split this out

This file introduces a full new CI pipeline, but the PR objective is to only increase `INITIAL_PROMPT_DELAY_SECONDS` from 2 to 10. Shipping unrelated Tekton pipeline changes in the same PR materially increases release and rollback risk for a user-facing runtime fix.

Please move these pipeline additions to a separate PR and keep this PR narrowly scoped to the prompt-delay change.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.tekton/ambient-code-operator-main-push.yaml around lines 1 - 583, The PR accidentally adds a full Tekton PipelineRun (ambient-code-operator-main-on-push in .tekton/ambient-code-operator-main-push.yaml) which is unrelated to the intended change to INITIAL_PROMPT_DELAY_SECONDS; remove the pipeline addition from this branch (revert or delete the .tekton/ambient-code-operator-main-push.yaml changes) and create a separate branch/PR that contains the new PipelineRun and any related params (e.g., the pipeline metadata and params like output-image, git-url, etc.); ensure this PR only contains the single change that updates INITIAL_PROMPT_DELAY_SECONDS.
🧹 Nitpick comments (2)
components/operator/internal/handlers/sessions.go (1)
1114-1123: ⚡ Quick win

Make `INITIAL_PROMPT_DELAY_SECONDS` configurable instead of hardcoding `"10"`.

Bumping 2s → 10s unblocks the current hot path, but the next time service-readiness timing changes you'll be back here editing the operator. A small fallback chain (CRD spec → operator env → default 10) keeps this fix without baking the magic number in:
♻️ Proposed change

```diff
+	initialPromptDelay := "10"
+	if v, _, _ := unstructured.NestedInt64(spec, "initialPromptDelaySeconds"); v > 0 {
+		initialPromptDelay = fmt.Sprintf("%d", v)
+	} else if v := strings.TrimSpace(os.Getenv("INITIAL_PROMPT_DELAY_SECONDS")); v != "" {
+		initialPromptDelay = v
+	}
+
 	// Core session env vars
 	base = append(base,
 		corev1.EnvVar{Name: "INITIAL_PROMPT", Value: prompt},
-		corev1.EnvVar{Name: "INITIAL_PROMPT_DELAY_SECONDS", Value: "10"},
+		corev1.EnvVar{Name: "INITIAL_PROMPT_DELAY_SECONDS", Value: initialPromptDelay},
 		corev1.EnvVar{Name: "LLM_MODEL", Value: model},
```

If the spec field is wired up, pair it with the CRD addition suggested in `template-operator.yaml`.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/operator/internal/handlers/sessions.go` around lines 1114 - 1123, Replace the hardcoded "10" for INITIAL_PROMPT_DELAY_SECONDS with a value resolved from the session CRD spec, then the operator's environment, then defaulting to 10: read a field like Session.Spec.InitialPromptDelaySeconds (use a sensible name if different) when building the env slice in the code that constructs base (where corev1.EnvVar{Name: "INITIAL_PROMPT_DELAY_SECONDS", ...} is appended), fall back to an operator-level env var (e.g., OPERATOR_INITIAL_PROMPT_DELAY_SECONDS) if the CRD field is unset, and finally default to 10 seconds; update the CRD/template (template-operator.yaml) to add the new spec field so the value can be supplied from the CR.

components/manifests/templates/template-operator.yaml (1)
33-119: ⚡ Quick win

Consider exposing `initialPromptDelaySeconds` on the CRD spec.

The runner is now driven by `INITIAL_PROMPT_DELAY_SECONDS` (hardcoded to `"10"` in `components/operator/internal/handlers/sessions.go` line 1116). If different environments/clusters need different warm-up windows, surfacing a per-session optional field here would let operators tune without redeploying the operator image.

♻️ Proposed CRD addition (optional, with sensible bounds)

```diff
   initialPrompt:
     description: Initial prompt used only on first SDK invocation for brand new sessions (ignored on continuations or workflow restarts).
     type: string
+  initialPromptDelaySeconds:
+    default: 10
+    description: Seconds to wait before sending the initial prompt to the SDK so dependent services have time to become ready.
+    minimum: 0
+    type: integer
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/manifests/templates/template-operator.yaml` around lines 33 - 119, Add an optional integer field initialPromptDelaySeconds to the CRD spec (under spec -> properties, alongside initialPrompt) with a sensible default and bounds (e.g., default 10, minimum 0, maximum 300) so operators can tune per-session warm-up; then update the operator code that currently uses the hardcoded INITIAL_PROMPT_DELAY_SECONDS (referenced in components/operator/internal/handlers/sessions.go) to prefer the value from the session CRD (e.g., session.Spec.InitialPromptDelaySeconds) and fall back to the env var or default when the field is nil/zero. Ensure you update any JSON/YAML tags and validation schema for the field and the code path in sessions.go that computes the delay so it reads and respects the CRD value.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.tekton/ambient-code-ambient-runner-main-pull-request.yaml:
- Line 5: The AppStudio repo metadata hard-codes the repository URL to
RedHatInsights/ambient-code-platform, which can mismatch fork PRs. Update the
metadata key build.appstudio.openshift.io/repo to use the variable
{{source_url}} instead of the hard-coded URL, ensuring it dynamically reflects
the actual source repository used in the clone step.
In @.tekton/ambient-code-frontend-main-push.yaml:
- Around line 8-10: Update the Pipelines as Code annotation
pipelinesascode.tekton.dev/cancel-in-progress from "false" to "true" so
in-flight main push pipelines are cancelled instead of completing out-of-order;
apply this exact change in each of the six main-push manifests
(ambient-code-frontend-main-push.yaml, ambient-code-public-api-main-push.yaml,
ambient-code-operator-main-push.yaml, ambient-code-backend-main-push.yaml,
ambient-code-ambient-runner-main-push.yaml,
ambient-code-ambient-api-server-main-push.yaml) and ensure the other annotations
(pipelinesascode.tekton.dev/max-keep-runs and
pipelinesascode.tekton.dev/on-cel-expression) remain unchanged.
In `@components/ambient-api-server/templates/db-template.yml`:
- Line 17: The default DATABASE_SERVICE_NAME value "ambient-code-rds" can
collide with externally managed RDS secrets; change the default in
db-template.yml to a non-colliding name (e.g., "ambient-code-rds-local" or
"ambient-code-rds-template") and update all resource name templates that
reference DATABASE_SERVICE_NAME (Service, Deployment, PVC, Secret) to include a
template-specific suffix or chart identifier so they generate unique names (for
example append "-local" or the chart name to the value used for metadata names)
and ensure any secret names that previously used DATABASE_SERVICE_NAME are also
changed to the new pattern so external secrets aren’t accidentally consumed.
In
`@components/manifests/components/oauth-proxy/frontend-oauth-deployment-patch.yaml`:
- Around line 34-35: Replace use of the rotating service account token for
cookie signing by creating and mounting a stable Kubernetes Secret and pointing
the oauth-proxy flag --cookie-secret-file at the secret-backed file instead of
/var/run/secrets/kubernetes.io/serviceaccount/token; update the manifest lines
that currently set --cookie-secret-file and keep --upstream-timeout as-is, add a
volume and volumeMount referencing the new Secret (with the secret key
containing the cookie signing key) so the oauth-proxy process reads a persistent
signing key across restarts.
In `@components/manifests/overlays/app-interface/kustomization.yaml`:
- Around line 84-121: The overlay kustomization.yaml currently pins every image
with newTag: latest (e.g., entries for quay.io/ambient_code/vteam_operator,
vteam_backend, vteam_frontend, vteam_public_api, vteam_api_server,
vteam_claude_runner and their ":latest" duplicates); change each newTag from
"latest" to an immutable release tag or, preferably, a digest (sha256) for that
exact image (or replace newName with the full image@sha256:... form), ensuring
both the plain and ":latest" name overrides are updated or deduplicated so the
overlay references a reproducible, reviewable image per component.
In
`@components/manifests/overlays/app-interface/operator-runner-image-patch.yaml`:
- Around line 12-13: The environment variable AMBIENT_CODE_RUNNER_IMAGE is
pinned to the mutable tag ":latest" which makes deployments non-deterministic;
replace the value string in operator-runner-image-patch.yaml for
AMBIENT_CODE_RUNNER_IMAGE with an immutable reference (either a specific version
tag like ":v1.2.3" or a content-addressable digest "@" format such as
"@sha256:..."). Locate the AMBIENT_CODE_RUNNER_IMAGE entry and update its value
to the chosen immutable tag or digest, then verify image pull succeeds and
update any release notes/CI that publish or reference the runner image.
In
`@components/manifests/overlays/production/ambient-api-server-jwt-args-patch.yaml`:
- Around line 36-37: Replace the insecure flag `--db-sslmode=require` with a
verification mode and provide the RDS CA bundle: change `--db-sslmode=require`
to `--db-sslmode=verify-full` (or `verify-ca` if hostname verification is
undesirable) and add a `--db-sslrootcert=/path/to/rds-ca.pem` argument; also
mount the RDS CA (from the existing template parameter/secret/configmap) into
the container and point the `--db-sslrootcert` to that mounted file so the DB
client actually verifies the server certificate.
In
`@components/manifests/overlays/production/ambient-api-server-migration-ssl-patch.yaml`:
- Around line 9-21: The init container named "migration" is missing a resources
block and a securityContext; add a resources section on the migration
initContainer with required requests and limits for cpu and memory (both
requests and limits) per your guideline, and add a securityContext to the same
container (runAsNonRoot: true, privileged: false, allowPrivilegeEscalation:
false, readOnlyRootFilesystem: true, and drop all capabilities) so the init
container has CPU/memory guardrails and restricted privileges.
In `@components/manifests/templates/template-services.yaml`:
- Around line 304-328: The migration initContainer named "migration" is missing
resource requests/limits; add a small resources block (cpu and memory requests
and limits) to the migration initContainer and apply the same pattern to the
sibling init-db initContainers so they mirror each other; locate the
initContainer with name: migration and add a resources: section (requests:
cpu/memory and limits: cpu/memory) with conservative values appropriate for
short-lived migrations to ensure proper scheduling and eviction behavior.
---
Outside diff comments:
In @.tekton/ambient-code-operator-main-push.yaml:
- Around line 1-583: The PR accidentally adds a full Tekton PipelineRun
(ambient-code-operator-main-on-push in
.tekton/ambient-code-operator-main-push.yaml) which is unrelated to the intended
change to INITIAL_PROMPT_DELAY_SECONDS; remove the pipeline addition from this
branch (revert or delete the .tekton/ambient-code-operator-main-push.yaml
changes) and create a separate branch/PR that contains the new PipelineRun and
any related params (e.g., the pipeline metadata and params like output-image,
git-url, etc.); ensure this PR only contains the single change that updates
INITIAL_PROMPT_DELAY_SECONDS.
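The split described above can be done with plain git. This is a self-contained demo in a throwaway repo; on the real PR branch you would run only the three commands after the blank line:

```shell
# Demo setup: throwaway repo containing the accidental PipelineRun.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
mkdir .tekton && echo "kind: PipelineRun" > .tekton/ambient-code-operator-main-push.yaml
git add . && git -c user.email=ci@example.com -c user.name=ci commit -qm "initial"

git branch operator-push-pipeline                    # park the pipeline for a follow-up PR
git rm -q .tekton/ambient-code-operator-main-push.yaml
git -c user.email=ci@example.com -c user.name=ci commit -qm "scope PR to INITIAL_PROMPT_DELAY_SECONDS"
```

The parked branch keeps the full PipelineRun, so the follow-up PR is a simple `git push` of `operator-push-pipeline`.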
In `@components/manifests/base/platform/ambient-api-server-secrets.yml`:
- Around line 10-15: Remove the hardcoded plaintext value in
stringData.db.password and replace it with an injected secret reference or
placeholder; specifically, delete the literal "TheBlurstOfTimes" under
stringData -> db.password and wire this manifest to consume a
runtime-provisioned Secret (e.g., use ExternalSecret/SealedSecret or CI-created
Kubernetes Secret) so the deployment reads db.password from the secret key (not
checked into Git); update any related keys (db.user, db.host, db.port, db.name)
to either reference non-sensitive placeholders or the same secret mechanism, and
document creating the external secret (with rotation policy) that provides the
db.password key expected by this manifest.
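One way to wire the runtime-provisioned Secret, sketched with the External Secrets Operator — the store name `app-interface-vault` and the remote key path are placeholders for whatever the app-interface environment actually provides:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: ambient-api-server-db
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: app-interface-vault   # hypothetical store
  target:
    name: ambient-api-server-db # Secret the deployment already consumes
  data:
    - secretKey: db.password
      remoteRef:
        key: ambient-code/db    # placeholder vault path
        property: password
```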
In
`@components/manifests/components/oauth-proxy/frontend-oauth-deployment-patch.yaml`:
- Around line 12-73: The frontend and oauth-proxy containers lack explicit
container securityContext so they inherit platform defaults; add a
securityContext block to both container specs (containers named "frontend" and
"oauth-proxy") with allowPrivilegeEscalation: false, runAsNonRoot: true,
seccompProfile (type: RuntimeDefault), securityContext.capabilities.drop:
["ALL"], and readOnlyRootFilesystem: true where the image supports it (omit or
set false for oauth-proxy if it requires writes); ensure these fields are under
each container (not podSecurityContext) and adjust any volume mounts/paths that
require write access or set runAsUser if needed to satisfy runAsNonRoot.
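The requested container-level settings could be sketched as below; whether oauth-proxy tolerates `readOnlyRootFilesystem: true` has to be verified against the image, so it is omitted for that container here:

```yaml
spec:
  template:
    spec:
      containers:
        - name: frontend
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
            seccompProfile:
              type: RuntimeDefault
        - name: oauth-proxy
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            capabilities:
              drop: ["ALL"]
            seccompProfile:
              type: RuntimeDefault
```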
---
Nitpick comments:
In `@components/manifests/templates/template-operator.yaml`:
- Around line 33-119: Add an optional integer field initialPromptDelaySeconds to
the CRD spec (under spec -> properties, alongside initialPrompt) with a sensible
default and bounds (e.g., default 10, minimum 0, maximum 300) so operators can
tune per-session warm-up; then update the operator code that currently uses the
hardcoded INITIAL_PROMPT_DELAY_SECONDS (referenced in
components/operator/internal/handlers/sessions.go) to prefer the value from the
session CRD (e.g., session.Spec.InitialPromptDelaySeconds) and fall back to the
env var or default when the field is nil/zero. Ensure you update any JSON/YAML
tags and validation schema for the field and the code path in sessions.go that
computes the delay so it reads and respects the CRD value.
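The suggested schema addition under `spec.properties`, alongside `initialPrompt`, might look like this (field name and bounds as proposed above, not taken from the current CRD):

```yaml
initialPromptDelaySeconds:
  type: integer
  description: Seconds to wait before delivering the initial prompt to the runner.
  default: 10
  minimum: 0
  maximum: 300
```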
In `@components/operator/internal/handlers/sessions.go`:
- Around line 1114-1123: Replace the hardcoded "10" for
INITIAL_PROMPT_DELAY_SECONDS with a value resolved from the session CRD spec,
then the operator's environment, then defaulting to 10: read a field like
Session.Spec.InitialPromptDelaySeconds (use a sensible name if different) when
building the env slice in the code that constructs base (where
corev1.EnvVar{Name: "INITIAL_PROMPT_DELAY_SECONDS", ...} is appended), fall back
to an operator-level env var (e.g., OPERATOR_INITIAL_PROMPT_DELAY_SECONDS) if
the CRD field is unset, and finally default to 10 seconds; update the
CRD/template (template-operator.yaml) to add the new spec field so the value can
be supplied from the CR.
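The precedence chain described above (CRD field, then operator env var, then the 10-second default) can be sketched as follows — the function and field names are illustrative, not the operator's actual API:

```go
package main

import (
	"fmt"
	"strconv"
)

// resolveInitialPromptDelay picks the warm-up delay: a set, positive CRD spec
// field wins; otherwise a parseable positive operator-level env value; otherwise 10.
func resolveInitialPromptDelay(specDelay *int32, operatorEnv string) int32 {
	if specDelay != nil && *specDelay > 0 {
		return *specDelay
	}
	if v, err := strconv.Atoi(operatorEnv); err == nil && v > 0 {
		return int32(v)
	}
	return 10
}

func main() {
	d := int32(25)
	fmt.Println(resolveInitialPromptDelay(&d, "5")) // CRD field wins: 25
	fmt.Println(resolveInitialPromptDelay(nil, "7")) // env fallback: 7
	fmt.Println(resolveInitialPromptDelay(nil, ""))  // default: 10
}
```

The resolved value would then feed the existing `corev1.EnvVar{Name: "INITIAL_PROMPT_DELAY_SECONDS", ...}` entry instead of the hardcoded "10".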
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f04e4866-94ef-4c44-a7b2-f43dbce5265c
📒 Files selected for processing (50)
.tekton/ambient-code-ambient-api-server-main-pull-request.yaml
.tekton/ambient-code-ambient-api-server-main-push.yaml
.tekton/ambient-code-ambient-runner-main-pull-request.yaml
.tekton/ambient-code-ambient-runner-main-push.yaml
.tekton/ambient-code-backend-main-pull-request.yaml
.tekton/ambient-code-backend-main-push.yaml
.tekton/ambient-code-frontend-main-pull-request.yaml
.tekton/ambient-code-frontend-main-push.yaml
.tekton/ambient-code-operator-main-pull-request.yaml
.tekton/ambient-code-operator-main-push.yaml
.tekton/ambient-code-public-api-main-pull-request.yaml
.tekton/ambient-code-public-api-main-push.yaml
components/ambient-api-server/templates/db-template.yml
components/manifests/README.md
components/manifests/base/core/ambient-api-server-service.yml
components/manifests/base/core/operator-deployment.yaml
components/manifests/base/platform/ambient-api-server-db.yml
components/manifests/base/platform/ambient-api-server-secrets.yml
components/manifests/base/rbac/frontend-rbac.yaml
components/manifests/components/ambient-api-server-db/ambient-api-server-db-json-patch.yaml
components/manifests/components/ambient-api-server-db/ambient-api-server-init-db-patch.yaml
components/manifests/components/ambient-api-server-db/kustomization.yaml
components/manifests/components/oauth-proxy/frontend-oauth-deployment-patch.yaml
components/manifests/components/oauth-proxy/frontend-oauth-service-patch.yaml
components/manifests/overlays/app-interface/ambient-api-server-db-secret-patch.yaml
components/manifests/overlays/app-interface/ambient-api-server-env-patch.yaml
components/manifests/overlays/app-interface/ambient-api-server-route.yaml
components/manifests/overlays/app-interface/ambient-api-server-service-ca-patch.yaml
components/manifests/overlays/app-interface/ambient-api-server-ssl-patch.yaml
components/manifests/overlays/app-interface/backend-route.yaml
components/manifests/overlays/app-interface/kustomization.yaml
components/manifests/overlays/app-interface/namespace-patch.yaml
components/manifests/overlays/app-interface/namespace.yaml
components/manifests/overlays/app-interface/operator-config-openshift.yaml
components/manifests/overlays/app-interface/operator-runner-image-patch.yaml
components/manifests/overlays/app-interface/public-api-route.yaml
components/manifests/overlays/app-interface/route.yaml
components/manifests/overlays/kind/api-server-db-security-patch.yaml
components/manifests/overlays/kind/api-server-no-jwt-patch.yaml
components/manifests/overlays/local-dev/ambient-api-server-db-credentials-patch.yaml
components/manifests/overlays/local-dev/ambient-api-server-db-json-patch.yaml
components/manifests/overlays/local-dev/ambient-api-server-init-db-patch.yaml
components/manifests/overlays/production/ambient-api-server-jwt-args-patch.yaml
components/manifests/overlays/production/ambient-api-server-migration-ssl-patch.yaml
components/manifests/overlays/production/kustomization.yaml
components/manifests/templates/template-operator.yaml
components/manifests/templates/template-services.yaml
components/manifests/templates/validate.sh
components/operator/internal/controller/otel_metrics.go
components/operator/internal/handlers/sessions.go
| kind: PipelineRun | ||
| metadata: | ||
| annotations: | ||
| build.appstudio.openshift.io/repo: https://github.com/RedHatInsights/ambient-code-platform?rev={{revision}} |
Use the actual source URL in AppStudio repo metadata.
This hard-codes RedHatInsights/ambient-code-platform, but this PR is in ambient-code/platform and the clone step already uses {{source_url}}. That can stamp the wrong repo into provenance/UI metadata, especially for fork PRs.
Proposed fix
- build.appstudio.openshift.io/repo: https://github.com/RedHatInsights/ambient-code-platform?rev={{revision}}
+ build.appstudio.openshift.io/repo: '{{source_url}}?rev={{revision}}'
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.tekton/ambient-code-ambient-runner-main-pull-request.yaml at line 5, The
AppStudio repo metadata hard-codes the repository URL to
RedHatInsights/ambient-code-platform, which can mismatch fork PRs. Update the
metadata key build.appstudio.openshift.io/repo to use the variable
{{source_url}} instead of the hard-coded URL, ensuring it dynamically reflects
the actual source repository used in the clone step.
| pipelinesascode.tekton.dev/cancel-in-progress: "false" | ||
| pipelinesascode.tekton.dev/max-keep-runs: "3" | ||
| pipelinesascode.tekton.dev/on-cel-expression: event == "push" && target_branch |
🧩 Analysis chain
🏁 Script executed:
# Check if the file exists and view the specific lines
fd -t f '*ambient-code-frontend-main-push.yaml'
Repository: ambient-code/platform
Length of output: 382
🏁 Script executed:
# Once we find the file, let's examine lines 1-30 to see the cancel-in-progress setting
cat -n .tekton/ambient-code-frontend-main-push.yaml | head -30
Repository: ambient-code/platform
Length of output: 1528
🏁 Script executed:
# Search for apply-tags task in the file
rg -n "apply-tags" .tekton/ambient-code-frontend-main-push.yaml
Repository: ambient-code/platform
Length of output: 271
🏁 Script executed:
# Find all *-main-push.yaml files to check the pattern
fd -t f '*-main-push.yaml'
Repository: ambient-code/platform
Length of output: 361
🏁 Script executed:
# Find all *-main-push.yaml files using glob pattern
fd --glob '*-main-push.yaml'
Repository: ambient-code/platform
Length of output: 47
🏁 Script executed:
# Check the apply-tags task context - view more lines around line 509
cat -n .tekton/ambient-code-frontend-main-push.yaml | sed -n '500,530p'
Repository: ambient-code/platform
Length of output: 1330
🏁 Script executed:
# Check for cancel-in-progress settings in all files
rg 'cancel-in-progress' .tekton/
Repository: ambient-code/platform
Length of output: 1389
🏁 Script executed:
# Check for apply-tags in all main-push.yaml files
rg 'apply-tags' .tekton/ --glob '*-main-push.yaml'
Repository: ambient-code/platform
Length of output: 2090
🏁 Script executed:
# Get a count of how many tasks each main-push pipeline has
for file in .tekton/*-main-push.yaml; do
count=$(rg -c '^\s+- name:' "$file" | cut -d: -f2)
echo "$file: $count tasks"
done
Repository: ambient-code/platform
Length of output: 418
Enable pipeline cancellation for concurrent main pushes across all affected files.
With cancel-in-progress: "false", concurrent main builds can complete out of order and overwrite tags via the apply-tags task, repointing shared tags to older digests. This affects 6 files: ambient-code-frontend-main-push.yaml, ambient-code-public-api-main-push.yaml, ambient-code-operator-main-push.yaml, ambient-code-backend-main-push.yaml, ambient-code-ambient-runner-main-push.yaml, and ambient-code-ambient-api-server-main-push.yaml.
Proposed fix
- pipelinesascode.tekton.dev/cancel-in-progress: "false"
+ pipelinesascode.tekton.dev/cancel-in-progress: "true"
Apply this change to all 6 main-push.yaml files listed above.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.tekton/ambient-code-frontend-main-push.yaml around lines 8 - 10, Update the
Pipelines as Code annotation pipelinesascode.tekton.dev/cancel-in-progress from
"false" to "true" so in-flight main push pipelines are cancelled instead of
completing out-of-order; apply this exact change in each of the six main-push
manifests (ambient-code-frontend-main-push.yaml,
ambient-code-public-api-main-push.yaml, ambient-code-operator-main-push.yaml,
ambient-code-backend-main-push.yaml, ambient-code-ambient-runner-main-push.yaml,
ambient-code-ambient-api-server-main-push.yaml) and ensure the other annotations
(pipelinesascode.tekton.dev/max-keep-runs and
pipelinesascode.tekton.dev/on-cel-expression) remain unchanged.
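The six-file edit above is mechanical enough for one sed pass. Self-contained demo below (in the repo you would run the `sed` line directly against `.tekton/*-main-push.yaml`, assuming GNU sed):

```shell
# Demo: create six stand-in main-push manifests, then flip the annotation in all of them.
set -e
dir=$(mktemp -d)
for f in frontend public-api operator backend ambient-runner ambient-api-server; do
  printf 'pipelinesascode.tekton.dev/cancel-in-progress: "false"\n' \
    > "$dir/ambient-code-$f-main-push.yaml"
done
sed -i 's/cancel-in-progress: "false"/cancel-in-progress: "true"/' "$dir"/*-main-push.yaml
grep -h cancel-in-progress "$dir"/*-main-push.yaml
```

Review the result with `git diff .tekton/` before committing, since the pattern also matches any future main-push manifests.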
| displayName: Database Service Name | ||
| required: true | ||
| value: ambient-api-server-db | ||
| value: ambient-code-rds |
Default DB template name now risks colliding with external RDS secret naming.
Line 17 sets DATABASE_SERVICE_NAME to ambient-code-rds, and this template reuses that value for Service/Deployment/PVC/Secret names. That can conflict with externally managed ambient-code-rds credentials and cause wrong DB host/credentials to be consumed.
Proposed fix
- value: ambient-code-rds
+ value: ambient-api-server-db
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@components/ambient-api-server/templates/db-template.yml` at line 17, The
default DATABASE_SERVICE_NAME value "ambient-code-rds" can collide with
externally managed RDS secrets; change the default in db-template.yml to a
non-colliding name (e.g., "ambient-code-rds-local" or
"ambient-code-rds-template") and update all resource name templates that
reference DATABASE_SERVICE_NAME (Service, Deployment, PVC, Secret) to include a
template-specific suffix or chart identifier so they generate unique names (for
example append "-local" or the chart name to the value used for metadata names)
and ensure any secret names that previously used DATABASE_SERVICE_NAME are also
changed to the new pattern so external secrets aren’t accidentally consumed.
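The simpler of the two remedies — reverting the parameter default so in-template resource names stay distinct from the externally managed `ambient-code-rds` secret — would look like:

```yaml
parameters:
  - name: DATABASE_SERVICE_NAME
    displayName: Database Service Name
    required: true
    value: ambient-api-server-db  # distinct from the external ambient-code-rds secret
```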
| - --cookie-secret-file=/var/run/secrets/kubernetes.io/serviceaccount/token | ||
| - --upstream-timeout=5m |
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "frontend-oauth-deployment-patch.yaml" -o -name "template-services.yaml" | head -20
Repository: ambient-code/platform
Length of output: 204
🏁 Script executed:
# Let's also check the structure of oauth-proxy related files
find . -type d -name "oauth-proxy" 2>/dev/null
Repository: ambient-code/platform
Length of output: 111
🏁 Script executed:
# Search for oauth-proxy configurations and references
rg "cookie-secret" --type yaml -A 2 -B 2
Repository: ambient-code/platform
Length of output: 2838
🏁 Script executed:
# Look for any oauth-proxy documentation or setup patterns in the codebase
rg "oauth.*secret" -i --type yaml
Repository: ambient-code/platform
Length of output: 2097
🏁 Script executed:
cat -n ./components/manifests/components/oauth-proxy/frontend-oauth-deployment-patch.yaml
Repository: ambient-code/platform
Length of output: 3036
🏁 Script executed:
cat -n ./components/manifests/templates/template-services.yaml | head -100
Repository: ambient-code/platform
Length of output: 3299
🏁 Script executed:
cat -n ./components/manifests/overlays/production/frontend-oauth-patch.yaml
Repository: ambient-code/platform
Length of output: 2660
Use a dedicated Secret for cookie signing instead of the service account token.
The service account token at /var/run/secrets/kubernetes.io/serviceaccount/token rotates on pod/token updates, changing the cookie secret and invalidating active sessions. OAuth-proxy requires a stable signing key across restarts. Adopt the dedicated Secret pattern used in overlays/production/frontend-oauth-patch.yaml and template-services.yaml, which mount a persistent Secret volume.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@components/manifests/components/oauth-proxy/frontend-oauth-deployment-patch.yaml`
around lines 34 - 35, Replace use of the rotating service account token for
cookie signing by creating and mounting a stable Kubernetes Secret and pointing
the oauth-proxy flag --cookie-secret-file at the secret-backed file instead of
/var/run/secrets/kubernetes.io/serviceaccount/token; update the manifest lines
that currently set --cookie-secret-file and keep --upstream-timeout as-is, add a
volume and volumeMount referencing the new Secret (with the secret key
containing the cookie signing key) so the oauth-proxy process reads a persistent
signing key across restarts.
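The dedicated-Secret pattern could be sketched as below, assuming a pre-created Secret named `frontend-oauth-cookie` (hypothetical) whose `cookie-secret` key holds a stable random value, e.g. generated once with `openssl rand -base64 32`:

```yaml
spec:
  template:
    spec:
      containers:
        - name: oauth-proxy
          args:
            - --cookie-secret-file=/etc/oauth/cookie/cookie-secret
          volumeMounts:
            - name: oauth-cookie
              mountPath: /etc/oauth/cookie
              readOnly: true
      volumes:
        - name: oauth-cookie
          secret:
            secretName: frontend-oauth-cookie
```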
| # Konflux image overrides (redhat-services-prod) | ||
| images: | ||
| - name: quay.io/ambient_code/vteam_operator | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-operator-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_operator:latest | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-operator-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_backend | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-backend-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_backend:latest | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-backend-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_frontend | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-frontend-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_frontend:latest | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-frontend-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_public_api | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-public-api-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_public_api:latest | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-public-api-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_api_server | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-ambient-api-server-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_api_server:latest | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-ambient-api-server-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_claude_runner | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-ambient-runner-main | ||
| newTag: latest | ||
| - name: quay.io/ambient_code/vteam_claude_runner:latest | ||
| newName: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-ambient-runner-main | ||
| newTag: latest |
Pin immutable image tags in this overlay.
Every override here uses newTag: latest, which makes the app-interface deployment non-reproducible and allows unreviewed image drift without any manifest change. Please pin a release tag or digest per component instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@components/manifests/overlays/app-interface/kustomization.yaml` around lines
84 - 121, The overlay kustomization.yaml currently pins every image with newTag:
latest (e.g., entries for quay.io/ambient_code/vteam_operator, vteam_backend,
vteam_frontend, vteam_public_api, vteam_api_server, vteam_claude_runner and
their ":latest" duplicates); change each newTag from "latest" to an immutable
release tag or, preferably, a digest (sha256) for that exact image (or replace
newName with the full image@sha256:... form), ensuring both the plain and
":latest" name overrides are updated or deduplicated so the overlay references a
reproducible, reviewable image per component.
| - name: AMBIENT_CODE_RUNNER_IMAGE | ||
| value: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-ambient-runner-main:latest |
Avoid mutable :latest for runner image.
Line 13 uses a mutable tag, which makes runner behavior non-deterministic across deploys and weakens rollback/auditability. Pin to an immutable version tag or digest.
Proposed fix
- value: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-ambient-runner-main:latest
+ value: quay.io/redhat-services-prod/hcm-eng-prod-tenant/ambient-code-main/ambient-code-ambient-runner-main@sha256:<pinned-digest>
🧰 Tools
🪛 Checkov (3.2.526)
[medium] 2-13: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 2-13: Minimize the admission of root containers
(CKV_K8S_23)
🪛 Trivy (0.69.3)
[error] 10-13: Root file system is not read-only
Container 'agentic-operator' of Deployment 'agentic-operator' should set 'securityContext.readOnlyRootFilesystem' to true
Rule: KSV-0014
(IaC/Kubernetes)
[error] 10-13: Default security context configured
container agentic-operator in default namespace is using the default security context
Rule: KSV-0118
(IaC/Kubernetes)
[error] 8-13: Default security context configured
deployment agentic-operator in default namespace is using the default security context, which allows root privileges
Rule: KSV-0118
(IaC/Kubernetes)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@components/manifests/overlays/app-interface/operator-runner-image-patch.yaml`
around lines 12 - 13, The environment variable AMBIENT_CODE_RUNNER_IMAGE is
pinned to the mutable tag ":latest" which makes deployments non-deterministic;
replace the value string in operator-runner-image-patch.yaml for
AMBIENT_CODE_RUNNER_IMAGE with an immutable reference (either a specific version
tag like ":v1.2.3" or a content-addressable digest "@" format such as
"@sha256:..."). Locate the AMBIENT_CODE_RUNNER_IMAGE entry and update its value
to the chosen immutable tag or digest, then verify image pull succeeds and
update any release notes/CI that publish or reference the runner image.
| # SSL required: Using external AWS RDS database | ||
| - --db-sslmode=require |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify current DB TLS posture across manifests and server flags (read-only).
set -euo pipefail
echo "== DB sslmode flags in manifests =="
rg -n --type=yaml -C2 'db-sslmode='
echo
echo "== Any ssl root CA / cert verification related flags in repo =="
rg -n -C2 'db-sslroot|sslrootcert|verify-full|verify-ca|rds-ca|ca-cert'
Repository: ambient-code/platform
Length of output: 15600
🏁 Script executed:
cat -n components/manifests/overlays/production/ambient-api-server-jwt-args-patch.yaml
Repository: ambient-code/platform
Length of output: 3641
Add database certificate verification to production RDS configuration.
Line 37 uses --db-sslmode=require, which encrypts DB traffic but does not verify the server identity. For production RDS, switch to verify-ca or verify-full and supply the RDS CA bundle via --db-sslrootcert (or equivalent environment variable). The platform already supports these modes per the template parameter documentation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@components/manifests/overlays/production/ambient-api-server-jwt-args-patch.yaml`
around lines 36 - 37, Replace the insecure flag `--db-sslmode=require` with a
verification mode and provide the RDS CA bundle: change `--db-sslmode=require`
to `--db-sslmode=verify-full` (or `verify-ca` if hostname verification is
undesirable) and add a `--db-sslrootcert=/path/to/rds-ca.pem` argument; also
mount the RDS CA (from the existing template parameter/secret/configmap) into
the container and point the `--db-sslrootcert` to that mounted file so the DB
client actually verifies the server certificate.
| initContainers: | ||
| - name: migration | ||
| command: | ||
| - /usr/local/bin/ambient-api-server | ||
| - migrate | ||
| - --db-host-file=/secrets/db/db.host | ||
| - --db-port-file=/secrets/db/db.port | ||
| - --db-user-file=/secrets/db/db.user | ||
| - --db-password-file=/secrets/db/db.password | ||
| - --db-name-file=/secrets/db/db.name | ||
| - --db-sslmode=require | ||
| - --alsologtostderr | ||
| - -v=4 |
Missing resources and securityContext on production init container.
The migration init container lacks both a resources block (required by guidelines) and a securityContext. In production this means: no CPU/memory guardrails, root filesystem writable, and privilege escalation not blocked.
🛡️ Proposed addition
initContainers:
- name: migration
+ securityContext:
+ allowPrivilegeEscalation: false
+ readOnlyRootFilesystem: true
+ runAsNonRoot: true
+ resources:
+ requests:
+ cpu: "100m"
+ memory: "128Mi"
+ limits:
+ cpu: "500m"
+ memory: "256Mi"
+ command:
As per coding guidelines: "Resource limits/requests required on containers."
🧰 Tools
🪛 Checkov (3.2.526)
[medium] 2-21: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
🪛 Trivy (0.69.3)
[error] 10-21: Root file system is not read-only
Container 'migration' of Deployment 'ambient-api-server' should set 'securityContext.readOnlyRootFilesystem' to true
Rule: KSV-0014
(IaC/Kubernetes)
[error] 10-21: Default security context configured
container ambient-api-server in default namespace is using the default security context
Rule: KSV-0118
(IaC/Kubernetes)
[error] 8-21: Default security context configured
deployment ambient-api-server in default namespace is using the default security context, which allows root privileges
Rule: KSV-0118
(IaC/Kubernetes)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@components/manifests/overlays/production/ambient-api-server-migration-ssl-patch.yaml`
around lines 9 - 21, The init container named "migration" is missing a resources
block and a securityContext; add a resources section on the migration
initContainer with required requests and limits for cpu and memory (both
requests and limits) per your guideline, and add a securityContext to the same
container (runAsNonRoot: true, privileged: false, allowPrivilegeEscalation:
false, readOnlyRootFilesystem: true, and drop all capabilities) so the init
container has CPU/memory guardrails and restricted privileges.
| initContainers: | ||
| - command: | ||
| - /usr/local/bin/ambient-api-server | ||
| - migrate | ||
| - --db-host-file=/secrets/db/db.host | ||
| - --db-port-file=/secrets/db/db.port | ||
| - --db-user-file=/secrets/db/db.user | ||
| - --db-password-file=/secrets/db/db.password | ||
| - --db-name-file=/secrets/db/db.name | ||
| - --db-sslmode=require | ||
| - --alsologtostderr | ||
| - -v=4 | ||
| image: ${IMAGE_AMBIENT_API_SERVER}:${IMAGE_TAG} | ||
| imagePullPolicy: Always | ||
| name: migration | ||
| securityContext: | ||
| allowPrivilegeEscalation: false | ||
| capabilities: | ||
| drop: | ||
| - ALL | ||
| readOnlyRootFilesystem: false | ||
| volumeMounts: | ||
| - mountPath: /secrets/db | ||
| name: db-secrets | ||
| serviceAccountName: ambient-api-server |
Add requests/limits to the migration initContainer.
The migration initContainer has no resources block, so scheduling and eviction behavior falls back to cluster defaults. Please set small requests/limits here, and mirror that in the sibling init-db patches using the same pattern, as the coding guidelines for components/manifests/**/*.yaml require resource limits/requests on all containers.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@components/manifests/templates/template-services.yaml` around lines 304 -
328, The migration initContainer named "migration" is missing resource
requests/limits; add a small resources block (cpu and memory requests and
limits) to the migration initContainer and apply the same pattern to the sibling
init-db initContainers so they mirror each other; locate the initContainer with
name: migration and add a resources: section (requests: cpu/memory and limits:
cpu/memory) with conservative values appropriate for short-lived migrations to
ensure proper scheduling and eviction behavior.
Increasing `INITIAL_PROMPT_DELAY_SECONDS` to 10 seconds. Currently, 2 seconds is not enough time for things to come up, and the initial prompt fails if a user tries to send a query.
Summary by CodeRabbit
New Features
Bug Fixes
Chores