Commit addebcb

jmagak and GitHub Actions authored
RHIDP-9238: Troubleshooting Guide (#1628)
* Troubleshooting Guide * Troubleshooting Guide * Troubleshooting guide * Troubleshooting guide * Troubleshooting guide * Troubleshooting guide * Apply new suggestions * Apply new suggestions * Apply new suggestions * Apply suggestions * Apply suggestions * Apply new suggestions * Apply sugegstions * Fix indentatation --------- Co-authored-by: GitHub Actions <[email protected]>
1 parent 224a4d9 commit addebcb

6 files changed

+350
-0
lines changed
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
3+
[id="assembly-orchestrator-troubleshooting-serverless-workflows_{context}"]
4+
5+
= Diagnose and resolve serverless workflow issues
6+
7+
Use the following information to diagnose and resolve serverless workflow and visibility issues.
8+
9+
// HTTP error codes in workflows
10+
include::modules/orchestrator/ref-troubleshoot-workflow-http-error-codes.adoc[leveloffset=+1]
11+
12+
// Workflow errors
13+
include::modules/orchestrator/proc-workflow-deployment-errors.adoc[leveloffset=+1]
14+
15+
// Common SonataFlow configuration issues
16+
include::modules/orchestrator/proc-troubleshoot-sonataflow-cross-namespace-issues.adoc[leveloffset=+1]
17+
18+
// Troubleshooting workflows missing from the {product-very-short} UI
19+
include::modules/orchestrator/proc-troubleshooting-missing-workflows.adoc[leveloffset=+1]
Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
1+
:_mod-docs-content-type: PROCEDURE
2+
3+
[id="proc-troubleshoot-sonataflow-cross-namespace-issues_{context}"]
4+
= Troubleshooting cross-namespace SonataFlow configuration and deployment issues
5+
6+
Use this procedure to resolve configuration and deployment failures when SonataFlow workflows are installed in a namespace separate from the core services, or if the Data Index fails to connect to the PostgreSQL database.
7+
8+
.Prerequisites
9+
* You have administrator privileges to access the OpenShift cluster.
10+
11+
.Procedure
12+
13+
. Identify required namespaces.
14+
15+
* Retrieve the namespace value where {product-very-short} is running using `oc get backstage -A`.
16+
17+
* Identify the SonataFlow Services Namespace by checking for either a `sonataflowclusterplatform` or `sonataflowplatform` instance.
18+
+
19+
[NOTE]
20+
====
21+
In the default configuration, the SonataFlow namespace is the same as the {product-very-short} namespace.
22+
====
23+
24+
. If the workflow is deployed to a namespace outside the core SonataFlow services, configure network policies to permit the necessary inter-namespace traffic.
25+
+
26+
[source,subs="+attributes,+quotes"]
27+
----
# Example `NetworkPolicy` configuration to allow ingress traffic into the workflow namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: {{ .Release.Name }}-allow-infra-ns-to-workflow-ns
  # SonataFlow and workflows use the {product-very-short} target namespace.
  namespace: {{ .Release.Namespace | quote }}
spec:
  podSelector: {}
  ingress:
    - from:
      - namespaceSelector:
          matchLabels:
            # Allow Knative events to be delivered to workflows.
            kubernetes.io/metadata.name: knative-eventing
      - namespaceSelector:
          matchLabels:
            # Allow auxiliary Knative functions for workflows (such as m2k-save-transformation).
            kubernetes.io/metadata.name: knative-serving
      - namespaceSelector:
          matchLabels:
            # Allow communication between the serverless logic operator and the workflow namespace.
            kubernetes.io/metadata.name: openshift-serverless-logic
----
52+
53+
. Add the `SonataFlowClusterPlatform` custom resource (CR) as shown in the following configuration:
54+
+
55+
[source,bash]
----
oc create -f - <<EOF
apiVersion: sonataflow.org/v1alpha08
kind: SonataFlowClusterPlatform
metadata:
  name: cluster-platform
spec:
  platformRef:
    name: sonataflow-platform
    namespace: $RHDH_NAMESPACE
EOF
----
67+
68+
. To allow communication between {product-very-short} namespace and the workflow namespace, create the following network policies:
69+
70+
.. Allow {product-very-short} services to accept traffic from workflows. Create an additional network policy within the {product-very-short} instance namespace as shown in the following configuration:
71+
+
72+
[source,bash]
----
oc create -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external-workflows-to-rhdh
  # Namespace where network policies are deployed
  namespace: $RHDH_NAMESPACE
spec:
  podSelector: {}
  ingress:
    - from:
      - namespaceSelector:
          matchLabels:
            # Allow SonataFlow services to communicate with the new or additional workflow namespace.
            kubernetes.io/metadata.name: $ADDITIONAL_WORKFLOW_NAMESPACE
EOF
----
90+
91+
.. Allow traffic from {product-very-short}, SonataFlow and Knative. Create a network policy within the additional workflow namespace as shown in the following configuration:
92+
+
93+
[source,bash]
----
oc create -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-rhdh-and-knative-to-workflows
  namespace: $ADDITIONAL_WORKFLOW_NAMESPACE
spec:
  podSelector: {}
  ingress:
    - from:
      - namespaceSelector:
          matchLabels:
            # Allows traffic from pods in the {product-very-short} namespace.
            kubernetes.io/metadata.name: $RHDH_NAMESPACE
      - namespaceSelector:
          matchLabels:
            # Allows traffic from pods in the Knative Eventing namespace.
            kubernetes.io/metadata.name: knative-eventing
      - namespaceSelector:
          matchLabels:
            # Allows traffic from pods in the Knative Serving namespace.
            kubernetes.io/metadata.name: knative-serving
EOF
----
118+
119+
. (Optional) Create an `allow-intra-namespace` policy in the workflow namespace to enable unrestricted communication among all pods within that namespace.
120+
121+
. If workflow persistence is required, perform the following configuration steps:
122+
123+
.. Create a dedicated PostgreSQL secret that contains the database credentials in the workflow namespace by copying the existing secret, as shown in the following commands:
124+
+
125+
[source,bash]
----
oc get secret sonataflow-psql-postgresql -n <your_namespace> -o yaml > secret.yaml
sed -i '/namespace: <your_namespace>/d' secret.yaml
oc apply -f secret.yaml -n $ADDITIONAL_WORKFLOW_NAMESPACE
----
131+
132+
.. Configure the workflow `serviceRef` property to correctly reference the PostgreSQL service namespace as shown in the following configuration:
133+
+
134+
[source,yaml]
135+
----
apiVersion: sonataflow.org/v1alpha08
kind: SonataFlow
...
spec:
  ...
  persistence:
    postgresql:
      secretRef:
        name: sonataflow-psql-postgresql
        passwordKey: postgres-password
        userKey: postgres-username
      serviceRef:
        databaseName: sonataflow
        databaseSchema: greeting
        name: sonataflow-psql-postgresql
        namespace: $POSTGRESQL_NAMESPACE
        port: 5432
----
154+
+
155+
`namespace`::
156+
Enter the namespace where the PostgreSQL server is deployed.
157+
158+
. If the `sonataflow-platform-data-index-service` cannot connect to the PostgreSQL database on startup, perform the following diagnostic checks:
159+
160+
.. Verify that the PostgreSQL pod has fully transitioned to the `Running` state and is operational.
161+
Allow additional time for database initialization before expecting related service pods (`DataIndex`, `JobService`) to establish a connection.
162+
163+
.. If the PostgreSQL server runs in a dedicated namespace (for example, outside the {product-very-short} namespace), verify that network policies allow ingress traffic from the SonataFlow services namespace. Restrictive network policies might prevent the Data Index and Job Service pods from connecting to the database.
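
The network policy for the additional workflow namespace in the procedure above can be generated from two variables instead of being edited by hand. The following is a minimal sketch, not part of the documented procedure: the function name `render_workflow_netpol` and the example namespace values `rhdh` and `my-workflows` are hypothetical, and the manifest mirrors the `allow-rhdh-and-knative-to-workflows` policy shown earlier.

```shell
#!/usr/bin/env bash
# Sketch: render the workflow-namespace NetworkPolicy from the procedure.
# render_workflow_netpol is a hypothetical helper name, not a product command.
render_workflow_netpol() {
  local rhdh_ns="$1" workflow_ns="$2"
  # The unquoted heredoc expands the two shell variables into the manifest.
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-rhdh-and-knative-to-workflows
  namespace: ${workflow_ns}
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ${rhdh_ns}
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: knative-eventing
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: knative-serving
EOF
}

# Print the manifest for review; the namespace values are examples only.
render_workflow_netpol "rhdh" "my-workflows"
```

To check the rendered manifest without changing the cluster, one option is to pipe the output to `oc apply --dry-run=client -f -` before applying it.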
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
1+
:_mod-docs-content-type: PROCEDURE
2+
3+
[id="proc-troubleshooting-missing-workflows_{context}"]
4+
= Troubleshooting workflows missing from the {product-very-short} UI
5+
6+
You can perform the following checks to verify the workflow status and connectivity when the deployed workflow is missing from the {product-very-short} Orchestrator UI.
7+
8+
.Prerequisites
9+
10+
* You have administrator privileges to access the OpenShift cluster where {product-very-short} and SonataFlow services are running.
11+
12+
.Procedure
13+
14+
. Verify that the workflow uses the GitOps profile. The {product-very-short} Orchestrator UI displays only workflows that use this profile, so make sure that both the workflow definition and the SonataFlow manifests use it.
15+
16+
. Verify that the workflow pod has started and is ready. The readiness of a workflow pod depends on its successful registration with the Data Index. When a workflow initializes, it performs the following actions:
17+
.. It attempts to create its schema in the database (if persistence is active).
18+
.. It attempts to register itself to the Data Index. The workflow pod remains in an unready state until it successfully registers to the Data Index.
19+
+
20+
Check the workflow deployment for additional status and error messages that might be unavailable in the pod log.
21+
22+
. Check whether the workflow pod can reach the Data Index service. Connect to the workflow pod and send the following GraphQL request to the Data Index:
23+
+
24+
[source,subs="+attributes,+quotes"]
25+
----
26+
curl -g -k -X POST -H "Content-Type: application/json" \
27+
-d '{"query":"query{ ProcessDefinitions { id, serviceUrl, endpoint } }"}' \
28+
http://sonataflow-platform-data-index-service.<your_namespace>/graphql
29+
----
30+
+
31+
Replace `<your_namespace>` with the namespace where the Data Index service runs. By default, this is the namespace where {product-very-short} is installed. If your SonataFlow resources are installed in a separate namespace, use that namespace instead.
+
Then check whether the {product-very-short} pod can reach the workflow service by running the following command:
32+
+
33+
[source,bash]
34+
----
35+
curl http://<workflow_service>.<workflow_namespace>/management/processes
36+
----
37+
38+
. Connect to the {product-very-short} pod. Verify its connection to the Data Index service and inspect the {product-very-short} pod logs for messages from the Orchestrator plugin.
39+
+
40+
To inspect the logs, identify the {product-very-short} pod and run the `oc logs` command as follows:
41+
+
42+
[source,terminal]
43+
----
44+
oc get pods -n <your_namespace>
45+
oc logs <rhdh_pod_name> -n <your_namespace>
46+
----
47+
+
48+
The logs should contain messages indicating that the Orchestrator plugin is attempting to fetch workflow information from the Data Index, similar to the following:
49+
+
50+
[source,json]
51+
----
52+
{"level":"\u001b[32minfo\u001b[39m","message":"fetchWorkflowInfos() called: http://sonataflow-platform-data-index-service.<your_namespace>","plugin":"orchestrator","service":"backstage","span_id":"fca4ab29f0a7aef9","timestamp":"2025-08-04 17:58:26","trace_flags":"01","trace_id":"5408d4b06373ff8fb34769083ef771dd"}
53+
----
54+
+
55+
You can filter the log output on the `"plugin":"orchestrator"` field to find these messages.
56+
57+
. Make sure the Data Index properties are set in the `-managed-props` ConfigMap of the workflow as shown in the following configuration:
58+
+
59+
[source,properties]
60+
----
61+
kogito.data-index.health-enabled = true
62+
kogito.data-index.url = http://sonataflow-platform-data-index-service.<your_namespace>
63+
...
64+
mp.messaging.outgoing.kogito-processdefinitions-events.url = http://sonataflow-platform-data-index-service.<your_namespace>/definitions
65+
mp.messaging.outgoing.kogito-processinstances-events.url = http://sonataflow-platform-data-index-service.<your_namespace>/processes
66+
----
67+
+
68+
[NOTE]
69+
====
70+
The `-managed-props` ConfigMap is located in the same namespace as the workflow and is generated by the OpenShift Serverless Logic (OSL) Operator.
71+
====
72+
+
73+
These properties, along with similar settings for the Job Service, indicate that the OSL Operator successfully registered the Data Index service.
74+
75+
. Confirm that the workflow is registered in the Data Index database. Connect to the database used by the Data Index by running the following command from the PostgreSQL instance pod:
76+
+
77+
[source,bash]
78+
----
79+
PGPASSWORD=<psql password> psql -h localhost -p 5432 -U <user> -d sonataflow
80+
----
81+
+
82+
Replace `<psql password>` and `<user>` with your database credentials.
83+
+
84+
Run the following SQL commands to query the registered workflow definitions:
85+
+
86+
[source,sql]
87+
----
88+
sonataflow=# SET search_path TO "sonataflow-platform-data-index-service";
89+
sonataflow=# select id, name from definitions;
90+
----
91+
+
92+
Your workflows should be listed in the query results.
93+
94+
. Make sure you have enabled Data Index and Job Service in the `SonataFlowPlatform` custom resource (CR) as shown in the following configuration:
95+
+
96+
[source,yaml]
97+
----
services:
  dataIndex:
    enabled: true
  jobService:
    enabled: true
----
104+
+
105+
If the Data Index and the Job Service are not enabled in the `SonataFlowPlatform` custom resource (CR), the Orchestrator plugin fails to fetch the available workflows.
106+
+
107+
[NOTE]
108+
====
109+
You can also manually edit the `SonataFlowPlatform` CR instance to trigger the re-creation of workflow-related manifests.
110+
====
111+
112+
. Set the RBAC permissions correctly. For more information, see {authorization-book-link}#managing-authorizations-by-using-the-rest-api[RBAC documentation].
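
The connectivity checks in this procedure repeat the same service URL and log filter, so they can be wrapped in small helpers. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the service name is the default `sonataflow-platform-data-index-service` used in this guide, and `query_process_definitions` is meant to be run from inside the workflow pod.

```shell
#!/usr/bin/env bash
# Sketch: helpers for the connectivity checks in this procedure.
# All function names are hypothetical; adjust the service name if yours differs.

# Build the Data Index GraphQL URL for a given namespace.
data_index_graphql_url() {
  local namespace="$1"
  printf 'http://sonataflow-platform-data-index-service.%s/graphql\n' "$namespace"
}

# Keep only log lines emitted by the Orchestrator plugin (reads stdin).
filter_orchestrator_logs() {
  grep -F '"plugin":"orchestrator"'
}

# Query the registered process definitions from the Data Index.
query_process_definitions() {
  local namespace="$1"
  curl -g -X POST -H "Content-Type: application/json" \
    -d '{"query":"query{ ProcessDefinitions { id, serviceUrl, endpoint } }"}' \
    "$(data_index_graphql_url "$namespace")"
}
```

For example, `oc logs <rhdh_pod_name> -n <your_namespace> | filter_orchestrator_logs` narrows the {product-very-short} pod log to the Orchestrator plugin messages.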
113+
114+
[role="_additional-resources"]
115+
.Additional resources
116+
117+
* {monitoring-and-logging-book-link}#configuring-the-application-log-level-by-using-the-operator_assembly-rhdh-observability[Configuring the application log level by using the {product} Operator]
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
:_mod-docs-content-type: PROCEDURE
2+
3+
[id="proc-workflow-deployment-errors_{context}"]
4+
= Troubleshooting common workflow deployment errors
5+
6+
Use these steps to diagnose and resolve common workflow deployment, connectivity, or configuration failures.
7+
8+
.Procedure
9+
10+
. If the workflow operation fails, examine the container log of the specific workflow instance to determine the cause by running the following command:
11+
+
12+
[source,terminal]
13+
----
14+
$ oc logs my-workflow-xy73lj
15+
----
16+
17+
. If the workflow fails to reach an HTTPS endpoint, check the pod log for an SSL certificate verification failure. This occurs if the target endpoint uses a Certificate Authority (CA) that the workflow cannot verify. The resulting error resembles the following:
18+
+
19+
[source,text]
20+
----
21+
sun.security.provider.certpath.SunCertPathBuilderException - unable to find valid certification path to requested target
22+
----
23+
24+
. To resolve the SSL certificate error, load the additional CA certificate into the running workflow container.
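
Scanning a long pod log for the certificate error by eye is error-prone, so the check can be scripted. The following is a minimal sketch: `has_cert_path_error` is a hypothetical helper name, and it matches only the error text quoted in the step above.

```shell
#!/usr/bin/env bash
# Sketch: detect the certificate-path error described above in a workflow pod log.
# has_cert_path_error is a hypothetical helper; it reads log text from stdin
# and succeeds (exit status 0) only when the error text is present.
has_cert_path_error() {
  grep -q -F 'unable to find valid certification path to requested target'
}

# Example with a captured log line; in practice, pipe the pod log into the function.
sample='sun.security.provider.certpath.SunCertPathBuilderException - unable to find valid certification path to requested target'
if printf '%s\n' "$sample" | has_cert_path_error; then
  echo "certificate verification failure found"
fi
```

Against a live pod, the same check would be `oc logs my-workflow-xy73lj | has_cert_path_error`, using the pod name from the earlier step.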
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
1+
:_mod-docs-content-type: REFERENCE
2+
3+
[id="ref-troubleshoot-workflow-http-error-codes_{context}"]
4+
= Troubleshoot workflow HTTP error codes
5+
6+
Workflow operations fail when a service endpoint returns an HTTP error code. The user interface displays the HTTP code and error message. See link:https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status[external documentation] for a complete list of HTTP status code meanings.
7+
8+
The following table lists common HTTP errors encountered during workflow execution:
9+
10+
[cols="25%,25%,50%", frame="all", options="header"]
11+
|===
12+
|HTTP code|Description|Possible cause
13+
14+
|`401`
15+
|Unauthorized access
16+
|The token, password, or username provided for the endpoint might be incorrect or expired.
17+
18+
|`403`
19+
|Forbidden
20+
|The server understood the request but refused to process it because of insufficient permissions for the resource or action.
21+
22+
|`409`
23+
|Conflict
24+
|The workflow attempted to create or update a resource (for example, Kubernetes or OpenShift resources) that already exists.
25+
|===
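
The table above can also be expressed as a small lookup for use in scripts that triage failed workflow runs. The following is a minimal sketch: `http_error_hint` is a hypothetical helper name, and the hints only restate the possible causes listed in the table.

```shell
#!/usr/bin/env bash
# Sketch: map the HTTP codes from the table above to a first troubleshooting hint.
# http_error_hint is a hypothetical helper, not part of the product.
http_error_hint() {
  case "$1" in
    401) echo "Unauthorized: check whether the token, password, or username is incorrect or expired." ;;
    403) echo "Forbidden: check the permissions for the resource or action." ;;
    409) echo "Conflict: check whether the resource already exists." ;;
    *)   echo "Not in the table: see the HTTP status code reference." ;;
  esac
}

# Example call with one of the codes from the table.
http_error_hint 409
```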

titles/orchestrator/master.adoc

Lines changed: 2 additions & 0 deletions
@@ -20,4 +20,6 @@ include::assemblies/assembly-install-rhdh-orchestrator-plugin-in-an-air-gapped-e
20 20

21 21
include::assemblies/assembly-building-and-deploying-serverless-workflows.adoc[leveloffset=+1]
22 22

23+
include::assemblies/assembly-orchestrator-troubleshooting-serverless-workflows.adoc[leveloffset=+1]
24+

23 25
include::assemblies/assembly-orchestrator-plugins-technical-appendixes.adoc[leveloffset=+1]
