gRPC via nginx ingress #11882
The AgentService gRPC client receives RST_STREAM from the server; the client doesn't know any other details. You should check the Prometheus
Thanks for the reply, @kannanjgithub. I've attached the server log.
Server log
14:24:06.400 WARN [ProxyPathManager.kt:139] - Missing agent context for agentId: 2483 (Termination) [grpc-nio-worker-ELG-3-3]
14:24:06.401 WARN [AgentContextManager.kt:53] - Missing AgentContext for agentId: 2483 (Termination) [grpc-nio-worker-ELG-3-3]
14:24:06.401 INFO [ProxyServerTransportFilter.kt:46] - Disconnected with invalid agentId: 2483 [grpc-nio-worker-ELG-3-3]
14:24:08.070 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2486 (Termination) [grpc-nio-worker-ELG-3-1]
14:24:08.070 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2486, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=59.998507193s} for agentId: 2486 (Termination) [grpc-nio-worker-ELG-3-1]
14:24:08.070 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2486, launchId=Unassigned, consolidated=false, valid=false, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=59.998573739s} [grpc-nio-worker-ELG-3-1]
14:24:11.897 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2485 (Termination) [grpc-nio-worker-ELG-3-2]
14:24:11.897 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2485, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.825870393s} for agentId: 2485 (Termination) [grpc-nio-worker-ELG-3-2]
14:24:11.897 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2485, launchId=Unassigned, consolidated=false, valid=false, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.825934448s} [grpc-nio-worker-ELG-3-2]
14:24:11.898 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2484 (Termination) [grpc-nio-worker-ELG-3-4]
14:24:11.898 INFO [ProxyPathManager.kt:150] - Removed path /test-agent-1 for AgentContextInfo(consolidated=false, labels={},agentContexts=[AgentContext{agentId=2484, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=15.745416063s}]) [grpc-nio-worker-ELG-3-4]
14:24:11.898 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2484, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=15.745473072s} for agentId: 2484 (Termination) [grpc-nio-worker-ELG-3-4]
14:24:11.898 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2484, launchId=CRudPRd53VSWxFS, consolidated=false, valid=false, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=15.745532372s} [grpc-nio-worker-ELG-3-4]
14:24:12.453 INFO [AgentContextManager.kt:40] - Registering agentId: 2488 [grpc-nio-worker-ELG-3-4]
14:24:12.651 INFO [ProxyServiceImpl.kt:96] - Connected to AgentContext{agentId=2488, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=197.759281ms} [DefaultDispatcher-worker-5]
14:24:12.819 INFO [ProxyPathManager.kt:82] - Added path /test-agent-1 for AgentContext{agentId=2488, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=366.072715ms} [DefaultDispatcher-worker-6]
14:24:13.043 INFO [AgentContextManager.kt:40] - Registering agentId: 2490 [grpc-nio-worker-ELG-3-2]
14:24:13.043 INFO [AgentContextManager.kt:40] - Registering agentId: 2489 [grpc-nio-worker-ELG-3-1]
14:24:16.738 INFO [AgentContextCleanupService.kt:50] - Evicting agentId 2487 after 1m 3.664075204s (max 1m) of inactivity: AgentContext{agentId=2487, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.664095634s} [AgentContextCleanupService]
14:24:16.738 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2487 (Eviction) [AgentContextCleanupService]
14:24:16.738 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2487, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.664272109s} for agentId: 2487 (Eviction) [AgentContextCleanupService]
14:24:18.003 INFO [AgentContextManager.kt:40] - Registering agentId: 2491 [grpc-nio-worker-ELG-3-3]
14:24:26.111 INFO [CallLogging.kt:45] - 200 OK: GET - /test-agent-1 - prometheus-kube-prometheus-stack-prometheus-0.prometheus-operated.monitoring.svc.cluster.local [DefaultDispatcher-worker-4]
14:24:56.140 INFO [CallLogging.kt:45] - 200 OK: GET - /test-agent-1 - prometheus-kube-prometheus-stack-prometheus-0.prometheus-operated.monitoring.svc.cluster.local [DefaultDispatcher-worker-6]
14:25:11.935 WARN [ProxyPathManager.kt:139] - Missing agent context for agentId: 2487 (Termination) [grpc-nio-worker-ELG-3-3]
14:25:11.935 WARN [AgentContextManager.kt:53] - Missing AgentContext for agentId: 2487 (Termination) [grpc-nio-worker-ELG-3-3]
14:25:11.935 INFO [ProxyServerTransportFilter.kt:46] - Disconnected with invalid agentId: 2487 [grpc-nio-worker-ELG-3-3]
14:25:13.042 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2489 (Termination) [grpc-nio-worker-ELG-3-1]
14:25:13.042 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2489, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=59.998934897s} for agentId: 2489 (Termination) [grpc-nio-worker-ELG-3-1]
14:25:13.042 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2489, launchId=Unassigned, consolidated=false, valid=false, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=59.999030482s} [grpc-nio-worker-ELG-3-1]
14:25:16.739 INFO [AgentContextCleanupService.kt:50] - Evicting agentId 2490 after 1m 3.696316872s (max 1m) of inactivity: AgentContext{agentId=2490, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.696337839s} [AgentContextCleanupService]
14:25:16.739 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2490 (Eviction) [AgentContextCleanupService]
14:25:16.739 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2490, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.696527758s} for agentId: 2490 (Eviction) [AgentContextCleanupService]
14:25:17.007 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2488 (Termination) [grpc-nio-worker-ELG-3-4]
14:25:17.007 WARN [ProxyPathManager.kt:139] - Missing agent context for agentId: 2490 (Termination) [grpc-nio-worker-ELG-3-2]
14:25:17.007 INFO [ProxyPathManager.kt:150] - Removed path /test-agent-1 for AgentContextInfo(consolidated=false, labels={},agentContexts=[AgentContext{agentId=2488, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=20.878498248s}]) [grpc-nio-worker-ELG-3-4]
14:25:17.007 WARN [AgentContextManager.kt:53] - Missing AgentContext for agentId: 2490 (Termination) [grpc-nio-worker-ELG-3-2]
14:25:17.008 INFO [ProxyServerTransportFilter.kt:46] - Disconnected with invalid agentId: 2490 [grpc-nio-worker-ELG-3-2]
14:25:17.008 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2488, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=20.878568563s} for agentId: 2488 (Termination) [grpc-nio-worker-ELG-3-4]
14:25:17.008 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2488, launchId=CRudPRd53VSWxFS, consolidated=false, valid=false, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=20.878878457s} [grpc-nio-worker-ELG-3-4]
14:25:17.617 INFO [AgentContextManager.kt:40] - Registering agentId: 2492 [grpc-nio-worker-ELG-3-4]
14:25:17.786 INFO [ProxyServiceImpl.kt:96] - Connected to AgentContext{agentId=2492, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=169.595291ms} [DefaultDispatcher-worker-4]
14:25:17.956 INFO [ProxyPathManager.kt:82] - Added path /test-agent-1 for AgentContext{agentId=2492, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=339.060823ms} [DefaultDispatcher-worker-4]
14:25:18.127 INFO [AgentContextManager.kt:40] - Registering agentId: 2493 [grpc-nio-worker-ELG-3-1]
14:25:18.127 INFO [AgentContextManager.kt:40] - Registering agentId: 2494 [grpc-nio-worker-ELG-3-2]
14:25:23.133 INFO [AgentContextManager.kt:40] - Registering agentId: 2495 [grpc-nio-worker-ELG-3-3]
14:25:26.269 INFO [CallLogging.kt:45] - 200 OK: GET - /test-agent-1 - prometheus-kube-prometheus-stack-prometheus-0.prometheus-operated.monitoring.svc.cluster.local [DefaultDispatcher-worker-8]
14:25:26.740 INFO [AgentContextCleanupService.kt:50] - Evicting agentId 2491 after 1m 8.736333250s (max 1m) of inactivity: AgentContext{agentId=2491, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 8.736353560s} [AgentContextCleanupService]
14:25:26.740 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2491 (Eviction) [AgentContextCleanupService]
14:25:26.740 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2491, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 8.736528364s} for agentId: 2491 (Eviction) [AgentContextCleanupService]
14:25:56.071 INFO [CallLogging.kt:45] - 200 OK: GET - /test-agent-1 - prometheus-kube-prometheus-stack-prometheus-0.prometheus-operated.monitoring.svc.cluster.local [DefaultDispatcher-worker-2]
14:26:16.778 WARN [ProxyPathManager.kt:139] - Missing agent context for agentId: 2491 (Termination) [grpc-nio-worker-ELG-3-3]
14:26:16.779 WARN [AgentContextManager.kt:53] - Missing AgentContext for agentId: 2491 (Termination) [grpc-nio-worker-ELG-3-3]
14:26:16.779 INFO [ProxyServerTransportFilter.kt:46] - Disconnected with invalid agentId: 2491 [grpc-nio-worker-ELG-3-3]
14:26:18.126 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2494 (Termination) [grpc-nio-worker-ELG-3-2]
14:26:18.126 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2494, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=59.999069096s} for agentId: 2494 (Termination) [grpc-nio-worker-ELG-3-2]
14:26:18.126 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2494, launchId=Unassigned, consolidated=false, valid=false, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=59.999183830s} [grpc-nio-worker-ELG-3-2]
14:26:22.017 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2493 (Termination) [grpc-nio-worker-ELG-3-1]
14:26:22.017 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2493, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.889918337s} for agentId: 2493 (Termination) [grpc-nio-worker-ELG-3-1]
14:26:22.017 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2493, launchId=Unassigned, consolidated=false, valid=false, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.890007082s} [grpc-nio-worker-ELG-3-1]
14:26:22.018 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2492 (Termination) [grpc-nio-worker-ELG-3-4]
14:26:22.018 INFO [ProxyPathManager.kt:150] - Removed path /test-agent-1 for AgentContextInfo(consolidated=false, labels={},agentContexts=[AgentContext{agentId=2492, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=25.957533718s}]) [grpc-nio-worker-ELG-3-4]
14:26:22.018 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2492, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=25.957604075s} for agentId: 2492 (Termination) [grpc-nio-worker-ELG-3-4]
14:26:22.018 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2492, launchId=CRudPRd53VSWxFS, consolidated=false, valid=false, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=25.957664659s} [grpc-nio-worker-ELG-3-4]
14:26:22.844 INFO [ProxyServiceImpl.kt:96] - Connected to AgentContext{agentId=2495, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=59.711243346s} [DefaultDispatcher-worker-2]
14:26:23.022 INFO [ProxyPathManager.kt:82] - Added path /test-agent-1 for AgentContext{agentId=2495, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=59.888860388s} [DefaultDispatcher-worker-2]
14:26:23.331 INFO [AgentContextManager.kt:40] - Registering agentId: 2496 [grpc-nio-worker-ELG-3-1]
14:26:23.331 INFO [AgentContextManager.kt:40] - Registering agentId: 2497 [grpc-nio-worker-ELG-3-4]
14:26:26.114 INFO [CallLogging.kt:45] - 200 OK: GET - /test-agent-1 - prometheus-kube-prometheus-stack-prometheus-0.prometheus-operated.monitoring.svc.cluster.local [DefaultDispatcher-worker-5]
14:26:31.262 INFO [AgentContextManager.kt:40] - Registering agentId: 2498 [grpc-nio-worker-ELG-3-2]
14:26:56.102 INFO [CallLogging.kt:45] - 200 OK: GET - /test-agent-1 - prometheus-kube-prometheus-stack-prometheus-0.prometheus-operated.monitoring.svc.cluster.local [DefaultDispatcher-worker-9]
14:27:23.285 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2497 (Termination) [grpc-nio-worker-ELG-3-4]
14:27:23.285 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2497, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=59.954055439s} for agentId: 2497 (Termination) [grpc-nio-worker-ELG-3-4]
14:27:23.285 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2497, launchId=Unassigned, consolidated=false, valid=false, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=59.954180971s} [grpc-nio-worker-ELG-3-4]
14:27:25.922 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2496 (Termination) [grpc-nio-worker-ELG-3-1]
14:27:25.922 INFO [ProxyPathManager.kt:141] - Removing paths for agentId: 2495 (Termination) [grpc-nio-worker-ELG-3-3]
14:27:25.923 INFO [ProxyPathManager.kt:150] - Removed path /test-agent-1 for AgentContextInfo(consolidated=false, labels={},agentContexts=[AgentContext{agentId=2495, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=29.831453486s}]) [grpc-nio-worker-ELG-3-3]
14:27:25.923 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2496, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 2.592003977s} for agentId: 2496 (Termination) [grpc-nio-worker-ELG-3-1]
14:27:25.923 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2496, launchId=Unassigned, consolidated=false, valid=false, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 2.592294197s} [grpc-nio-worker-ELG-3-1]
14:27:25.923 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2495, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=29.831954303s} for agentId: 2495 (Termination) [grpc-nio-worker-ELG-3-3]
14:27:25.923 INFO [ProxyServerTransportFilter.kt:46] - Disconnected from AgentContext{agentId=2495, launchId=CRudPRd53VSWxFS, consolidated=false, valid=false, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=29.832124716s} [grpc-nio-worker-ELG-3-3]
14:27:26.204 INFO [CallLogging.kt:45] - 503 Service Unavailable: GET - /test-agent-1 - prometheus-kube-prometheus-stack-prometheus-0.prometheus-operated.monitoring.svc.cluster.local [DefaultDispatcher-worker-1]
14:27:26.441 INFO [AgentContextManager.kt:40] - Registering agentId: 2499 [grpc-nio-worker-ELG-3-3]
14:27:26.639 INFO [ProxyServiceImpl.kt:96] - Connected to AgentContext{agentId=2499, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=198.444320ms} [DefaultDispatcher-worker-2]
14:27:26.807 INFO [ProxyPathManager.kt:82] - Added path /test-agent-1 for AgentContext{agentId=2499, launchId=CRudPRd53VSWxFS, consolidated=false, valid=true, agentName=test-agent-1, hostName=prometheus-proxy.service.net, remoteAddr=Unknown, lastRequestDuration=366.150346ms} [DefaultDispatcher-worker-6]
14:27:26.977 INFO [AgentContextManager.kt:40] - Registering agentId: 2500 [grpc-nio-worker-ELG-3-4]
14:27:26.977 INFO [AgentContextManager.kt:40] - Registering agentId: 2501 [grpc-nio-worker-ELG-3-1]
14:27:32.034 INFO [AgentContextManager.kt:40] - Registering agentId: 2502 [grpc-nio-worker-ELG-3-2]
Those are logs from your Prometheus proxy server that forwards to the gRPC backend service application. We would need the logs from the latter.
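For anyone reading the server log above, a quick way to summarise it is to pull the `lastRequestDuration` value out of each "Removed AgentContext" line. The sketch below is only an illustration: the function names and the `SAMPLE` list are hypothetical, the regular expressions assume the exact line layout shown in this thread, and the sample lines are copied from the log. Several idle contexts are removed at almost exactly 60 seconds, which may suggest a 60-second timeout somewhere on the path, although the log alone does not say where.

```python
import re

# Durations as they appear in the log: "59.998507193s", "1m 3.825870393s", "197.759281ms".
# Only minutes, seconds and milliseconds occur in this thread, so only those are handled.
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|m|s)")
_UNIT_SECONDS = {"m": 60.0, "s": 1.0, "ms": 0.001}

# Assumed layout of a "Removed AgentContext" line, taken from the log in this thread.
_REMOVED = re.compile(
    r"Removed AgentContext\{agentId=(\d+),.*?lastRequestDuration=([^}]+)\} "
    r"for agentId: \d+ \((\w+)\)"
)


def duration_seconds(text):
    """Convert a logged duration string into seconds."""
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in _PART.findall(text))


def removals(lines):
    """Yield (agent_id, seconds, reason) for every 'Removed AgentContext' line."""
    for line in lines:
        match = _REMOVED.search(line)
        if match:
            yield int(match.group(1)), duration_seconds(match.group(2)), match.group(3)


# Three lines copied verbatim from the server log above.
SAMPLE = [
    "14:24:08.070 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2486, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=59.998507193s} for agentId: 2486 (Termination) [grpc-nio-worker-ELG-3-1]",
    "14:24:11.897 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2485, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.825870393s} for agentId: 2485 (Termination) [grpc-nio-worker-ELG-3-2]",
    "14:24:16.738 INFO [AgentContextManager.kt:56] - Removed AgentContext{agentId=2487, launchId=Unassigned, consolidated=false, valid=true, agentName=Unassigned, hostName=Unassigned, remoteAddr=Unknown, lastRequestDuration=1m 3.664272109s} for agentId: 2487 (Eviction) [AgentContextCleanupService]",
]

for agent_id, seconds, reason in removals(SAMPLE):
    print(f"agentId={agent_id} reason={reason} lastRequestDuration={seconds:.1f}s")
```

Running the same functions over the full server log would give one row per removed context, which makes it easier to see whether the terminations cluster around a fixed interval.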
These are all the logs available from the backend service in my first post. Also, I've tried increasing the heartbeat timeout to 10s and even 30 seconds; unfortunately, it didn't help.
Agent log
12:47:06.581 INFO [Agent.kt:307] - Version: unknown Release Date: unknown [main]
12:47:06.644 INFO [AgentOptions.kt:98] - proxyHostname: https://prometheus-proxy.service.net:443 [main]
12:47:06.644 INFO [AgentOptions.kt:102] - agentName: test-agent-1 [main]
12:47:06.646 INFO [AgentOptions.kt:106] - consolidated: false [main]
12:47:06.649 INFO [AgentOptions.kt:110] - scrapeTimeoutSecs: 15s [main]
12:47:06.650 INFO [AgentOptions.kt:114] - scrapeMaxRetries: 0 [main]
12:47:06.650 INFO [AgentOptions.kt:120] - chunkContentSizeKbs: 32768 [main]
12:47:06.650 INFO [AgentOptions.kt:124] - minGzipSizeBytes: 512 [main]
12:47:06.651 INFO [AgentOptions.kt:128] - overrideAuthority: [main]
12:47:06.651 INFO [AgentOptions.kt:133] - trustAllX509Certificates: false [main]
12:47:06.651 INFO [BaseOptions.kt:146] - adminEnabled: false [main]
12:47:06.652 INFO [BaseOptions.kt:152] - adminPort: 8093 [main]
12:47:06.652 INFO [BaseOptions.kt:158] - metricsEnabled: false [main]
12:47:06.652 INFO [BaseOptions.kt:170] - metricsPort: 8083 [main]
12:47:06.653 INFO [BaseOptions.kt:176] - transportFilterDisabled: true [main]
12:47:06.653 INFO [BaseOptions.kt:164] - debugEnabled: false [main]
12:47:06.653 INFO [BaseOptions.kt:182] - certChainFilePath: [main]
12:47:06.653 INFO [BaseOptions.kt:188] - privateKeyFilePath: [main]
12:47:06.654 INFO [BaseOptions.kt:194] - trustCertCollectionFilePath: dev_cert.pem [main]
12:47:06.654 INFO [AgentOptions.kt:146] - agent.scrapeTimeoutSecs: 15s [main]
12:47:06.654 INFO [AgentOptions.kt:147] - agent.internal.cioTimeoutSecs: 1m 30s [main]
12:47:06.654 INFO [AgentOptions.kt:148] - agent.internal.heartbeatCheckPauseMillis: 500 [main]
12:47:06.655 INFO [AgentOptions.kt:151] - agent.internal.heartbeatMaxInactivitySecs: 30 [main]
12:47:06.664 INFO [AgentPathManager.kt:52] - Proxy path /test-agent-1 will be assigned to http://127.0.0.1:9273/metrics with labels {} [main]
12:47:06.785 INFO [TlsUtils.kt:83] - Reading trustCertCollectionFilePath: "dev_cert.pem" [main]
12:47:06.820 INFO [AgentGrpcService.kt:132] - Creating gRPC stubs [main]
12:47:06.825 INFO [GrpcDsl.kt:75] - Creating connection for gRPC server at prometheus-proxy.service.net:443 using TLS (no mutual auth) [main]
12:47:06.886 INFO [Agent.kt:125] - Agent name: test-agent-1 [main]
12:47:06.886 INFO [Agent.kt:126] - Proxy reconnect pause time: 3s [main]
12:47:06.886 INFO [Agent.kt:127] - Scrape timeout time: 15s [main]
12:47:06.887 INFO [GenericService.kt:121] - Metrics service disabled [main]
12:47:06.887 INFO [GenericService.kt:129] - Zipkin reporter service disabled [main]
12:47:06.892 INFO [GenericService.kt:188] - Adding service Agent{agentId=, agentName=test-agent-1, proxyHost=prometheus-proxy.service.net:443, adminService=Disabled, metricsService=Disabled} [main]
12:47:06.912 INFO [GenericServiceListener.kt:29] - Starting Agent{agentId=, agentName=test-agent-1, proxyHost=prometheus-proxy.service.net:443, adminService=Disabled, metricsService=Disabled} [main]
12:47:06.913 INFO [GenericServiceListener.kt:30] - Running Agent{agentId=, agentName=test-agent-1, proxyHost=prometheus-proxy.service.net:443, adminService=Disabled, metricsService=Disabled} [main]
12:47:06.914 INFO [GenericService.kt:141] - All Agent services healthy [Agent test-agent-1]
12:47:06.929 INFO [AgentGrpcService.kt:163] - Connecting to proxy at prometheus-proxy.service.net:443 using TLS (no mutual auth)... [Agent test-agent-1]
12:47:07.934 INFO [AgentGrpcService.kt:169] - Connected to proxy at prometheus-proxy.service.net:443 using TLS (no mutual auth) [Agent test-agent-1]
12:47:08.307 INFO [AgentPathManager.kt:78] - Registered http://127.0.0.1:9273/metrics as /test-agent-1 with labels {} [Agent test-agent-1]
12:47:08.323 INFO [Agent.kt:244] - Heartbeat scheduled to fire after 30s of inactivity [DefaultDispatcher-worker-2]
12:48:12.701 WARN [ScrapeResults.kt:120] - fetchScrapeUrl() io.grpc.StatusException: INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR - http://127.0.0.1:9273/metrics [DefaultDispatcher-worker-9]
io.grpc.StatusException: INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR
at io.grpc.Status.asException(Status.java:547)
at io.grpc.kotlin.ClientCalls$rpcImpl$1$1$1.onClose(ClientCalls.kt:300)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:564)
at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:72)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:729)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:710)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
12:48:12.708 WARN [Agent.kt:209] - Cannot connect to proxy at prometheus-proxy.service.net:443 StatusException INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR [Agent test-agent-1]
12:48:12.711 INFO [Agent.kt:216] - Waited 0s to reconnect [Agent test-agent-1]
12:48:12.712 INFO [AgentGrpcService.kt:132] - Creating gRPC stubs [Agent test-agent-1]
12:48:12.717 INFO [GrpcDsl.kt:75] - Creating connection for gRPC server at prometheus-proxy.service.net:443 using TLS (no mutual auth) [Agent test-agent-1]
12:48:12.719 INFO [Agent.kt:153] - Resetting agentId [Agent test-agent-1]
12:48:12.719 INFO [AgentGrpcService.kt:163] - Connecting to proxy at prometheus-proxy.service.net:443 using TLS (no mutual auth)... [Agent test-agent-1]
12:48:13.483 INFO [AgentGrpcService.kt:169] - Connected to proxy at prometheus-proxy.service.net:443 using TLS (no mutual auth) [Agent test-agent-1]
12:48:13.833 INFO [AgentPathManager.kt:78] - Registered http://127.0.0.1:9273/metrics as /test-agent-1 with labels {} [Agent test-agent-1]
12:48:13.833 INFO [Agent.kt:244] - Heartbeat scheduled to fire after 30s of inactivity [DefaultDispatcher-worker-3]
12:49:42.688 WARN [ScrapeResults.kt:120] - fetchScrapeUrl() java.util.concurrent.CancellationException: Parent job is Cancelling - http://127.0.0.1:9273/metrics [DefaultDispatcher-worker-6]
java.util.concurrent.CancellationException: Parent job is Cancelling
at io.ktor.client.engine.UtilsKt$attachToUserJob$cleanupHandler$1.invoke(Utils.kt:99)
at io.ktor.client.engine.UtilsKt$attachToUserJob$cleanupHandler$1.invoke(Utils.kt:97)
at kotlinx.coroutines.InvokeOnCancelling.invoke(JobSupport.kt:1571)
at kotlinx.coroutines.JobSupport.invokeOnCompletionInternal$kotlinx_coroutines_core(JobSupport.kt:500)
at kotlinx.coroutines.JobSupport.invokeOnCompletion(JobSupport.kt:452)
at kotlinx.coroutines.Job$DefaultImpls.invokeOnCompletion$default(Job.kt:313)
at io.ktor.client.engine.HttpClientEngineKt.createCallContext(HttpClientEngine.kt:166)
at io.ktor.client.engine.HttpClientEngine$DefaultImpls.executeWithinCallContext(HttpClientEngine.kt:91)
at io.ktor.client.engine.HttpClientEngine$DefaultImpls.access$executeWithinCallContext(HttpClientEngine.kt:24)
at io.ktor.client.engine.HttpClientEngine$install$1.invokeSuspend(HttpClientEngine.kt:70)
at io.ktor.client.engine.HttpClientEngine$install$1.invoke(HttpClientEngine.kt)
at io.ktor.client.engine.HttpClientEngine$install$1.invoke(HttpClientEngine.kt)
at io.ktor.util.pipeline.DebugPipelineContext.proceedLoop(DebugPipelineContext.kt:79)
at io.ktor.util.pipeline.DebugPipelineContext.proceed(DebugPipelineContext.kt:57)
at io.ktor.util.pipeline.DebugPipelineContext.execute$ktor_utils(DebugPipelineContext.kt:63)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:86)
at io.ktor.client.plugins.HttpSend$DefaultSender.execute(HttpSend.kt:118)
at io.ktor.client.plugins.api.Send$Sender.proceed(CommonHooks.kt:41)
at io.ktor.client.plugins.auth.AuthKt$Auth$2$2.invokeSuspend(Auth.kt:130)
at io.ktor.client.plugins.auth.AuthKt$Auth$2$2.invoke(Auth.kt)
at io.ktor.client.plugins.auth.AuthKt$Auth$2$2.invoke(Auth.kt)
at io.ktor.client.plugins.api.Send$install$1.invokeSuspend(CommonHooks.kt:46)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.HttpSend$InterceptedSender.execute(HttpSend.kt:96)
at io.ktor.client.plugins.api.Send$Sender.proceed(CommonHooks.kt:41)
at io.ktor.client.plugins.HttpRequestRetryKt$HttpRequestRetry$2$1.invokeSuspend(HttpRequestRetry.kt:296)
at io.ktor.client.plugins.HttpRequestRetryKt$HttpRequestRetry$2$1.invoke(HttpRequestRetry.kt)
at io.ktor.client.plugins.HttpRequestRetryKt$HttpRequestRetry$2$1.invoke(HttpRequestRetry.kt)
at io.ktor.client.plugins.api.Send$install$1.invokeSuspend(CommonHooks.kt:46)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.HttpSend$InterceptedSender.execute(HttpSend.kt:96)
at io.ktor.client.plugins.api.Send$Sender.proceed(CommonHooks.kt:41)
at io.ktor.client.plugins.HttpTimeoutKt$HttpTimeout$2$1.invokeSuspend(HttpTimeout.kt:175)
at io.ktor.client.plugins.HttpTimeoutKt$HttpTimeout$2$1.invoke(HttpTimeout.kt)
at io.ktor.client.plugins.HttpTimeoutKt$HttpTimeout$2$1.invoke(HttpTimeout.kt)
at io.ktor.client.plugins.api.Send$install$1.invokeSuspend(CommonHooks.kt:46)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.HttpSend$InterceptedSender.execute(HttpSend.kt:96)
at io.ktor.client.plugins.api.Send$Sender.proceed(CommonHooks.kt:41)
at io.ktor.client.plugins.HttpRedirectKt$HttpRedirect$2$1.invokeSuspend(HttpRedirect.kt:97)
at io.ktor.client.plugins.HttpRedirectKt$HttpRedirect$2$1.invoke(HttpRedirect.kt)
at io.ktor.client.plugins.HttpRedirectKt$HttpRedirect$2$1.invoke(HttpRedirect.kt)
at io.ktor.client.plugins.api.Send$install$1.invokeSuspend(CommonHooks.kt:46)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.HttpSend$InterceptedSender.execute(HttpSend.kt:96)
at io.ktor.client.plugins.api.Send$Sender.proceed(CommonHooks.kt:41)
at io.ktor.client.plugins.HttpCallValidatorKt$HttpCallValidator$2$2.invokeSuspend(HttpCallValidator.kt:112)
at io.ktor.client.plugins.HttpCallValidatorKt$HttpCallValidator$2$2.invoke(HttpCallValidator.kt)
at io.ktor.client.plugins.HttpCallValidatorKt$HttpCallValidator$2$2.invoke(HttpCallValidator.kt)
at io.ktor.client.plugins.api.Send$install$1.invokeSuspend(CommonHooks.kt:46)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.api.Send$install$1.invoke(CommonHooks.kt)
at io.ktor.client.plugins.HttpSend$InterceptedSender.execute(HttpSend.kt:96)
at io.ktor.client.plugins.HttpSend$Plugin$install$1.invokeSuspend(HttpSend.kt:84)
at io.ktor.client.plugins.HttpSend$Plugin$install$1.invoke(HttpSend.kt)
at io.ktor.client.plugins.HttpSend$Plugin$install$1.invoke(HttpSend.kt)
at io.ktor.util.pipeline.DebugPipelineContext.proceedLoop(DebugPipelineContext.kt:79)
at io.ktor.util.pipeline.DebugPipelineContext.proceed(DebugPipelineContext.kt:57)
at io.ktor.client.plugins.RequestError$install$1.invokeSuspend(HttpCallValidator.kt:134)
at io.ktor.client.plugins.RequestError$install$1.invoke(HttpCallValidator.kt)
at io.ktor.client.plugins.RequestError$install$1.invoke(HttpCallValidator.kt)
at io.ktor.util.pipeline.DebugPipelineContext.proceedLoop(DebugPipelineContext.kt:79)
at io.ktor.util.pipeline.DebugPipelineContext.proceed(DebugPipelineContext.kt:57)
at io.ktor.client.plugins.SetupRequestContext$install$1.invokeSuspend$proceed(HttpRequestLifecycle.kt:40)
at io.ktor.client.plugins.SetupRequestContext$install$1.access$invokeSuspend$proceed(HttpRequestLifecycle.kt)
at io.ktor.client.plugins.SetupRequestContext$install$1$1.invoke(HttpRequestLifecycle.kt:40)
at io.ktor.client.plugins.SetupRequestContext$install$1$1.invoke(HttpRequestLifecycle.kt:40)
at io.ktor.client.plugins.HttpRequestLifecycleKt$HttpRequestLifecycle$1$1.invokeSuspend(HttpRequestLifecycle.kt:27)
at io.ktor.client.plugins.HttpRequestLifecycleKt$HttpRequestLifecycle$1$1.invoke(HttpRequestLifecycle.kt)
at io.ktor.client.plugins.HttpRequestLifecycleKt$HttpRequestLifecycle$1$1.invoke(HttpRequestLifecycle.kt)
at io.ktor.client.plugins.SetupRequestContext$install$1.invokeSuspend(HttpRequestLifecycle.kt:40)
at io.ktor.client.plugins.SetupRequestContext$install$1.invoke(HttpRequestLifecycle.kt)
at io.ktor.client.plugins.SetupRequestContext$install$1.invoke(HttpRequestLifecycle.kt)
at io.ktor.util.pipeline.DebugPipelineContext.proceedLoop(DebugPipelineContext.kt:79)
at io.ktor.util.pipeline.DebugPipelineContext.proceed(DebugPipelineContext.kt:57)
at io.ktor.util.pipeline.DebugPipelineContext.execute$ktor_utils(DebugPipelineContext.kt:63)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:86)
at io.ktor.client.HttpClient.execute$ktor_client_core(HttpClient.kt:1393)
at io.ktor.client.statement.HttpStatement.fetchResponse(HttpStatement.kt:147)
at io.ktor.client.statement.HttpStatement.execute(HttpStatement.kt:68)
at com.github.pambrose.common.dsl.KtorDsl.get(KtorDsl.kt:85)
at io.prometheus.agent.AgentHttpService.fetchContent(AgentHttpService.kt:90)
at io.prometheus.agent.AgentHttpService.fetchContentFromUrl(AgentHttpService.kt:76)
at io.prometheus.agent.AgentHttpService.fetchScrapeUrl(AgentHttpService.kt:61)
at io.prometheus.agent.AgentGrpcService$readRequestsFromProxy$2$1$2.invokeSuspend(AgentGrpcService.kt:298)
at io.prometheus.agent.AgentGrpcService$readRequestsFromProxy$2$1$2.invoke(AgentGrpcService.kt)
at io.prometheus.agent.AgentGrpcService$readRequestsFromProxy$2$1$2.invoke(AgentGrpcService.kt)
at io.prometheus.Agent$run$connectToProxy$3$4.invokeSuspend(Agent.kt:186)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:101)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:589)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:832)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:720)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:707)
12:49:42.690 WARN [Agent.kt:209] - Cannot connect to proxy at prometheus-proxy.service.net:443 StatusException INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR [Agent test-agent-1]
12:49:42.691 INFO [Agent.kt:216] - Waited 0s to reconnect [Agent test-agent-1]
12:49:42.691 INFO [AgentGrpcService.kt:132] - Creating gRPC stubs [Agent test-agent-1]
12:49:42.692 INFO [GrpcDsl.kt:75] - Creating connection for gRPC server at prometheus-proxy.service.net:443 using TLS (no mutual auth) [Agent test-agent-1]
12:49:42.694 INFO [Agent.kt:153] - Resetting agentId [Agent test-agent-1]
12:49:42.695 INFO [AgentGrpcService.kt:163] - Connecting to proxy at prometheus-proxy.service.net:443 using TLS (no mutual auth)... [Agent test-agent-1]
12:49:43.439 INFO [AgentGrpcService.kt:169] - Connected to proxy at prometheus-proxy.service.net:443 using TLS (no mutual auth) [Agent test-agent-1]
12:49:43.790 INFO [AgentPathManager.kt:78] - Registered http://127.0.0.1:9273/metrics as /test-agent-1 with labels {} [Agent test-agent-1]
12:49:43.791 INFO [Agent.kt:244] - Heartbeat scheduled to fire after 30s of inactivity [DefaultDispatcher-worker-7] |
A correction to my previous message. After reading the link you gave (https://github.com/pambrose/prometheus-proxy), the setup is clear: backend application providing metrics behind the firewall <----- Prometheus Agent behind the firewall - - - uses gRPC - - -> Prometheus Proxy behind NGINX Ingress (outside the firewall) <--- Prometheus server (not our concern here). (Whenever you said "server", you were referring to the Prometheus Proxy, not the Prometheus server.) You have provided the Agent logs; the Agent is the gRPC client. You can see in its logs
Does the Prometheus proxy service that is using nginx ingress not have logs? It runs a gRPC server so it should have them. |
nginx can close connection after the time specified by |
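One way to test the timeout hypothesis is to measure how regularly the agent reports a dropped connection. The sketch below is a minimal Python example, not part of prometheus-proxy or gRPC-Java: it assumes the agent log format shown in this thread (`HH:MM:SS.mmm LEVEL [File.kt:NN] - message`) and treats every `WARN` line containing `Cannot connect to proxy` as one drop. The function names `drop_times` and `drop_intervals` are hypothetical helpers introduced for illustration.

```python
import re
from datetime import datetime

# Assumption: a drop is logged as a WARN line containing "Cannot connect to proxy",
# as in the agent log posted in this thread.
DROP_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) WARN .*Cannot connect to proxy")


def drop_times(lines):
    """Return the timestamps of all lines that report a dropped connection."""
    times = []
    for line in lines:
        match = DROP_RE.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%H:%M:%S.%f"))
    return times


def drop_intervals(times):
    """Return the seconds elapsed between consecutive drops.

    The log lines carry no date, so a negative difference is treated
    as a single midnight rollover (an assumption of this sketch).
    """
    intervals = []
    for earlier, later in zip(times, times[1:]):
        delta = (later - earlier).total_seconds()
        if delta < 0:
            delta += 24 * 60 * 60
        intervals.append(delta)
    return intervals
```

Feeding the agent log's lines to `drop_times` and then to `drop_intervals` gives the spacing between drops. If the intervals cluster around a fixed value, that value can be compared against the timeouts configured on NGINX ingress and the NLB; irregular intervals would point away from a simple idle timeout.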
Hi,
I'm using a tool that utilizes gRPC-Java for communication between the client and server. The server is located in an AWS EKS cluster and is accessible externally via NGINX ingress.
To configure it, I followed this guide: https://kubernetes.github.io/ingress-nginx/examples/grpc/
After deploying all components, I tested it with grpcurl and received a successful response.
Then I configured communication between the agent and the server, and it also worked. However, I ran into an issue: the agent loses its connection for a period of time during communication.
Below is a log message
Agent log
To rule out errors in the app itself, I also tested a simpler configuration in which the server was deployed on a standard EC2 instance and exposed to the web. In that setup I didn't encounter any problems; everything worked as expected. So the issue seems to lie in the configuration of the NLB used by NGINX ingress, or in NGINX ingress itself.
Below is a visualization of how often the connection is dropped
