-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(#144): additional configuration for cht-user-management #146
Conversation
@@ -0,0 +1,7 @@ | |||
scrape_configs: | |||
- job_name: cht-user-management-metrics | |||
metrics_path: /metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default path is /metrics
so we can drop this for simplicity:
metrics_path: /metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per other comment, still seeing metrics_path: /metrics
in there - so assume local changes weren't pushed
@kennsippell - yay! Love the simplicity of the request here - just enough, not too much. Thanks for opening the PR! That said there's a disconnect in that it assumes metrics are on
The reason is that the two URLs introduce a lot of complexity in how you monitor them. Further, Prometheus assumes that you're going to use (sorry - forgot to include this summary on my request for changes) |
I didn't realise this was the best practice for prometheus, but you're right. I'll make that change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey again! I wanted to actually test the dashboard you created in the PR. Once installed, the dashboard works great - nice work!
That said, I had some issues getting the compose file to work with the new dashboard json file and made a suggestion on how to simplify things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello again! I did some more testing to ensure two things:
- system scales with N instances
- all panels are getting data
system scales with N instances
For # 2 - works great! I stood up a second UMT instance on port 3501
and put it in the scrape.yml
like this:
scrape_configs:
- job_name: cht-user-management-metrics
metrics_path: /fastify-metrics
static_configs:
- targets:
- 172.17.0.1:3500
- 172.17.0.1:3501
And I was able to see the two discretely in watchdog:
However, this made me realize that we're going to be editing the scrape.yml
by definition to add which ever instance we want to monitor. This means if there's a change to it upstream (like metrics_path: /fastify-metrics
-> metrics_path: /metrics
) there will be a conflict in this file and you won't be able to do a git pull origin
or similar call.
I suggest we move scrape.yml
file to be something like ./exporters/cht-user-management/scrape-dist.yml
and then we instruct people to copy it.
all panels are getting data
There's a number of panels that are not getting data. The top half is missing data for:
- Transferred
- Transfer Rate
- Number of Open connections
On the bottom half:
- Response Latency (90th Percentile), Top 5 request Duration and Top 5 request count all only ever showed
/fastify-metrics
. In a worst case this could mean that all the stats that the/fastify-metrics
URL is gathering ONLY show data for the/fastify-metrics
endpoint? In a best case it's just these 3 panels. We should double check that all these stats monitor the entire holistic UMT app and not just the one Fastify endpiont - Top Error Duration has no data - this may be correct because there's no errors?
- Top Error Count has no data - this may be correct because there's no errors?
- Response Size has no data
I had been proposing here
Ready here medic/cht-user-management#278
There appears to be a bug in fastify-metrics around default argument values. Workaround is that we need to explicitly set defaults. I also think we were setting |
Ah - sure enough - we do indeed tell folks to edit the revision controlled files. Ok - let's fix this another time then. OK to leave as is for now! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have local changes not yet pushed! Spot checking, I see nothing has changed.
@@ -0,0 +1,7 @@ | |||
scrape_configs: | |||
- job_name: cht-user-management-metrics | |||
metrics_path: /metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per other comment, still seeing metrics_path: /metrics
in there - so assume local changes weren't pushed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good! Just one quick file name change to match things up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good to go! Thanks for iterating with me.
We should merge this and the other UMT PR!
🎉 This PR is included in version 1.18.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
#144
cc @ernestoteo since he was working on UMT grafana monitoring