Add id and status labels to pipeline and job metrics #455
base: main
Why is joining on the gitlab_ci_pipeline_id not sufficient? You can look up how it works in the example dashboards. |
@maciej-gol If you have a way to create a dashboard with historical data using the current implementation, I would like to hear it. We need such a dashboard, and I did not find any way to get a list of historic pipelines/jobs with an option to filter. |
I understand your issue, as I'm facing it, too. Having said that, I don't
believe adding labels will solve it (on its own). Why? You can already
figure out which pipeline the relevant metrics refer to by looking up the
pipeline_id metric. Adding labels will only duplicate the exported data
whilst exposing you to the problem of growing metrics.
First of all, you need to tweak the exporter to crawl all the pipelines,
not only the most recent ones. I might be mistaken, but your MR only
tackles the labels, not the crawling.
Secondly, the growing metrics issue. The problem with the Prometheus library is
that it doesn't forget a metric's labels once observed. This is important,
because in infinity, the exporter will present Prometheus with ALL the jobs
ever seen, on every scrape. That's the same as just querying your GitLab DB
directly. You could restart the exporter, but things get messy when you use
Redis for HA.
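To make the growth concrete, here is a minimal sketch in plain Python. It is a toy model, not the exporter's code and not the real Prometheus client: `ToyGauge` and `simulate` are hypothetical names, and the only behaviour modelled is "one stored series per distinct label set, never pruned".

```python
from itertools import count


class ToyGauge:
    """Toy stand-in for a labelled gauge: one series per distinct label set."""

    def __init__(self, label_names):
        self.label_names = tuple(label_names)
        self.series = {}  # label values -> last observed value; never pruned

    def set(self, value, **labels):
        # Only the declared label names form the series key.
        key = tuple(labels[name] for name in self.label_names)
        self.series[key] = value

    def scrape(self):
        # Every label set ever observed is exposed again on every scrape.
        return dict(self.series)


def simulate(label_names, pipelines):
    """Run `pipelines` pipelines on one ref, each with a single 'build' job."""
    gauge = ToyGauge(label_names)
    job_ids = count(1)
    for pipeline_id in range(1, pipelines + 1):
        gauge.set(
            1,
            ref="main",
            job_name="build",
            pipeline_id=pipeline_id,
            job_id=next(job_ids),
        )
    return len(gauge.scrape())


without_ids = simulate(["ref", "job_name"], pipelines=1000)
with_ids = simulate(["ref", "job_name", "pipeline_id", "job_id"], pipelines=1000)
print(without_ids, with_ids)  # 1 1000
```

Without the id labels, each new job overwrites the previous one, so only the latest is visible. With them, every job ever seen stays in the scrape output until the process restarts.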
Having said all of this, I believe this exporter is not suitable for monitoring
the health of your GCI system when you allow more than one pipeline per
ref. As such, I'm currently opting to build the state of ALL of the
pipelines by querying webhook data (although that's not all).
I share your need to track ALL running pipelines, but I'm worried this
exporter would need architectural changes to address this need.
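As a rough illustration of the webhook-driven idea, the sketch below keeps only unfinished pipelines in memory, so the state stays bounded. Everything here is an assumption for illustration: `PipelineState`, the simplified event shape (`id`, `ref`, `status`) and the `FINISHED` set are hypothetical, and real GitLab webhook payloads should be checked against the GitLab documentation.

```python
# Statuses treated as terminal in this sketch (an assumption, not an exhaustive list).
FINISHED = {"success", "failed", "canceled", "skipped"}


class PipelineState:
    """Tracks pipelines that have not finished yet, keyed by pipeline id."""

    def __init__(self):
        self.active = {}  # pipeline id -> (ref, status)

    def handle_event(self, event):
        pipeline_id, ref, status = event["id"], event["ref"], event["status"]
        if status in FINISHED:
            # Forget finished pipelines so memory does not grow forever.
            self.active.pop(pipeline_id, None)
        else:
            self.active[pipeline_id] = (ref, status)

    def count_by_status(self):
        counts = {}
        for _ref, status in self.active.values():
            counts[status] = counts.get(status, 0) + 1
        return counts


state = PipelineState()
for event in [
    {"id": 1, "ref": "main", "status": "pending"},
    {"id": 2, "ref": "main", "status": "running"},
    {"id": 1, "ref": "main", "status": "running"},
    {"id": 2, "ref": "main", "status": "success"},
    {"id": 3, "ref": "feature", "status": "pending"},
]:
    state.handle_event(event)

print(state.count_by_status())  # {'running': 1, 'pending': 1}
```

Two pipelines on the same ref are tracked independently here, which is the case the exporter's "latest per ref" model cannot represent.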
On Sat, 28 May 2022, 13:24, ErezArbell ***@***.***> wrote:
Why is joining on the gitlab_ci_pipeline_id not sufficient? You can
look up how it works in the example dashboards.
@maciej-gol <https://github.com/maciej-gol>
In the example dashboards you can see only the latest pipeline/job. You
cannot see historic data.
An example of something that I would like to have: show all runs of a
specific job name during the last week, so you can see when it started
to fail.
Such things cannot be done without the extra labels this PR adds.
If you have a way to create a dashboard with historical data using the
current implementation, I would like to hear it. We need such a dashboard,
and I did not find any way to get a list of historic pipelines/jobs with
an option to filter.
|
Thank you @maciej-gol for the insightful comments.
You are correct. However, this MR does improve data collection: the exporter always publishes only the latest job (for example) that has a given set of label values. So in the current implementation, if a new pipeline starts on the same ref before the old one ends, then only the job from the new pipeline is published. This MR adds the pipeline_id and job_id labels, which are unique, so the jobs from the older pipelines will still be published.
You have a good point here. Now that I think about it, it is indeed what is expected to happen, but it is not what I see when I look at the '/metrics' endpoint. Anyway, it is a good point.
I agree. This is not the suitable tool. This was the closest I found, so I thought I would use it. It is strange that no such tool is available for GitLab, which is a popular commercial product. BTW, what is "GCI system"? |
Since I've been working quite a lot with Gitlab here at Codility, I've
started to use GCI in place of Gitlab CI, as it gets tiresome writing the
full name over and over :D
On Sat, 28 May 2022, 14:24, ErezArbell ***@***.***> wrote:
Thank you @maciej-gol <https://github.com/maciej-gol> for the insightful
comments.
I might be mistaken, but your MR only tackles the labels, not the crawling.
You are correct. However, this MR does improve data collection: the
exporter always publishes only the latest job (for example) that has a
given set of label values. So in the current implementation, if a new
pipeline starts on the same ref before the old one ends, then only the job
from the new pipeline is published. This MR adds the pipeline_id and job_id
labels, which are unique, so the jobs from the older pipelines will still
be published.
in infinity, the exporter will present Prometheus with ALL the jobs ever
seen, on every scrape
You have a good point here. Now that I think about it, it is indeed what
is expected to happen, but it is not what I see when I look at the
'/metrics' endpoint. Anyway, it is a good point.
Having said all of this, I believe this exporter is not suitable to
monitor the health of your GCI system when you allow more than one pipeline
per ref
...
I share your need of tracking ALL running pipeline, but I'm worried this
exporter would need architectural changes to work to address this need.
I agree. This is not the suitable tool. This was the closest I found so I
thought to use it.
I understand that this PR will not be pulled. I will, however, leave this
PR open since I would like to get a response from the repo owner, maybe he
will have a suggestion.
It is strange that no such tool is available for GitLab, which is a popular
commercial product.
BTW, what is "GCI system"?
|
@ErezArbell since your use case is monitoring the general success ratio of your jobs (per ref, perhaps), I believe implementing job hooks that simply store success/failure counters would be enough, without opening yourself to the growing metrics problem. You could export job status counters and expose them via gitlab_ci_pipeline_job_status_counter{job_name, ref, project, status}. Fail rate would be … It might solve my problem (tracking all pending jobs), but I would need to give it more thought. What do you think? |
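A minimal sketch of the counter idea, under stated assumptions: `on_job_finished` and `fail_ratio` are hypothetical helpers, the counter is a plain in-memory `Counter` rather than a real Prometheus metric, and the ratio below is just one plausible definition (failed / (failed + success)), not necessarily the expression the comment above intended.

```python
from collections import Counter

# Keyed by (project, ref, job_name, status), mirroring the proposed labels.
job_status_counter = Counter()


def on_job_finished(project, ref, job_name, status):
    """Would be called from a job hook; no job or pipeline id is stored."""
    job_status_counter[(project, ref, job_name, status)] += 1


def fail_ratio(project, ref, job_name):
    """Assumed definition: failed / (failed + success), or None with no data."""
    failed = job_status_counter[(project, ref, job_name, "failed")]
    succeeded = job_status_counter[(project, ref, job_name, "success")]
    total = failed + succeeded
    return failed / total if total else None


for status in ["success", "success", "failed", "success"]:
    on_job_finished("my-group/my-project", "main", "build", status)

print(fail_ratio("my-group/my-project", "main", "build"))  # 0.25
```

The number of series is bounded by the distinct (project, ref, job_name, status) combinations, so it does not grow with every job run. The trade-off is that individual runs can no longer be listed, only aggregated.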
👋 Hi everyone! This issue is very interesting. We have the same problem: we cannot track the final status of all the jobs and how they evolve, since, as @ErezArbell comments, the exporter only reports the status of the latest job. I'm going to try running the app with the changes from @ErezArbell and see if it fixes our problem. I look forward to the resolution of this issue 🦊 |
@tinchoram, I added to the "quickstart" example two dashboards that I created to use those changes.
|
@maciej-gol, I do not need the ratios. I need to see the history of pipelines and jobs in a table, with options to filter. |
f1d1bc5 to 718e730
See details in issue 453.
This can be helpful to better filter the queries and also to present more than the last pipeline/job in the dashboard.