@kanecko
Contributor

@kanecko kanecko commented Nov 15, 2025

To ease and accelerate the PR submission, please use the template below to provide all requested information.
You can find guidelines on which docker image to use in Rockstor's documentation, as well as examples of proper formatting in other rock-ons (here and here).

Fixes #483.

General information on project

This pull request proposes to add a new rock-on for the following project:

  • name: Observability Starter Kit (please help me with the name)
  • website: (I have no idea what to put here, or in the JSON itself)
  • description: An observability stack featuring OpenTelemetry (Collector contrib), VictoriaMetrics, and Grafana, that makes it easier to get started with open-source monitoring on Rockstor.

Information on docker image

Checklist

  • Passes JSONlint validation
  • Entry added to root.json in alphabetical order (for new rock-on only)
  • "description" object lists and links to the docker image used
  • "description" object provides information on the image's particularities (advantage over another existing rock-on for the same project, for instance)
  • "website" object links to project's main website
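The first two checklist items can be checked locally before pushing. Below is a minimal sketch, assuming that root.json is a single JSON object mapping rock-on names to file names and that "alphabetical" means a case-insensitive comparison of those names; both points are assumptions, and the sample entries are purely illustrative.

```python
import json
from typing import List, Optional


def load_ordered(text: str) -> List[str]:
    """Parse JSON text and return its top-level keys in file order.

    json.loads raises json.JSONDecodeError on invalid input, which is the
    same class of problem a JSON linter reports.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return list(data.keys())


def first_out_of_order(keys: List[str]) -> Optional[str]:
    """Return the first key that sorts before its predecessor, else None.

    The comparison is case-insensitive; that choice is an assumption.
    """
    for prev, cur in zip(keys, keys[1:]):
        if cur.casefold() < prev.casefold():
            return cur
    return None


# Hypothetical root.json excerpt; names and file names are illustrative only.
sample = (
    '{"Alpha": "alpha.json", '
    '"Observability Starter Kit": "osk.json", '
    '"Zeta": "zeta.json"}'
)
print(first_out_of_order(load_ordered(sample)))
```

To check a real checkout, read the file with `open("root.json").read()` and pass the text to `load_ordered`.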

This JSON file defines an observability stack including OpenTelemetry, VictoriaMetrics, and Grafana, detailing their configurations and links.
@kanecko
Contributor Author

kanecko commented Nov 15, 2025

Since the setup for this rock-on is quite involved, I wonder whether I should first create a write-up for it, so that you can then review the write-up and this rock-on together. Within the write-up, I would also create a starter Grafana dashboard JSON file for our Rockstor users. The dashboard would also be subject to review, and I would love to publish it on Grafana Dashboards once it is good enough.

Would that make sense?
I suppose, the "website" JSON field could be linked to the write-up guide?

Otherwise, here is an attempt at explaining what needs to be done.

Pre-install steps:

  1. create group "grafana" with custom GID, e.g. 472
  2. create user "grafana" with custom UID, e.g. 472, assign group "grafana" to it
  3. create user "opentelemetry" with custom UID, e.g. 10001, and assign some group to it (e.g. create group opentelemetry 10001)
  4. create three shares: osk-grafana, osk-victoria-metrics, osk-opentelemetry-collector
  5. create config.yaml with the following contents in /mnt2/osk-opentelemetry-collector: https://gist.github.com/kanecko/cfaaf349c26e4602878e4a5b82bd9730
  6. chown -R 472:472 /mnt2/osk-grafana
  7. chown -R 10001:10001 /mnt2/osk-opentelemetry-collector
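After steps 6 and 7 it can help to confirm that the ownership change really reached every file in the shares. The following is a minimal sketch using only the Python standard library; the paths and IDs are the example values from the steps above, and the helper name `wrong_owner` is hypothetical.

```python
import os
from pathlib import Path


def wrong_owner(root, uid, gid):
    """Return all paths under root (root included) not owned by uid:gid."""
    root = Path(root)
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(Path(dirpath) / name for name in dirnames + filenames)
    mismatches = []
    for path in candidates:
        st = os.lstat(path)  # lstat: report on symlinks themselves
        if (st.st_uid, st.st_gid) != (uid, gid):
            mismatches.append(str(path))
    return mismatches


# Example values taken from the pre-install steps; adjust if other IDs are used.
EXPECTED = {
    "/mnt2/osk-grafana": (472, 472),
    "/mnt2/osk-opentelemetry-collector": (10001, 10001),
}

if __name__ == "__main__":
    for share, (uid, gid) in EXPECTED.items():
        if not os.path.isdir(share):
            print(f"{share}: not found")
            continue
        bad = wrong_owner(share, uid, gid)
        print(f"{share}: {'ok' if not bad else bad}")
```

An empty result for a share means every entry in it carries the expected owner and group.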

Install rock-on.

Post-install steps:

  1. Open webUI
  2. Login with admin:admin
  3. On the left-side menu find and click Connections > Data sources
  4. Add new data source
  5. Find "VictoriaMetrics" in the list (5th place)
  6. Set HTTP URL to "http://osk-victoria-metrics:8428"
  7. Save & test
  8. On the left-side menu find and click Dashboards
  9. Add new dashboard > Add visualization > victoriametrics
  10. Configuration:
    Metric: "system.memory.utilization"
    On the right-side menu scroll down to Standard options > Unit > select Percent(0.0-1.0)
    Press the "Run queries" button
image

Screenshot of the whole configuration:
image
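The same metric can also be checked outside Grafana through the Prometheus-compatible query API that VictoriaMetrics exposes under /api/v1/query. This is a minimal sketch: the helper names are hypothetical, and it assumes the script runs somewhere that can resolve osk-victoria-metrics (for example inside the same docker network). From the host, the base URL would need to be replaced with whatever address and port are actually published.

```python
import json
import urllib.parse
import urllib.request


def instant_query_url(base_url: str, metric: str) -> str:
    """Build an instant-query URL selecting one metric by its exact name."""
    # Selecting via __name__ avoids any question about dots in metric names.
    selector = '{__name__="%s"}' % metric
    query = urllib.parse.urlencode({"query": selector})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def fetch_series_count(url: str, timeout: float = 5.0) -> int:
    """Run the query and return how many series came back (needs network)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return len(payload["data"]["result"])


url = instant_query_url("http://osk-victoria-metrics:8428", "system.memory.utilization")
print(url)
```

Calling `fetch_series_count(url)` from a suitable location and getting a value above zero suggests that the collector is delivering host metrics.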

@Hooverdan96 added the "needs review" label (Test install, function, on / off behaviour, all links / info.) on Nov 16, 2025
@phillxnet
Member

@kanecko Nice.

Another quick note: I think we need not mention netdata in the description. This will also help to keep it to-the-point.

@kanecko
Contributor Author

kanecko commented Nov 16, 2025

Duly noted. I will delete netdata from the description. I thought that I read somewhere that we are supposed to give a comparison to rock-ons that solve the same kind of problem.

@phillxnet
Member

phillxnet commented Nov 16, 2025

@kanecko Re

I thought that I read somewhere that we are supposed to give a comparison to rock-ons that solve the same kind of problem.

Likely the guide/template that pops up as the rockon-registry PR template. It is intended to give context for the proposed pull request.

[EDIT]

"description" object provides information on the image's particularities (advantage over another existing rock-on for the same project, for instance)

@phillxnet
Member

phillxnet commented Nov 16, 2025

@kanecko Re:

480 represents the "docker" group on my machine. Without it we will run into docker permission issues.

Obviously the docker GID value may not be hard-coded. Do you have any suggestions on how to get the local docker GID in there, without (or with minimal) user intervention?

This is touching on what has been bothering us re required extensions to the Rock-on backend. Look to my forgejo-runner rock-on for one approach (work-around wise). We for now have to give users the run-around. But in an issue from @Hooverdan96 there is a neat suggestion regarding replacement variables within Rock-on definitions. Akin to bash variables. Take a look at:

This is the issue I think: [Rockon]: Feature Request - add global variables to reduce duplicate UI entries #2935

[EDIT]: @Hooverdan96 quote from above: "A few Variables could be defined by default, e.g. the host ip/name - I assume those could be pulled directly from what already exists."

We could then have, for example, some 'special' variables such as hostname, which is another tricky one: getting folks to enter it by hand is a little lacking on our side of things. Another could be the docker group which, as you say, is install-dependent but already used (forgejo-runner) and required (this proposed rock-on project of yours). Some of these may not fit centrally into @Hooverdan96's proposal; I mention them just as an example issue for the mentioned, but not yet existing, Rock-ons V2 Milestone.

EDIT: [ "--user", "10001:docker" ] does not work

This is covered in a recent discussion of ours here: rockstor/rockstor-core#3029 (comment)
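For context on the GID question above: a group name such as "docker" passed to --user is resolved against the group database inside the container, not the host's, which is a likely reason the name-based form fails. A minimal sketch of looking up the host's numeric GID with the Python standard library follows; the helper name `group_gid` is hypothetical, and how the value would then reach a rock-on definition is exactly the open question in this thread.

```python
import grp
from typing import Optional


def group_gid(name: str) -> Optional[int]:
    """Return the numeric GID of a local group, or None if it does not exist."""
    try:
        return grp.getgrnam(name).gr_gid
    except KeyError:
        return None


gid = group_gid("docker")
if gid is None:
    print("no 'docker' group on this host")
else:
    # Numeric form, since a group name would be looked up inside the container.
    print(["--user", f"10001:{gid}"])
```

The `grp` module is only available on Unix-like systems, which matches the environment a rock-on runs in.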

@kanecko
Contributor Author

kanecko commented Nov 16, 2025

I will replace "--user" with env vars for USER_UID and USER_GID. I think using your workaround will solve this issue.
I will then also take a look at whether Grafana can be similarly configured to use something other than 472.

@Hooverdan96
Member

I wonder, if I should first create a write-up for it. Within the write-up, I would also create a starter Grafana dashboard JSON file for our Rockstor users.

I think having a write-up for this would be really great! Not sure you need to create it first, but I think once the above discussions are ironed out, you could then create the write-up for the documentation in the rockstor-doc repo. And, as you suggested, finally link to the documentation in the Rockon description (there are a couple of Rockons that do the same).

@kanecko
Contributor Author

kanecko commented Nov 18, 2025

I have updated the JSON and my 2nd post (installation guide).

After some research and testing, I've come to the realization that both OTel and Grafana official images have baked-in UID/GIDs.
Possible solution: create my own custom Dockerfiles for both projects, or use some existing published docker image.
However, I refuse to go down this road due to additional maintenance effort.
Possible workaround: I've used uid: -1 in the rockon, so that the binary will be run with the volume's user.
This bars us from ingesting docker stats, but since this is a "starter kit", I think it's acceptable, if I only support ingestion of hostmetrics out-of-the-box.

Removed Docker socket volume mount from options.
@Hooverdan96
Member

@phillxnet, @kanecko would it be better to leave the OpenTelemetry website as the link in the Rockon itself, as it's the "underpinning" of this Rockon, and put the (eventual) link to the documentation on the Rockstor website into the Rockon description instead? I could try to argue either way, since this one is an assembly of different technologies, not just a multi-container Rockon where one component "dominates" the whole thing ...
Your thoughts?

@phillxnet
Member

@kanecko & @Hooverdan96 Re:

... and put the (eventual) link to the documentation on the Rockstor website into the Rockon description instead?

Yes, we have recent precedent for this that I would like to propose as a standard/norm going forward:

See: Bareos Backup Server: https://github.com/rockstor/rockon-registry/blob/master/bareos-backup-server.json

I.e. we use a fixed link text of "Rock-on guide", with target='_blank', in the Description, ideally in a prominent position: e.g. end of line.

<a href='https://rockstor.com/docs/interface/docker-based-rock-ons/bareos-backup-server.html' target='_blank'>Rock-on guide</a>

Moving to such an arrangement can later enable our rockon-validator to re-write/move this to an element of its own once we have one. If that is the way we want to go when the time comes.
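A minimal sketch of how such a link sits inside a JSON description string is shown below. The Bareos guide URL is the one quoted above; the description text and the helper name `with_guide_link` are hypothetical. Using single quotes for the HTML attributes means the anchor needs no escaping once it is embedded in a double-quoted JSON string.

```python
import json

GUIDE_LINK = (
    "<a href='https://rockstor.com/docs/interface/docker-based-rock-ons/"
    "bareos-backup-server.html' target='_blank'>Rock-on guide</a>"
)


def with_guide_link(description: str, link: str = GUIDE_LINK) -> str:
    """Append the guide link to a description unless it is already present."""
    if link in description:
        return description
    return f"{description.rstrip()} {link}"


# Placeholder description text, for illustration only.
desc = with_guide_link("Example description text.")
encoded = json.dumps({"description": desc})

# The string survives a JSON round trip unchanged ...
assert json.loads(encoded)["description"] == desc
# ... and no escaped double quotes were needed for the anchor.
assert '\\"' not in encoded
print(encoded)
```

Keeping the link text and attributes fixed in this way is also what would let a validator find and relocate the anchor later, as suggested above.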

@Hooverdan96
Member

Hooverdan96 commented Nov 19, 2025

we use a fixed link text of "Rock-on guide", with target='_blank', in the Description, ideally in a prominent position: e.g. end of line.

Agreed, what do you suggest should be the Rockon title link? Like what @kanecko originally proposed opentelemetry.io or something else?

@phillxnet
Member

@Hooverdan96 & @kanecko Re:

... what do you suggest should be the Rockon title link? Like what @kanecko originally proposed opentelemetry.io or something else?

I think the opentelemetry.io link works - it is after all how all this is tied together.

This Rock-on is shaping up to be a kind of sand-pit for trying out what could end up being our Dashboard replacement - only not reliant on docker of course.

@kanecko
Contributor Author

kanecko commented Nov 21, 2025

I've added opentelemetry.io and the rock-on guide link with the would-be URL.

@Hooverdan96
Member

Hooverdan96 commented Dec 16, 2025

Also see some comments I left for accompanying documentation in this PR.

Install test:

Links in description work as expected:
image

image
(see comment in rockstor-doc PR about harmonizing share name instructions and example here ...)
image image image

After first login prompted to change password:
image

Creating Data Source (see comments over in Rockstor documentation PR)
Importing Dashboard. Not all tiles are working it seems, but that might also be related to the fact that I’ve set up a VM to run this (it does have a network adapter configured). Any idea?

image

Following the tutorial on adding a new metric:

image

How would I investigate the network issue on my install to ensure that this is just a one-off?

Other than that I really like it!!! Especially in conjunction with the documentation
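One way to approach the network question above is to list which metric names VictoriaMetrics has actually received and see whether any network-related ones are among them. The sketch below uses the Prometheus-compatible label values endpoint. The helper names are hypothetical, the `system.network.` prefix is an assumption about how the hostmetrics network scraper names its metrics, and the sample list is illustrative rather than real output.

```python
import json
import urllib.request


def names_url(base_url: str) -> str:
    """URL of the endpoint that lists all known metric names."""
    return base_url.rstrip("/") + "/api/v1/label/__name__/values"


def with_prefix(names, prefix):
    """Return the sorted subset of names starting with prefix."""
    return sorted(n for n in names if n.startswith(prefix))


def fetch_names(base_url: str, timeout: float = 5.0):
    """Fetch all metric names (needs network access to VictoriaMetrics)."""
    with urllib.request.urlopen(names_url(base_url), timeout=timeout) as resp:
        return json.load(resp)["data"]


# Illustrative sample only; a real list would come from fetch_names(...).
sample_names = ["system.memory.utilization", "system.network.io"]
print(with_prefix(sample_names, "system.network."))
```

If `with_prefix(fetch_names("http://osk-victoria-metrics:8428"), "system.network.")` comes back empty, the collector is probably not exporting network metrics at all, which points at the collector configuration or the VM's interfaces rather than at the dashboard tiles.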

@phillxnet
Member

@kanecko Apologies for the now required rebase and ideally a squash of all your commits on this pull request. We are, bit by bit, normalising our Rock-on format and this has now been done on the root.json. Plus, ideally, the initial presentation pre-review, should have only a single commit.

@phillxnet
Member

@kanecko Re:

Possible workaround: I've used uid: -1 in the rockon, so that the binary will be run with the volume's user.
This bars us from ingesting docker stats,

Keep an eye on the following pull request, which is pending final testing and earmarked for our first 5.5.X testing-phase rpm:

It proposes a special gid value of -2 that triggers substitution of the system's docker gid during Rock-on install. Otherwise it mirrors our existing uid option, but for groups.
