Skip to content

Conversation

@lklimek
Copy link
Contributor

@lklimek lklimek commented Oct 24, 2025

Issue being fixed or feature implemented

When testing rs-dapi, we want to see some metrics generated in prometheus format.

What was done?

Extended dashmate-generated Docker containers with labels that enable Prometheus metrics.

Automatically start envoy admin endpoint when envoy metrics are enabled.

How Has This Been Tested?

Local devnet

Breaking Changes

None

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have added "!" to the title and described breaking changes in the corresponding section if my code contains any
  • I have made corresponding changes to the documentation if needed

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

Summary by CodeRabbit

  • New Features

    • Added Prometheus metrics support with automatic Docker service discovery across platform services
    • Automatic Envoy admin endpoint configuration when metrics are enabled
    • Services now advertise config name labels for metrics filtering and identification
  • Documentation

    • Added Prometheus setup guide with local monitoring examples and configuration
  • Tests

    • Added unit tests for metrics port publishing and Envoy template rendering

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 24, 2025

Walkthrough

Added Prometheus metrics discovery and scraping support to Dashmate services via Docker labels. Removed validation requiring gateway admin enablement for metrics. Added documentation for local Prometheus monitoring setup and updated service documentation to reflect automatic admin endpoint enablement when metrics are active.

Changes

Cohort / File(s) Summary
Docker Compose Service Labels
packages/dashmate/docker-compose.rate_limiter.metrics.yml, packages/dashmate/docker-compose.yml
Added four Prometheus-related labels to services: org.dashmate.config.name, prometheus.io/scrape, prometheus.io/port, prometheus.io/path. Enables automatic metrics discovery and scraping configuration without altering control flow.
Configuration & Validation Logic
packages/dashmate/src/doctor/analyse/analyseConfigFactory.js
Removed validation check that produced a HIGH-severity error when gateway metrics were enabled but gateway admin was disabled, reflecting the new automatic admin endpoint enablement.
Service Documentation
packages/dashmate/docs/config/gateway.md, packages/dashmate/docs/services/gateway.md, packages/dashmate/docs/services/index.md
Updated metrics section documentation to clarify that Dashmate automatically enables Envoy admin endpoint when metrics are on. Added information about org.dashmate.config.name label and link to Prometheus configuration guide.
Prometheus Configuration Guide
packages/dashmate/docs/prometheus/README.md, packages/dashmate/docs/prometheus/docker-compose.yml, packages/dashmate/docs/prometheus/prometheus.yml
Introduced new Prometheus monitoring documentation and setup examples. Includes Docker Compose configuration with docker-socket-proxy and Prometheus service, plus scrape configuration with Docker service discovery and relabeling rules for metrics collection.
Unit Tests
packages/dashmate/test/unit/templates/dynamicCompose.spec.js, packages/dashmate/test/unit/templates/envoyTemplate.spec.js
Added test coverage for rs-dapi metrics port publishing behavior in dynamic Compose generation and Envoy admin cluster configuration when metrics are enabled with admin disabled.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Rationale for removal: Verify that automatic admin endpoint enablement in Envoy configuration (tested in envoyTemplate.spec.js) adequately replaces the removed validation and maintains intended behavior.
  • Label consistency: Confirm Prometheus labels across all services follow the correct naming convention and port/path mappings.
  • Prometheus configuration logic: Review the scrape config relabeling rules in prometheus.yml to ensure they correctly map Docker labels to Prometheus metadata.
  • Test coverage completeness: Verify that test cases cover both enabled and disabled metrics scenarios.

Poem

🐰✨ A rabbit hops through metrics gleams,
Prometheus dreams and data streams,
Admin endpoints hop just right,
Docker labels glowing bright,
Dashmate's dashboard shines tonight! 🎯📊

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title "add prometheus service discovery labels" accurately reflects the primary technical change in the changeset. The main modifications are additions of Prometheus-related labels (prometheus.io/scrape, prometheus.io/port, prometheus.io/path, and org.dashmate.config.name) to multiple services in docker-compose.yml and docker-compose.rate_limiter.metrics.yml. While the PR also includes supporting changes such as removing a validation check, adding comprehensive Prometheus documentation, and introducing tests, the title correctly captures the core feature being implemented. The title is concise, clear, and specific enough that a teammate reviewing commit history would understand the primary change without being overly detailed.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/dashmate-prometheus-labels

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@lklimek lklimek changed the base branch from v2.1-dev to v2.2-dev October 27, 2025 11:37
@lklimek lklimek force-pushed the feat/dashmate-prometheus-labels branch from e7aac7d to 4e1c115 Compare October 28, 2025 11:56
@lklimek lklimek marked this pull request as ready for review October 29, 2025 10:15
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
packages/dashmate/test/unit/templates/envoyTemplate.spec.js (1)

20-22: Consider more robust YAML parsing for assertions.

String matching with include() is fragile and may fail if YAML formatting changes (indentation, line breaks, etc.). Consider parsing the YAML and asserting on the structure.

+import yaml from 'js-yaml';
+
 describe('envoy template', () => {
   it('should render admin interface when metrics are enabled even if admin is disabled', () => {
     const getBaseConfig = getBaseConfigFactory(HomeDir.createTemp());
     const config = getBaseConfig();
 
     config.set('platform.gateway.metrics.enabled', true);
     config.set('platform.gateway.admin.enabled', false);
 
     const renderTemplate = renderTemplateFactory();
     const renderServiceTemplates = renderServiceTemplatesFactory(renderTemplate);
     const renderedConfigs = renderServiceTemplates(config);
 
     const envoyConfig = renderedConfigs['platform/gateway/envoy.yaml'];
+    const parsed = yaml.load(envoyConfig);
+    
+    const adminCluster = parsed.static_resources.clusters.find(c => c.name === 'admin');
+    expect(adminCluster).to.exist;
+    expect(adminCluster.load_assignment.endpoints[0].lb_endpoints[0].endpoint.address.socket_address.address).to.equal('127.0.0.1');
+    expect(adminCluster.load_assignment.endpoints[0].lb_endpoints[0].endpoint.address.socket_address.port_value).to.equal(9901);
-    expect(envoyConfig).to.include('cluster_name: admin');
-    expect(envoyConfig).to.include('address: 127.0.0.1');
-    expect(envoyConfig).to.include('port_value: 9901');
   });
 });
packages/dashmate/test/unit/templates/dynamicCompose.spec.js (2)

28-28: Clarify the purpose of checking for ':0'.

The assertion expect(rsDapiBlock).to.not.include(':0') is unclear. What specific configuration are you checking for? Consider adding a comment or using a more explicit assertion.

If this is checking that no port mapping exists with port 0, consider making it explicit:

     expect(rsDapiBlock).to.not.include('ports:');
-    expect(rsDapiBlock).to.not.include(':0');
+    // Verify no port mappings are present (no format like 'host:port:container')
+    expect(rsDapiBlock).to.not.match(/\d+:\d+:\d+/);

41-42: Verify port exposure format consistency.

Line 41 checks for the full port mapping format, but line 42 checks for a port entry that might match other content. Consider whether both assertions are necessary or if line 42 should be more specific.

     expect(rsDapiBlock).to.include('ports:\n      - 127.0.0.1:29091:29091');
-    expect(rsDapiBlock).to.include('- 29091');
+    // The port should also be listed in the expose section
+    expect(rsDapiBlock).to.match(/expose:\s+- 29091/);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a8994c7 and 62337dd.

⛔ Files ignored due to path filters (3)
  • .yarn/cache/fsevents-patch-19706e7e35-10.zip is excluded by !**/.yarn/**, !**/*.zip
  • packages/dashmate/templates/dynamic-compose.yml.dot is excluded by !**/*.dot
  • packages/dashmate/templates/platform/gateway/envoy.yaml.dot is excluded by !**/*.dot
📒 Files selected for processing (8)
  • packages/dashmate/docker-compose.rate_limiter.metrics.yml (1 hunks)
  • packages/dashmate/docker-compose.yml (4 hunks)
  • packages/dashmate/docs/config/gateway.md (1 hunks)
  • packages/dashmate/docs/services/gateway.md (1 hunks)
  • packages/dashmate/docs/services/index.md (1 hunks)
  • packages/dashmate/src/doctor/analyse/analyseConfigFactory.js (0 hunks)
  • packages/dashmate/test/unit/templates/dynamicCompose.spec.js (1 hunks)
  • packages/dashmate/test/unit/templates/envoyTemplate.spec.js (1 hunks)
💤 Files with no reviewable changes (1)
  • packages/dashmate/src/doctor/analyse/analyseConfigFactory.js
🧰 Additional context used
📓 Path-based instructions (1)
packages/**/**/*.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

packages/**/**/*.{js,ts,jsx,tsx}: Adhere to ESLint with Airbnb/TypeScript configs for JS/TS code
Use camelCase for JS/TS variables and functions
Use PascalCase for JS/TS classes
Prefer kebab-case filenames within JS packages

Files:

  • packages/dashmate/test/unit/templates/envoyTemplate.spec.js
  • packages/dashmate/test/unit/templates/dynamicCompose.spec.js
🧬 Code graph analysis (2)
packages/dashmate/test/unit/templates/envoyTemplate.spec.js (3)
packages/dashmate/configs/defaults/getBaseConfigFactory.js (1)
  • getBaseConfigFactory (16-480)
packages/dashmate/src/templates/renderTemplateFactory.js (1)
  • renderTemplateFactory (8-30)
packages/dashmate/src/templates/renderServiceTemplatesFactory.js (1)
  • renderServiceTemplatesFactory (7-37)
packages/dashmate/test/unit/templates/dynamicCompose.spec.js (3)
packages/dashmate/configs/defaults/getBaseConfigFactory.js (1)
  • getBaseConfigFactory (16-480)
packages/dashmate/src/templates/renderTemplateFactory.js (1)
  • renderTemplateFactory (8-30)
packages/dashmate/src/templates/renderServiceTemplatesFactory.js (1)
  • renderServiceTemplatesFactory (7-37)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build Docker images (RS-DAPI, rs-dapi, rs-dapi) / Build RS-DAPI image
  • GitHub Check: Build Docker images (Drive, drive, drive-abci) / Build Drive image
  • GitHub Check: Build Docker images (DAPI, dapi, dapi) / Build DAPI image
  • GitHub Check: Build Docker images (Dashmate helper, dashmate-helper, dashmate-helper) / Build Dashmate helper image
  • GitHub Check: Build JS packages / Build JS
  • GitHub Check: Rust crates security audit
🔇 Additional comments (8)
packages/dashmate/docs/services/index.md (1)

122-122: LGTM! Clear documentation of the config name label.

The addition clearly explains the purpose of the org.dashmate.config.name label and provides useful context for multi-node scenarios where distinguishing instances is important.

packages/dashmate/docs/config/gateway.md (1)

57-57: LGTM! Clear explanation of the admin/metrics interaction.

The documentation clearly explains that the admin endpoint is automatically enabled for metrics proxying while clarifying the exposure behavior based on the admin setting.

packages/dashmate/docker-compose.yml (4)

57-60: LGTM! Prometheus labels added correctly to drive_abci.

The labels follow Prometheus service discovery conventions and properly reference environment variables with appropriate defaults.


118-121: LGTM! Prometheus labels added correctly to drive_tenderdash.

The labels are consistent with other services and properly configured.


207-210: LGTM! Prometheus labels added correctly to rs_dapi.

The labels follow the same pattern as other services and are properly configured.


251-254: LGTM! Prometheus labels added correctly to gateway.

The labels are consistent with other services and properly configured for Prometheus service discovery.

packages/dashmate/docker-compose.rate_limiter.metrics.yml (1)

15-18: LGTM! Prometheus labels added correctly to rate limiter metrics.

The labels are consistent with the main docker-compose.yml changes and follow the same pattern for Prometheus service discovery.

packages/dashmate/docs/services/gateway.md (1)

264-264: LGTM! Clear documentation of admin/metrics behavior.

The note clearly explains the automatic admin endpoint enablement and exposure behavior, consistent with the configuration documentation.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/prometheus/docker-compose.yml (1)

5-5: Pin container image versions instead of using latest tag.

Using latest image tags makes deployments non-reproducible and can introduce unexpected breaking changes when new image versions are released. For documentation examples, pinning specific versions ensures consistent behavior across different deployments and time periods.

Apply this diff to pin specific versions:

-    image: tecnativa/docker-socket-proxy:latest
+    image: tecnativa/docker-socket-proxy:0.5.0
-    image: prom/prometheus:latest
+    image: prom/prometheus:v2.52.0

Consider documenting which versions were tested with this setup, or providing a note on how to find the latest stable versions.

Also applies to: 17-17

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62337dd and 57a0a67.

📒 Files selected for processing (3)
  • docs/prometheus/README.md (1 hunks)
  • docs/prometheus/docker-compose.yml (1 hunks)
  • docs/prometheus/prometheus.yml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/prometheus/README.md
🧰 Additional context used
📓 Path-based instructions (1)
docs/**

📄 CodeRabbit inference engine (AGENTS.md)

Store project documentation in docs/

Files:

  • docs/prometheus/prometheus.yml
  • docs/prometheus/docker-compose.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build Docker images (Drive, drive, drive-abci) / Build Drive image
  • GitHub Check: Build Docker images (DAPI, dapi, dapi) / Build DAPI image
  • GitHub Check: Build Docker images (RS-DAPI, rs-dapi, rs-dapi) / Build RS-DAPI image
  • GitHub Check: Build Docker images (Dashmate helper, dashmate-helper, dashmate-helper) / Build Dashmate helper image
  • GitHub Check: Build JS packages / Build JS
  • GitHub Check: Rust crates security audit
🔇 Additional comments (5)
docs/prometheus/docker-compose.yml (2)

19-19: Clarify network_mode: host design decision.

Using network_mode: host makes the Prometheus container bind directly to the host network, which means Prometheus will listen on port 9080 across all host interfaces (not just the container's network). This can affect portability and security posture in multi-container environments.

Verify this is the intended behavior; if Prometheus should only be accessible locally, consider using bridge networking with explicit port bindings like ports: ["127.0.0.1:9080:9080"].


1-33: Good security practices for Docker socket access.

The docker_socket_proxy setup correctly implements principle of least privilege by:

  • Mounting the Docker socket read-only
  • Setting minimal environment permissions (CONTAINERS=1, NETWORKS=1)
  • Binding to localhost only (127.0.0.1:2375)

This is a solid foundation for local service discovery.

docs/prometheus/prometheus.yml (3)

20-24: Address replacement hardcodes 127.0.0.1 for all targets.

The relabeling rule replaces the scrape address to always use 127.0.0.1:<port>. This assumes all services expose metrics on localhost, which may not hold for services configured with different network modes or multi-host setups.

Verify that all Dashmate services are configured to be reachable on 127.0.0.1 on their respective prometheus.io/port labels. If some services expose metrics differently, you may need conditional relabeling or additional configuration.


14-30: Relabeling rules assume source labels are present.

The relabeling rules for prometheus.io/path, prometheus.io/port, org.dashmate.service_title, and org.dashmate.config_name use regex-based extraction without defaults or fallback behavior. If any of these labels are missing from a container, the rules will silently fail to apply, potentially leaving targets misconfigured or unlabeled.

Verify that all services intended for scraping include the required Docker labels, or add explicit relabeling rules with defaults (e.g., using target_label without source to set defaults).


1-30: Configuration structure is sound for Docker service discovery.

The Prometheus configuration correctly:

  • Uses Docker daemon at the docker_socket_proxy endpoint
  • Implements opt-in scraping via prometheus.io/scrape=true label filter
  • Maps container labels to Prometheus target labels for observability (service, config)
  • Allows per-service customization of metrics path and port

This provides a good foundation for discovering and scraping metrics from Dashmate services.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
packages/dashmate/docs/prometheus/docker-compose.yml (2)

18-18: Clarify container naming convention.

The container name prometheus-rsdapi is unclear and inconsistent with the service focus on dashmate. Consider renaming to prometheus-dashmate or simply prometheus for clarity.

Apply this diff to improve naming clarity:

  prometheus:
-   container_name: prometheus-rsdapi
+   container_name: prometheus-dashmate

19-19: Document or reconsider network_mode: host security implications.

Using network_mode: host exposes Prometheus on all network interfaces (by default, port 9080). For a local development setup, this may be acceptable, but consider:

  • Adding a note in accompanying documentation about security implications
  • Alternatively, binding to 127.0.0.1:9080 explicitly in the command args for defense-in-depth

If restricting to localhost is desired, you could adjust the command to bind explicitly:

    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      - --web.enable-lifecycle
-     - --web.listen-address=:9080
+     - --web.listen-address=127.0.0.1:9080

Alternatively, document this as a local-development-only setup in a README or comment.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57a0a67 and 1a6e248.

📒 Files selected for processing (4)
  • packages/dashmate/docs/prometheus/README.md (1 hunks)
  • packages/dashmate/docs/prometheus/docker-compose.yml (1 hunks)
  • packages/dashmate/docs/prometheus/prometheus.yml (1 hunks)
  • packages/dashmate/docs/services/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • packages/dashmate/docs/prometheus/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/dashmate/docs/services/index.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build Docker images (RS-DAPI, rs-dapi, rs-dapi) / Build RS-DAPI image
  • GitHub Check: Build Docker images (Drive, drive, drive-abci) / Build Drive image
  • GitHub Check: Build Docker images (Dashmate helper, dashmate-helper, dashmate-helper) / Build Dashmate helper image
  • GitHub Check: Build JS packages / Build JS
  • GitHub Check: Rust crates security audit
  • GitHub Check: Determine changed packages
🔇 Additional comments (1)
packages/dashmate/docs/prometheus/prometheus.yml (1)

1-30: Configuration looks good for local development Docker SD setup.

The Prometheus configuration correctly sets up Docker service discovery with appropriate relabeling rules to map container labels to scrape targets. The hardcoded 127.0.0.1:2375 is suitable for the local socket proxy documented in the accompanying docker-compose.yml. The relabeling rules cleanly map the expected label structure (prometheus_io_, org.dashmate_) to Prometheus targets and labels.

const problems = [];

if (config?.get('platform.enable')) {
// Gateway admin is disabled while metrics are enabled
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you remove this check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

port_value: 8081
{{?}}
{{? it.platform.gateway.metrics.enabled && it.platform.gateway.admin.enabled }}
{{? it.platform.gateway.metrics.enabled }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you remove this condition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metrics without admin don't work.

Assuming the intention of an user who enabled metrics is to get them really working ;), I've changed logic so that when metrics are enabled, always enable admin automatically but don't expose it to public.

See line 546-550 below. When admin is not enabled, adminHost will be 127.0.0.1, what means it will listen on loopback of docker container (not docker host).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants