
Commit 18cfc86

bentonam, petewall, and github-actions[bot] authored
Added Mimir Integration (#1136)
* Added Mimir Integration
* updated meta-monitoring example
* Update charts/k8s-monitoring/charts/feature-integrations/integrations/mimir-values.yaml
  Co-authored-by: Pete Wall <[email protected]>
* Update charts/k8s-monitoring/charts/feature-integrations/integrations/mimir-values.yaml
  Co-authored-by: Pete Wall <[email protected]>
* Updated docs
* Doc improvements to link to other charts and reference resource requests and limits (#1131)
* Doc improvements to link to other charts and reference resource requests and limits
  Signed-off-by: Pete Wall <[email protected]>
* Fix yaml lint issues
  Signed-off-by: Pete Wall <[email protected]>

---------

Signed-off-by: Pete Wall <[email protected]>

* Update Update dependency "beyla" for Helm chart "feature-auto-instrumentation" to 1.6.3 (#1134)
  Signed-off-by: Pete Wall <[email protected]>
  Co-authored-by: petewall <[email protected]>
* Update Update dependency "beyla" for Helm chart "k8s-monitoring-v1" to 1.6.3 (#1135)
  Signed-off-by: Pete Wall <[email protected]>
  Co-authored-by: petewall <[email protected]>
* Add permissions for the internal directory
  Signed-off-by: Pete Wall <[email protected]>
* Actually use certFile and keyFile settings (#1143)
  Signed-off-by: Pete Wall <[email protected]>
* Bump versions to 1.6.21 and 2.0.4
  Signed-off-by: Pete Wall <[email protected]>
* Rebuilt

---------

Signed-off-by: Pete Wall <[email protected]>
Co-authored-by: Pete Wall <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: petewall <[email protected]>
1 parent 3470186 commit 18cfc86

24 files changed, +4089 -28 lines changed

charts/k8s-monitoring/charts/feature-integrations/README.md

+6
@@ -118,6 +118,12 @@ Be sure perform actual integration testing in a live environment in the main [k8
 |-----|------|---------|-------------|
 | loki | object | `{"instances":[]}` | Scrape metrics/logs from Loki |

+### Integration: Mimir
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| mimir | object | `{"instances":[]}` | Scrape metrics/logs from Mimir |
+
 ### Integration: MySQL

 | Key | Type | Default | Description |
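
As a rough illustration of how the new `mimir` key might be populated, the fragment below adds one entry to the default empty `instances` list. It is a minimal sketch at the level of the `feature-integrations` chart: the instance name `mimir-prod` is a made-up placeholder, and where this block nests inside a parent chart's values is not shown in this diff.

```yaml
# Hypothetical values fragment for the feature-integrations chart.
mimir:
  instances:
    # "mimir-prod" is an illustrative name, not taken from the commit.
    - name: mimir-prod
```

With no other keys set, such an instance would presumably fall back to the defaults listed in the integration's own README, such as the `app.kubernetes.io/name: mimir` label selector.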
@@ -0,0 +1,286 @@
+---
+# The set of metrics from Grafana Mimir required for the Grafana Mimir integration
+- cortex_alertmanager_alerts
+- cortex_alertmanager_alerts_invalid_total
+- cortex_alertmanager_alerts_received_total
+- cortex_alertmanager_dispatcher_aggregation_groups
+- cortex_alertmanager_notification_latency_seconds_bucket
+- cortex_alertmanager_notification_latency_seconds_count
+- cortex_alertmanager_notification_latency_seconds_sum
+- cortex_alertmanager_notifications_failed_total
+- cortex_alertmanager_notifications_total
+- cortex_alertmanager_partial_state_merges_failed_total
+- cortex_alertmanager_partial_state_merges_total
+- cortex_alertmanager_ring_check_errors_total
+- cortex_alertmanager_silences
+- cortex_alertmanager_state_fetch_replica_state_failed_total
+- cortex_alertmanager_state_fetch_replica_state_total
+- cortex_alertmanager_state_initial_sync_completed_total
+- cortex_alertmanager_state_initial_sync_duration_seconds_bucket
+- cortex_alertmanager_state_initial_sync_duration_seconds_count
+- cortex_alertmanager_state_initial_sync_duration_seconds_sum
+- cortex_alertmanager_state_persist_failed_total
+- cortex_alertmanager_state_persist_total
+- cortex_alertmanager_state_replication_failed_total
+- cortex_alertmanager_state_replication_total
+- cortex_alertmanager_sync_configs_failed_total
+- cortex_alertmanager_sync_configs_total
+- cortex_alertmanager_tenants_discovered
+- cortex_alertmanager_tenants_owned
+- cortex_blockbuilder_consume_cycle_duration_seconds
+- cortex_blockbuilder_consumer_lag_records
+- cortex_blockbuilder_tsdb_compact_and_upload_failed_total
+- cortex_bucket_blocks_count
+- cortex_bucket_index_estimated_compaction_jobs
+- cortex_bucket_index_estimated_compaction_jobs_errors_total
+- cortex_bucket_index_last_successful_update_timestamp_seconds
+- cortex_bucket_store_block_drop_failures_total
+- cortex_bucket_store_block_drops_total
+- cortex_bucket_store_block_load_failures_total
+- cortex_bucket_store_block_loads_total
+- cortex_bucket_store_blocks_loaded
+- cortex_bucket_store_indexheader_lazy_load_duration_seconds_bucket
+- cortex_bucket_store_indexheader_lazy_load_duration_seconds_count
+- cortex_bucket_store_indexheader_lazy_load_duration_seconds_sum
+- cortex_bucket_store_indexheader_lazy_load_total
+- cortex_bucket_store_indexheader_lazy_unload_total
+- cortex_bucket_store_series_batch_preloading_load_duration_seconds_sum
+- cortex_bucket_store_series_batch_preloading_wait_duration_seconds_sum
+- cortex_bucket_store_series_blocks_queried_sum
+- cortex_bucket_store_series_data_size_fetched_bytes_sum
+- cortex_bucket_store_series_data_size_touched_bytes_sum
+- cortex_bucket_store_series_hash_cache_hits_total
+- cortex_bucket_store_series_hash_cache_requests_total
+- cortex_bucket_store_series_request_stage_duration_seconds_bucket
+- cortex_bucket_store_series_request_stage_duration_seconds_count
+- cortex_bucket_store_series_request_stage_duration_seconds_sum
+- cortex_bucket_stores_blocks_last_successful_sync_timestamp_seconds
+- cortex_bucket_stores_gate_duration_seconds_bucket
+- cortex_bucket_stores_gate_duration_seconds_count
+- cortex_bucket_stores_gate_duration_seconds_sum
+- cortex_bucket_stores_tenants_synced
+- cortex_build_info
+- cortex_cache_memory_hits_total
+- cortex_cache_memory_requests_total
+- cortex_compactor_block_cleanup_failures_total
+- cortex_compactor_block_cleanup_last_successful_run_timestamp_seconds
+- cortex_compactor_block_max_time_delta_seconds_bucket
+- cortex_compactor_block_max_time_delta_seconds_count
+- cortex_compactor_block_max_time_delta_seconds_sum
+- cortex_compactor_blocks_cleaned_total
+- cortex_compactor_blocks_marked_for_deletion_total
+- cortex_compactor_blocks_marked_for_no_compaction_total
+- cortex_compactor_disk_out_of_space_errors_total
+- cortex_compactor_group_compaction_runs_started_total
+- cortex_compactor_last_successful_run_timestamp_seconds
+- cortex_compactor_meta_sync_duration_seconds_bucket
+- cortex_compactor_meta_sync_duration_seconds_count
+- cortex_compactor_meta_sync_duration_seconds_sum
+- cortex_compactor_meta_sync_failures_total
+- cortex_compactor_meta_syncs_total
+- cortex_compactor_runs_completed_total
+- cortex_compactor_runs_failed_total
+- cortex_compactor_runs_started_total
+- cortex_compactor_tenants_discovered
+- cortex_compactor_tenants_processing_failed
+- cortex_compactor_tenants_processing_succeeded
+- cortex_compactor_tenants_skipped
+- cortex_config_hash
+- cortex_discarded_exemplars_total
+- cortex_discarded_requests_total
+- cortex_discarded_samples_total
+- cortex_distributor_deduped_samples_total
+- cortex_distributor_exemplars_in_total
+- cortex_distributor_inflight_push_requests
+- cortex_distributor_instance_limits
+- cortex_distributor_instance_rejected_requests_total
+- cortex_distributor_latest_seen_sample_timestamp_seconds
+- cortex_distributor_non_ha_samples_received_total
+- cortex_distributor_received_exemplars_total
+- cortex_distributor_received_requests_total
+- cortex_distributor_received_samples_total
+- cortex_distributor_replication_factor
+- cortex_distributor_requests_in_total
+- cortex_distributor_samples_in_total
+- cortex_inflight_requests
+- cortex_ingest_storage_reader_buffered_fetched_records
+- cortex_ingest_storage_reader_fetch_errors_total
+- cortex_ingest_storage_reader_fetches_total
+- cortex_ingest_storage_reader_missed_records_total
+- cortex_ingest_storage_reader_offset_commit_failures_total
+- cortex_ingest_storage_reader_offset_commit_requests_total
+- cortex_ingest_storage_reader_read_errors_total
+- cortex_ingest_storage_reader_receive_delay_seconds_count
+- cortex_ingest_storage_reader_receive_delay_seconds_sum
+- cortex_ingest_storage_reader_records_failed_total
+- cortex_ingest_storage_reader_records_total
+- cortex_ingest_storage_reader_requests_failed_total
+- cortex_ingest_storage_reader_requests_total
+- cortex_ingest_storage_strong_consistency_failures_total
+- cortex_ingest_storage_strong_consistency_requests_total
+- cortex_ingest_storage_writer_buffered_produce_bytes
+- cortex_ingest_storage_writer_buffered_produce_bytes_limit
+- cortex_ingester_active_native_histogram_buckets
+- cortex_ingester_active_native_histogram_buckets_custom_tracker
+- cortex_ingester_active_native_histogram_series
+- cortex_ingester_active_native_histogram_series_custom_tracker
+- cortex_ingester_active_series
+- cortex_ingester_active_series_custom_tracker
+- cortex_ingester_client_request_duration_seconds_bucket
+- cortex_ingester_client_request_duration_seconds_count
+- cortex_ingester_client_request_duration_seconds_sum
+- cortex_ingester_ingested_exemplars_total
+- cortex_ingester_ingested_samples_total
+- cortex_ingester_instance_limits
+- cortex_ingester_instance_rejected_requests_total
+- cortex_ingester_local_limits
+- cortex_ingester_memory_series
+- cortex_ingester_memory_series_created_total
+- cortex_ingester_memory_series_removed_total
+- cortex_ingester_memory_users
+- cortex_ingester_oldest_unshipped_block_timestamp_seconds
+- cortex_ingester_owned_series
+- cortex_ingester_queried_exemplars_bucket
+- cortex_ingester_queried_exemplars_count
+- cortex_ingester_queried_exemplars_sum
+- cortex_ingester_queried_samples_bucket
+- cortex_ingester_queried_samples_count
+- cortex_ingester_queried_samples_sum
+- cortex_ingester_queried_series_bucket
+- cortex_ingester_queried_series_count
+- cortex_ingester_queried_series_sum
+- cortex_ingester_shipper_last_successful_upload_timestamp_seconds
+- cortex_ingester_shipper_upload_failures_total
+- cortex_ingester_shipper_uploads_total
+- cortex_ingester_tsdb_checkpoint_creations_failed_total
+- cortex_ingester_tsdb_checkpoint_creations_total
+- cortex_ingester_tsdb_checkpoint_deletions_failed_total
+- cortex_ingester_tsdb_compaction_duration_seconds_bucket
+- cortex_ingester_tsdb_compaction_duration_seconds_count
+- cortex_ingester_tsdb_compaction_duration_seconds_sum
+- cortex_ingester_tsdb_compactions_failed_total
+- cortex_ingester_tsdb_compactions_total
+- cortex_ingester_tsdb_exemplar_exemplars_appended_total
+- cortex_ingester_tsdb_exemplar_exemplars_in_storage
+- cortex_ingester_tsdb_exemplar_last_exemplars_timestamp_seconds
+- cortex_ingester_tsdb_exemplar_series_with_exemplars_in_storage
+- cortex_ingester_tsdb_head_max_timestamp_seconds
+- cortex_ingester_tsdb_head_truncations_failed_total
+- cortex_ingester_tsdb_mmap_chunk_corruptions_total
+- cortex_ingester_tsdb_out_of_order_samples_appended_total
+- cortex_ingester_tsdb_storage_blocks_bytes
+- cortex_ingester_tsdb_symbol_table_size_bytes
+- cortex_ingester_tsdb_wal_corruptions_total
+- cortex_ingester_tsdb_wal_truncate_duration_seconds_count
+- cortex_ingester_tsdb_wal_truncate_duration_seconds_sum
+- cortex_ingester_tsdb_wal_truncations_failed_total
+- cortex_ingester_tsdb_wal_truncations_total
+- cortex_ingester_tsdb_wal_writes_failed_total
+- cortex_kv_request_duration_seconds_bucket
+- cortex_kv_request_duration_seconds_count
+- cortex_kv_request_duration_seconds_sum
+- cortex_lifecycler_read_only
+- cortex_limits_defaults
+- cortex_limits_overrides
+- cortex_partition_ring_partitions
+- cortex_prometheus_notifications_dropped_total
+- cortex_prometheus_notifications_errors_total
+- cortex_prometheus_notifications_queue_capacity
+- cortex_prometheus_notifications_queue_length
+- cortex_prometheus_notifications_sent_total
+- cortex_prometheus_rule_evaluation_duration_seconds_count
+- cortex_prometheus_rule_evaluation_duration_seconds_sum
+- cortex_prometheus_rule_evaluation_failures_total
+- cortex_prometheus_rule_evaluations_total
+- cortex_prometheus_rule_group_duration_seconds_count
+- cortex_prometheus_rule_group_duration_seconds_sum
+- cortex_prometheus_rule_group_iterations_missed_total
+- cortex_prometheus_rule_group_iterations_total
+- cortex_prometheus_rule_group_rules
+- cortex_querier_blocks_consistency_checks_failed_total
+- cortex_querier_blocks_consistency_checks_total
+- cortex_querier_request_duration_seconds_bucket
+- cortex_querier_request_duration_seconds_count
+- cortex_querier_request_duration_seconds_sum
+- cortex_querier_storegateway_instances_hit_per_query_bucket
+- cortex_querier_storegateway_instances_hit_per_query_count
+- cortex_querier_storegateway_instances_hit_per_query_sum
+- cortex_querier_storegateway_refetches_per_query_bucket
+- cortex_querier_storegateway_refetches_per_query_count
+- cortex_querier_storegateway_refetches_per_query_sum
+- cortex_query_frontend_queries_total
+- cortex_query_frontend_queue_duration_seconds_bucket
+- cortex_query_frontend_queue_duration_seconds_count
+- cortex_query_frontend_queue_duration_seconds_sum
+- cortex_query_frontend_queue_length
+- cortex_query_frontend_retries_bucket
+- cortex_query_frontend_retries_count
+- cortex_query_frontend_retries_sum
+- cortex_query_scheduler_connected_querier_clients
+- cortex_query_scheduler_querier_inflight_requests
+- cortex_query_scheduler_queue_duration_seconds_bucket
+- cortex_query_scheduler_queue_duration_seconds_count
+- cortex_query_scheduler_queue_duration_seconds_sum
+- cortex_query_scheduler_queue_length
+- cortex_request_duration_seconds
+- cortex_request_duration_seconds_bucket
+- cortex_request_duration_seconds_count
+- cortex_request_duration_seconds_sum
+- cortex_ring_members
+- cortex_ruler_managers_total
+- cortex_ruler_queries_failed_total
+- cortex_ruler_queries_total
+- cortex_ruler_ring_check_errors_total
+- cortex_ruler_write_requests_failed_total
+- cortex_ruler_write_requests_total
+- cortex_runtime_config_hash
+- cortex_runtime_config_last_reload_successful
+- cortex_tcp_connections
+- cortex_tcp_connections_limit
+- go_memstats_heap_inuse_bytes
+- keda_scaler_errors
+- keda_scaler_metrics_value
+- kube_deployment_spec_replicas
+- kube_deployment_status_replicas_unavailable
+- kube_deployment_status_replicas_updated
+- kube_endpoint_address
+- kube_horizontalpodautoscaler_spec_target_metric
+- kube_horizontalpodautoscaler_status_condition
+- kube_pod_info
+- kube_statefulset_replicas
+- kube_statefulset_status_current_revision
+- kube_statefulset_status_replicas_current
+- kube_statefulset_status_replicas_ready
+- kube_statefulset_status_replicas_updated
+- kube_statefulset_status_update_revision
+- kubelet_volume_stats_capacity_bytes
+- kubelet_volume_stats_used_bytes
+- memberlist_client_cluster_members_count
+- memcached_limit_bytes
+- mimir_continuous_test_queries_failed_total
+- mimir_continuous_test_query_result_checks_failed_total
+- mimir_continuous_test_writes_failed_total
+- node_disk_read_bytes_total
+- node_disk_written_bytes_total
+- process_memory_map_areas
+- process_memory_map_areas_limit
+- prometheus_tsdb_compaction_duration_seconds_bucket
+- prometheus_tsdb_compaction_duration_seconds_count
+- prometheus_tsdb_compaction_duration_seconds_sum
+- prometheus_tsdb_compactions_total
+- rollout_operator_last_successful_group_reconcile_timestamp_seconds
+- thanos_cache_hits_total
+- thanos_cache_operation_duration_seconds_bucket
+- thanos_cache_operation_duration_seconds_count
+- thanos_cache_operation_duration_seconds_sum
+- thanos_cache_operation_failures_total
+- thanos_cache_operations_total
+- thanos_cache_requests_total
+- thanos_objstore_bucket_last_successful_upload_time
+- thanos_objstore_bucket_operation_duration_seconds_bucket
+- thanos_objstore_bucket_operation_duration_seconds_count
+- thanos_objstore_bucket_operation_duration_seconds_sum
+- thanos_objstore_bucket_operation_failures_total
+- thanos_objstore_bucket_operations_total
+- thanos_store_index_cache_hits_total
+- thanos_store_index_cache_requests_total
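
The list above reads as the default allow list for the integration. As a hedged sketch, an instance could adjust it through the per-instance `metrics.tuning` keys that the integration's README in this commit documents. The instance name and both regular expressions are illustrative assumptions, and exactly how the chart merges `includeMetrics` and `excludeMetrics` with the default list is not shown in this diff.

```yaml
# Hypothetical values fragment; only the key names come from the commit.
mimir:
  instances:
    - name: mimir-prod                          # illustrative instance name
      metrics:
        tuning:
          useDefaultAllowList: true             # chart default: filter to the list above
          includeMetrics:
            - "cortex_ingester_tsdb_head_.*"    # assumed regex for extra metrics to keep
          excludeMetrics:
            - "thanos_cache_.*"                 # assumed regex for metrics to drop
```

Quoting the patterns keeps YAML from interpreting any special characters in the regular expressions.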
@@ -0,0 +1,45 @@
+# mimir
+
+## Values
+
+### Discovery Settings
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| fieldSelectors | list | `[]` | Discover Mimir instances based on field selectors. |
+| labelSelectors | object | `{"app.kubernetes.io/name":"mimir"}` | Discover Mimir instances based on label selectors. |
+| namespaces | list | `[]` | Namespaces to look for Mimir instances in. Will automatically look for Mimir instances in all namespaces unless specified here. |
+
+### Logs Settings
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| logs.enabled | bool | `true` | Whether to enable special processing of Mimir pod logs. |
+| logs.tuning.dropLogLevels | list | `[]` | The log levels to drop. Will automatically keep all log levels unless specified here. |
+| logs.tuning.excludeLines | list | `[]` | Line patterns (valid RE2 regular expressions) to exclude from the logs. |
+| logs.tuning.scrubTimestamp | bool | `true` | Whether the timestamp should be scrubbed from the log line. |
+| logs.tuning.structuredMetadata | object | `{}` | The structured metadata mappings to set. To not set any structured metadata, set this to an empty object (e.g. `{}`). |
+| logs.tuning.timestampFormat | string | `"RFC3339Nano"` | The timestamp format to use for the log line. If not set, the default timestamp, which is the collection time, is used for the log line. |
+
+### Metrics Settings
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| metrics.enabled | bool | `true` | Whether to enable metrics collection from Mimir. |
+| metrics.portName | string | `"http-metrics"` | Name of the port to scrape metrics from. |
+| metrics.scrapeInterval | string | `60s` | How frequently to scrape metrics from Mimir. |
+
+### Metric Processing Settings
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| metrics.maxCacheSize | string | `100000` | Sets the max_cache_size for prometheus.relabel component. This should be at least 2x-5x your largest scrape target or samples appended rate. ([docs](https://grafana.com/docs/alloy/latest/reference/components/prometheus.relabel/#arguments)) Overrides global.maxCacheSize. |
+| metrics.tuning.excludeMetrics | list | `[]` | Metrics to drop. Can use regular expressions. |
+| metrics.tuning.includeMetrics | list | `[]` | Metrics to keep. Can use regular expressions. |
+| metrics.tuning.useDefaultAllowList | bool | `true` | Filter the list of metrics from Grafana Mimir to the minimal set required for the Grafana Mimir integration. |
+
+### General Settings
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| name | string | `""` | Name for this Mimir instance. |
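
To make the tables above more concrete, here is a minimal sketch of one instance that touches the discovery, logs, and metrics settings together. It assumes these per-instance keys nest under an entry of `mimir.instances`; the instance name, namespace, log level, exclude pattern, and the non-default interval and cache size are all illustrative values rather than anything taken from the commit.

```yaml
# Hypothetical values fragment; key names and defaults come from the tables above.
mimir:
  instances:
    - name: mimir-prod                     # illustrative instance name
      namespaces:
        - mimir                            # assumed namespace; [] searches all namespaces
      labelSelectors:
        app.kubernetes.io/name: mimir      # chart default
      logs:
        enabled: true
        tuning:
          dropLogLevels:
            - debug                        # assumed level name
          excludeLines:
            - ".*GET /ready.*"             # hypothetical RE2 pattern
          scrubTimestamp: true             # chart default
      metrics:
        enabled: true
        portName: http-metrics             # chart default
        scrapeInterval: 30s                # illustrative; the default is 60s
        maxCacheSize: 200000               # illustrative; the default is 100000
```

Keys left out of an instance, such as `fieldSelectors` or `logs.tuning.structuredMetadata`, would presumably keep the defaults shown in the tables.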
