Commit ca4c313

Add multi-target support (#1063)
* Add multi-target support
* Update example-prometheus.yml
* Make `es.uri` optional by setting default to empty string; check if it's empty and if so, don't parse it
* Update README.md
* Add sanity target scheme validation
* Change yaml package to go.yaml.in/yaml/v3
* Update yaml package to go.yaml.in/yaml/v3
* Update CHANGELOG.md
* Remove whitespaces from README.md
* Add testing for apikey authentication module. Update examples/auth_modules.yml. Fix main.go to apply userpass credentials only if the module type is explicitly set to userpass.
* Add load-time validation for the auth module config file during startup. Keep light-weight validation for the probe params during runtime. Add AWS SigV4 authentication module support.
* Expose error in the logger
* Add TLS config per target support. Add TLS config validation. Update config test to include TLS config.
* Indices and Shards collectors now fetch cluster_name once from GET / when no clusterinfo retriever is attached, avoiding the previous "unknown_cluster" label.
* Removed the special-case logic that redirected /metrics?target= requests to /probe. Updated auth_modules.yml to include AWS SigV4 signing and mTLS support.
* Add license headers to all new files
* Fixes for relative paths in multi-target mode
* Bump github.com/prometheus/client_golang from 1.22.0 to 1.23.0 (#1065)

  Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.22.0 to 1.23.0.
  - [Release notes](https://github.com/prometheus/client_golang/releases)
  - [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
  - [Commits](prometheus/client_golang@v1.22.0...v1.23.0)

  ---
  updated-dependencies:
  - dependency-name: github.com/prometheus/client_golang
    dependency-version: 1.23.0
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...

  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add target schema validation, http/https only. Add tls auth type support in multi-target mode. Update README.md, examples/auth_modules.yml, tests.
* Cleanup
* Fix tls auth type validation
* Remove aws.region validation
* Add temp file cleanup in config_test.go
* Add copyright header to config_test.go
* Add version metric to the per-probe registry. Update roundtripper.go to use region from config or environment resolver if not provided in config file (AWS_REGION). Update probe.go to accept module even if region omitted; environment resolver can provide it. Update config.go to use region as optional field. Update main.go to use region from config or environment resolver if not provided in config file (AWS_REGION).

---------

Signed-off-by: pincher95 <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Yuri Tsuprun <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
1 parent 5ceff33 commit ca4c313

14 files changed: +974 -103 lines changed

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,7 +1,12 @@
 ## master / unreleased

+### Added
+- Multi-target scraping via `/probe` endpoint with optional auth modules (compatible with postgres_exporter style) #1063
+
 BREAKING CHANGES:

+* [CHANGE] Set `--es.uri` by default to empty string #1063
+
 The flag `--es.data_stream` has been renamed to `--collector.data-stream`.
 The flag `--es.ilm` has been renamed to `--collector.ilm`.

README.md

Lines changed: 63 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,7 @@ elasticsearch_exporter --help
5555
| Argument | Introduced in Version | Description | Default |
5656
| ----------------------- | --------------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------- |
5757
| collector.clustersettings| 1.6.0 | If true, query stats for cluster settings (As of v1.6.0, this flag has replaced "es.cluster_settings"). | false |
58-
| es.uri | 1.0.2 | Address (host and port) of the Elasticsearch node we should connect to. This could be a local node (`localhost:9200`, for instance), or the address of a remote Elasticsearch server. When basic auth is needed, specify as: `<proto>://<user>:<password>@<host>:<port>`. E.G., `http://admin:pass@localhost:9200`. Special characters in the user credentials need to be URL-encoded. | <http://localhost:9200> |
58+
| es.uri | 1.0.2 | Address (host and port) of the Elasticsearch node we should connect to **when running in single-target mode**. Leave empty (the default) when you want to run the exporter only as a multi-target `/probe` endpoint. When basic auth is needed, specify as: `<proto>://<user>:<password>@<host>:<port>`. E.G., `http://admin:pass@localhost:9200`. Special characters in the user credentials need to be URL-encoded. | "" |
5959
| es.all | 1.0.2 | If true, query stats for all nodes in the cluster, rather than just the node we connect to. | false |
6060
| es.indices | 1.0.2 | If true, query stats for all indices in the cluster. | false |
6161
| es.indices_settings | 1.0.4rc1 | If true, query settings stats for all indices in the cluster. | false |
@@ -77,6 +77,7 @@ elasticsearch_exporter --help
7777
| web.telemetry-path | 1.0.2 | Path under which to expose metrics. | /metrics |
7878
| aws.region | 1.5.0 | Region for AWS elasticsearch | |
7979
| aws.role-arn | 1.6.0 | Role ARN of an IAM role to assume. | |
80+
| config.file | 2.0.0 | Path to a YAML configuration file that defines `auth_modules:` used by the `/probe` multi-target endpoint. Leave unset when not using multi-target mode. | |
8081
| version | 1.0.2 | Show version info on stdout and exit. | |
8182

8283
Commandline parameters start with a single `-` for versions less than `1.1.0rc1`.
@@ -113,6 +114,67 @@ Further Information
113114
- [Defining Roles](https://www.elastic.co/guide/en/elastic-stack-overview/7.3/defining-roles.html)
114115
- [Privileges](https://www.elastic.co/guide/en/elastic-stack-overview/7.3/security-privileges.html)
115116

117+
### Multi-Target Scraping (beta)
118+
119+
From v2.X the exporter exposes `/probe` allowing one running instance to scrape many clusters.
120+
121+
Supported `auth_module` types:
122+
123+
| type | YAML fields | Injected into request |
124+
| ---------- | ----------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
125+
| `userpass` | `userpass.username`, `userpass.password`, optional `options:` map | Sets HTTP basic-auth header, appends `options` as query parameters |
126+
| `apikey` | `apikey:` Base64 API-Key string, optional `options:` map | Adds `Authorization: ApiKey …` header, appends `options` |
127+
| `aws` | `aws.region`, optional `aws.role_arn`, optional `options:` map | Uses AWS SigV4 signing transport for HTTP(S) requests, appends `options` |
128+
| `tls` | `tls.ca_file`, `tls.cert_file`, `tls.key_file` | Uses client certificate authentication via TLS; cannot be mixed with other auth types |
129+
130+
Example config:
131+
132+
```yaml
133+
# exporter-config.yml
134+
auth_modules:
135+
prod_basic:
136+
type: userpass
137+
userpass:
138+
username: metrics
139+
password: s3cr3t
140+
141+
staging_key:
142+
type: apikey
143+
apikey: "bXk6YXBpa2V5Ig==" # base64 id:key
144+
options:
145+
sslmode: disable
146+
```
147+
148+
Run exporter:
149+
150+
```bash
151+
./elasticsearch_exporter --config.file=exporter-config.yml
152+
```
153+
154+
Prometheus scrape_config:
155+
156+
```yaml
157+
- job_name: es
158+
metrics_path: /probe
159+
params:
160+
auth_module: [staging_key]
161+
static_configs:
162+
- targets: ["https://es-stage:9200"]
163+
relabel_configs:
164+
- source_labels: [__address__]
165+
target_label: __param_target
166+
- source_labels: [__param_target]
167+
target_label: instance
168+
- target_label: __address__
169+
replacement: exporter:9114
170+
```
171+
172+
Notes:
173+
- `/metrics` serves a single, process-wide registry and is intended for single-target mode.
174+
- `/probe` creates a fresh registry per scrape for the given `target` allowing multi-target scraping.
175+
- Any `options:` under an auth module will be appended as URL query parameters to the target URL.
176+
- The `tls` auth module (client certificate authentication) is intended for self‑managed Elasticsearch/OpenSearch deployments. Amazon OpenSearch Service typically authenticates at the domain edge with IAM/SigV4 and does not support client certificate authentication; use the `aws` auth module instead when scraping Amazon OpenSearch Service domains.
177+
116178
### Metrics
117179

118180
| Name | Type | Cardinality | Help |
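
The README notes say that `options:` are appended to the target URL as query parameters, and the commit message mentions http/https-only target validation. The probe handler itself is not part of this excerpt, so the following is only a minimal sketch of how those two steps could be combined. `buildTargetURL` is a hypothetical helper name, not a function from the exporter.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildTargetURL is a hypothetical helper (not taken from the exporter).
// It parses the probe target, accepts only http/https schemes, and
// appends the auth module's options as URL query parameters.
func buildTargetURL(target string, options map[string]string) (*url.URL, error) {
	u, err := url.Parse(target)
	if err != nil {
		return nil, fmt.Errorf("invalid target %q: %w", target, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported target scheme %q (want http or https)", u.Scheme)
	}
	if u.Host == "" {
		return nil, fmt.Errorf("target %q has no host", target)
	}
	q := u.Query()
	for k, v := range options {
		q.Set(k, v)
	}
	u.RawQuery = q.Encode()
	return u, nil
}

func main() {
	// Values mirror the README example; the options map is illustrative.
	u, err := buildTargetURL("https://es-stage:9200", map[string]string{"sslmode": "disable"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u.String())
}
```

Rejecting anything other than `http` and `https` up front keeps a caller-supplied `target` from reaching schemes the HTTP client was never meant to handle.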

collector/indices.go

Lines changed: 20 additions & 4 deletions
@@ -19,6 +19,7 @@ import (
 	"log/slog"
 	"net/http"
 	"net/url"
+	"path"
 	"sort"
 	"strconv"

@@ -620,13 +621,28 @@ func (i *Indices) fetchAndDecodeIndexStats(ctx context.Context) (indexStatsRespo
 	return isr, nil
 }

-// getCluserName returns the name of the cluster from the clusterinfo
-// if the clusterinfo is nil, it returns "unknown_cluster"
-// TODO(@sysadmind): this should be removed once we have a better way to handle clusterinfo
+// getClusterName returns the cluster name. If no clusterinfo retriever is
+// attached (e.g. /probe mode) it performs a lightweight call to the root
+// endpoint once and caches the result.
 func (i *Indices) getClusterName() string {
-	if i.lastClusterInfo != nil {
+	if i.lastClusterInfo != nil && i.lastClusterInfo.ClusterName != "unknown_cluster" {
 		return i.lastClusterInfo.ClusterName
 	}
+	u := *i.url
+	u.Path = path.Join(u.Path, "/")
+	resp, err := i.client.Get(u.String())
+	if err == nil {
+		defer resp.Body.Close()
+		if resp.StatusCode == http.StatusOK {
+			var root struct {
+				ClusterName string `json:"cluster_name"`
+			}
+			if err := json.NewDecoder(resp.Body).Decode(&root); err == nil && root.ClusterName != "" {
+				i.lastClusterInfo = &clusterinfo.Response{ClusterName: root.ClusterName}
+				return root.ClusterName
+			}
+		}
+	}
 	return "unknown_cluster"
 }

collector/shards.go

Lines changed: 33 additions & 5 deletions
@@ -64,23 +64,50 @@ type nodeShardMetric struct {
 	Labels labels
 }

+// fetchClusterNameOnce performs a single request to the root endpoint to obtain the cluster name.
+func fetchClusterNameOnce(s *Shards) string {
+	if s.lastClusterInfo != nil && s.lastClusterInfo.ClusterName != "unknown_cluster" {
+		return s.lastClusterInfo.ClusterName
+	}
+	u := *s.url
+	u.Path = path.Join(u.Path, "/")
+	resp, err := s.client.Get(u.String())
+	if err == nil {
+		defer resp.Body.Close()
+		if resp.StatusCode == http.StatusOK {
+			var root struct {
+				ClusterName string `json:"cluster_name"`
+			}
+			if err := json.NewDecoder(resp.Body).Decode(&root); err == nil && root.ClusterName != "" {
+				s.lastClusterInfo = &clusterinfo.Response{ClusterName: root.ClusterName}
+				return root.ClusterName
+			}
+		}
+	}
+	return "unknown_cluster"
+}
+
 // NewShards defines Shards Prometheus metrics
 func NewShards(logger *slog.Logger, client *http.Client, url *url.URL) *Shards {
+	var shardPtr *Shards
 	nodeLabels := labels{
 		keys: func(...string) []string {
 			return []string{"node", "cluster"}
 		},
-		values: func(lastClusterinfo *clusterinfo.Response, s ...string) []string {
+		values: func(lastClusterinfo *clusterinfo.Response, base ...string) []string {
 			if lastClusterinfo != nil {
-				return append(s, lastClusterinfo.ClusterName)
+				return append(base, lastClusterinfo.ClusterName)
 			}
-			// this shouldn't happen, as the clusterinfo Retriever has a blocking
-			// Run method. It blocks until the first clusterinfo call has succeeded
-			return append(s, "unknown_cluster")
+			if shardPtr != nil {
+				return append(base, fetchClusterNameOnce(shardPtr))
+			}
+			return append(base, "unknown_cluster")
 		},
 	}

 	shards := &Shards{
+		// will assign later
+
 		logger: logger,
 		client: client,
 		url:    url,
@@ -123,6 +150,7 @@ func NewShards(logger *slog.Logger, client *http.Client, url *url.URL) *Shards {
 		logger.Debug("exiting cluster info receive loop")
 	}()

+	shardPtr = shards
 	return shards
 }

config/config.go

Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
+// Copyright The Prometheus Authors
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package config
+
+import (
+	"fmt"
+	"os"
+	"strings"
+
+	"go.yaml.in/yaml/v3"
+)
+
+// Config represents the YAML configuration file structure.
+type Config struct {
+	AuthModules map[string]AuthModule `yaml:"auth_modules"`
+}
+
+type AuthModule struct {
+	Type     string            `yaml:"type"`
+	UserPass *UserPassConfig   `yaml:"userpass,omitempty"`
+	APIKey   string            `yaml:"apikey,omitempty"`
+	AWS      *AWSConfig        `yaml:"aws,omitempty"`
+	TLS      *TLSConfig        `yaml:"tls,omitempty"`
+	Options  map[string]string `yaml:"options,omitempty"`
+}
+
+// AWSConfig contains settings for SigV4 authentication.
+type AWSConfig struct {
+	Region  string `yaml:"region,omitempty"`
+	RoleARN string `yaml:"role_arn,omitempty"`
+}
+
+// TLSConfig allows per-target TLS options.
+type TLSConfig struct {
+	CAFile             string `yaml:"ca_file,omitempty"`
+	CertFile           string `yaml:"cert_file,omitempty"`
+	KeyFile            string `yaml:"key_file,omitempty"`
+	InsecureSkipVerify bool   `yaml:"insecure_skip_verify,omitempty"`
+}
+
+type UserPassConfig struct {
+	Username string `yaml:"username"`
+	Password string `yaml:"password"`
+}
+
+// validate ensures every auth module has the required fields according to its type.
+func (c *Config) validate() error {
+	for name, am := range c.AuthModules {
+		// Validate fields based on auth type
+		switch strings.ToLower(am.Type) {
+		case "userpass":
+			if am.UserPass == nil || am.UserPass.Username == "" || am.UserPass.Password == "" {
+				return fmt.Errorf("auth_module %s type userpass requires username and password", name)
+			}
+		case "apikey":
+			if am.APIKey == "" {
+				return fmt.Errorf("auth_module %s type apikey requires apikey", name)
+			}
+		case "aws":
+			// No strict validation: region can come from environment/defaults; role_arn is optional.
+		case "tls":
+			// TLS auth type means client certificate authentication only (no other auth)
+			if am.TLS == nil {
+				return fmt.Errorf("auth_module %s type tls requires tls configuration section", name)
+			}
+			if am.TLS.CertFile == "" || am.TLS.KeyFile == "" {
+				return fmt.Errorf("auth_module %s type tls requires cert_file and key_file for client certificate authentication", name)
+			}
+			// Validate that other auth fields are not set when using TLS auth type
+			if am.UserPass != nil {
+				return fmt.Errorf("auth_module %s type tls cannot have userpass configuration", name)
+			}
+			if am.APIKey != "" {
+				return fmt.Errorf("auth_module %s type tls cannot have apikey", name)
+			}
+			if am.AWS != nil {
+				return fmt.Errorf("auth_module %s type tls cannot have aws configuration", name)
+			}
+		default:
+			return fmt.Errorf("auth_module %s has unsupported type %s", name, am.Type)
+		}
+
+		// Validate TLS configuration (optional for all auth types, provides transport security)
+		if am.TLS != nil {
+			// For cert-based auth (type: tls), cert and key are required
+			// For other auth types, TLS config is optional and used for transport security
+			if strings.ToLower(am.Type) != "tls" {
+				// For non-TLS auth types, if cert/key are provided, both must be present
+				if (am.TLS.CertFile != "") != (am.TLS.KeyFile != "") {
+					return fmt.Errorf("auth_module %s: if providing client certificate, both cert_file and key_file must be specified", name)
+				}
+			}
+
+			// Validate file accessibility
+			for fileType, path := range map[string]string{
+				"ca_file":   am.TLS.CAFile,
+				"cert_file": am.TLS.CertFile,
+				"key_file":  am.TLS.KeyFile,
+			} {
+				if path == "" {
+					continue
+				}
+				if _, err := os.Stat(path); err != nil {
+					return fmt.Errorf("auth_module %s: %s '%s' not accessible: %w", name, fileType, path, err)
+				}
+			}
+		}
+	}
+	return nil
+}
+
+// LoadConfig reads, parses, and validates the YAML config file.
+func LoadConfig(path string) (*Config, error) {
+	data, err := os.ReadFile(path)
+	if err != nil {
+		return nil, err
+	}
+	var cfg Config
+	if err := yaml.Unmarshal(data, &cfg); err != nil {
+		return nil, err
+	}
+	if err := cfg.validate(); err != nil {
+		return nil, err
+	}
+	return &cfg, nil
+}
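
The per-type rules in `validate` can be exercised without a YAML file or certificate files on disk. The following is a simplified, flattened restatement for illustration only. The `module` struct and `validateModule` are hypothetical names, and the mutual-exclusion and file-accessibility checks from the real function are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// module is a flattened, hypothetical stand-in for config.AuthModule.
type module struct {
	Type, Username, Password, APIKey, CertFile, KeyFile string
}

// validateModule mirrors the shape of the per-type checks in validate():
// required fields per type, plus the rule that a client certificate and
// key must be supplied together for non-tls types.
func validateModule(name string, m module) error {
	switch strings.ToLower(m.Type) {
	case "userpass":
		if m.Username == "" || m.Password == "" {
			return fmt.Errorf("auth_module %s type userpass requires username and password", name)
		}
	case "apikey":
		if m.APIKey == "" {
			return fmt.Errorf("auth_module %s type apikey requires apikey", name)
		}
	case "aws":
		// Region may come from the environment, so nothing is required here.
	case "tls":
		if m.CertFile == "" || m.KeyFile == "" {
			return fmt.Errorf("auth_module %s type tls requires cert_file and key_file", name)
		}
	default:
		return fmt.Errorf("auth_module %s has unsupported type %s", name, m.Type)
	}
	// (a != "") != (b != "") is true when exactly one of the two is set.
	if strings.ToLower(m.Type) != "tls" && (m.CertFile != "") != (m.KeyFile != "") {
		return fmt.Errorf("auth_module %s: both cert_file and key_file must be specified", name)
	}
	return nil
}

func main() {
	// Example inputs are illustrative; names follow the README sample.
	fmt.Println(validateModule("prod_basic", module{Type: "userpass", Username: "metrics", Password: "s3cr3t"}))
	fmt.Println(validateModule("staging_key", module{Type: "apikey"}))
	fmt.Println(validateModule("half_tls", module{Type: "aws", CertFile: "client.crt"}))
}
```

Running these checks at load time, as `LoadConfig` does, surfaces a misconfigured module at startup instead of on the first probe that uses it.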
