
Fixed postgresql>=10 secondary server lag always 0, SuperQ proposed a… #977


Open: 1 commit to merge into base: master

ARPABoy commented Nov 29, 2023

Fixed the secondary server lag always being 0 on postgresql>=10. SuperQ proposed a cleaner code solution :). pg_replication_test was modified to test pgReplicationQueryBeforeVersion10 or pgReplicationQueryAfterVersion10, depending on the PostgreSQL version.

Signed-off-by: kr0m <kr0m@Garrus.alfaexploit.com>

SuperQ (Contributor) left a comment

LGTM, Thanks!

SuperQ (Contributor) commented Nov 29, 2023

To add some more information here: it was discovered that if replication has failed, say because it is blocked by an iptables firewall rule, pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() will remain the same. This masks the lag value.

IMO, we should probably expose separate metrics for each of these things rather than compute it in the exporter.
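
To make the masking concrete, here is a minimal Python sketch that mirrors a CASE expression with an LSN equality check. It is an illustration only, not exporter code: the function name, argument names, and sample values are assumptions, and on a real replica the inputs would come from pg_is_in_recovery(), pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), and pg_last_xact_replay_timestamp().

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def lag_with_lsn_check(in_recovery: bool, receive_lsn: str, replay_lsn: str,
                       last_replay_ts: Optional[datetime], now: datetime) -> float:
    """Hypothetical model of a lag query that returns 0 when both LSNs match."""
    if not in_recovery:
        return 0.0  # a primary has no replay lag
    if receive_lsn == replay_lsn:
        return 0.0  # everything received has been replayed
    if last_replay_ts is None:
        return 0.0  # PostgreSQL's GREATEST ignores NULL arguments
    return max(0.0, (now - last_replay_ts).total_seconds())


now = datetime(2023, 11, 29, 12, 0, tzinfo=timezone.utc)
stale = now - timedelta(hours=1)

# Replication blocked: nothing new arrives, so both LSNs are frozen and equal.
print(lag_with_lsn_check(True, "0/3000060", "0/3000060", stale, now))  # 0.0

# WAL received but not yet replayed: the age of the last replay is reported.
print(lag_with_lsn_check(True, "0/3000100", "0/3000060", stale, now))  # 3600.0
```

In the first call the replica is an hour behind whatever the primary has done since the link broke, yet the reported lag is 0, which is the masking described above.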

SuperQ requested a review from sysadmind November 29, 2023 10:04

SuperQ (Contributor) commented Nov 29, 2023

This is a fix to the bug introduced by #895.

CC @IamLuksha.

IamLuksha (Contributor) commented

Maybe I'm wrong, and I have no way to check and test this right now, but in versions above 9.3 pg_last_xact_replay_timestamp does not work correctly, and the proposed commit uses it for versions above 10.

The case of a blocked channel must be considered separately: check the connection between the devices.

SuperQ (Contributor) commented Nov 29, 2023

@IamLuksha Are there recommended metrics/alerts for monitoring replication? Server side? Client side?

I think I am beginning to understand some of how this is being done. I'm going to propose an additional metric that will help with this.

Maybe we should add some example alerts to the mixin.

IamLuksha (Contributor) commented Nov 29, 2023

@SuperQ

Are there recommended metrics/alerts for monitoring replication? Server side? Client side?

Maybe reply_time. I need some time to test it, maybe next week. I don't have a working cluster right now.

An additional metric is a great idea, because it will clearly indicate the problem, whereas this commit will bring back issue #895.

ARPABoy (Author) commented Nov 30, 2023

Are there recommended metrics/alerts for monitoring replication? Server-side? Client-side?
If you want to monitor lag from the Primary server, I think it would be better to monitor replay_lag, but if there is more than one synchronized Secondary server you have to know the IP/FQDN of each Secondary to assign an identifying tag to the metric.

You can detect broken synchronization on a Secondary server if you know its IP/FQDN:
SELECT COUNT(*) FROM pg_stat_replication WHERE client_addr='SLAVE_IP' AND state = 'streaming';

And you can get the lag using replay_lag this way:
SELECT COALESCE(EXTRACT(EPOCH FROM replay_lag)::bigint, 0) AS replay_lag FROM pg_stat_replication WHERE client_addr='SLAVE_IP';

Anyway, with the current Secondary metric monitoring, we can monitor lag if we apply the proposed patch. There are two queries, one for PostgreSQL>=10 and another for PostgreSQL<10.

The old versions will execute:
pgReplicationQueryBeforeVersion10 = SELECT
    CASE
        WHEN NOT pg_is_in_recovery() THEN 0
        WHEN pg_last_wal_receive_lsn () = pg_last_wal_replay_lsn () THEN 0
        ELSE GREATEST (0, EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())))
    END AS lag,
    CASE
        WHEN pg_is_in_recovery() THEN 1
        ELSE 0
    END as is_replica

And the new versions will execute:
pgReplicationQueryAfterVersion10 = SELECT
    CASE
        WHEN NOT pg_is_in_recovery() THEN 0
        ELSE GREATEST (0, EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())))
    END AS lag,
    CASE
        WHEN pg_is_in_recovery() THEN 1
        ELSE 0
    END as is_replica

I think that by monitoring the pg_up (Primary/Secondary) and pg_replication_lag_seconds (Secondary) metrics we have all the troubleshooting cases covered.

IamLuksha (Contributor) commented

Anyway, with the current Secondary metric monitoring, we can monitor lag if we apply the proposed patch. There are two queries, one for PostgreSQL>=10 and another for PostgreSQL<10.

This method does not work and was abandoned earlier.
If there are no changes in the database for a long time, which is normal, we will get a replication error.
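
The objection can be reproduced without a database. The Python sketch below models a lag expression that uses only the age of the last replayed transaction, with no LSN comparison. The function name and the sample timestamps are assumptions chosen for illustration, not code from the exporter.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def lag_timestamp_only(in_recovery: bool, last_replay_ts: Optional[datetime],
                       now: datetime) -> float:
    """Hypothetical model: lag is the age of the last replayed commit."""
    if not in_recovery:
        return 0.0  # a primary has no replay lag
    if last_replay_ts is None:
        return 0.0  # nothing replayed yet, e.g. an empty database
    return max(0.0, (now - last_replay_ts).total_seconds())


now = datetime(2023, 11, 30, 12, 0, tzinfo=timezone.utc)

# Healthy replica, but the primary has committed nothing for six hours.
last_commit_replayed = now - timedelta(hours=6)
print(lag_timestamp_only(True, last_commit_replayed, now))  # 21600.0
```

The replica here is fully caught up, yet the value keeps growing for as long as the primary stays idle, so any alert with a fixed threshold on it would eventually fire on a quiet but healthy system.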

ARPABoy (Author) commented Nov 30, 2023

@IamLuksha This method does not work in postgresql>=10 or postgresql<10 or both?

IamLuksha (Contributor) commented

@IamLuksha This method does not work in postgresql>=10 or postgresql<10 or both?

both

    WHEN NOT pg_is_in_recovery() THEN 0
    WHEN pg_last_wal_receive_lsn () = pg_last_wal_replay_lsn () THEN 0
    ELSE GREATEST (0, EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())))

This code is for versions > 9.3, but you propose to use it for <10.

And your 'postgresql>=10' query will give a replication error with an empty database or one that is rarely used.
In the end, having solved your problem, you bring back the old ones.

In your case you need to check the connection between the servers

ARPABoy (Author) commented Nov 30, 2023

@IamLuksha Correct me if I am wrong.

There can be two kinds of lag:

  • Local Secondary server lag: the information was received by the Secondary server and written to disk, but it has not yet been replayed into the DB.
  • Network lag between the Primary and Secondary servers.

https://www.postgresql.org/docs/current/functions-admin.html

For Local lag we can check:
pg_last_wal_receive_lsn (): Returns the last write-ahead log location that has been received and synced to disk by streaming replication.
pg_last_wal_replay_lsn (): Returns the last write-ahead log location that has been replayed during recovery.

For Network lag:
pg_last_xact_replay_timestamp (): Returns the time stamp of the last transaction replayed during recovery. This is the time at which the commit or abort WAL record for that transaction was generated on the primary.

The problem with using pg_last_xact_replay_timestamp is that its value stays the same when there is no activity on the Primary server.

Have I understood the whole problem correctly?

IamLuksha (Contributor) commented

Have I understood the whole problem correctly?

Yes!

So we need to check pg_last_wal_receive_lsn () = pg_last_wal_replay_lsn ()
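
Beyond an equality test, the two LSNs can also be subtracted to see how many bytes of received WAL are still waiting to be replayed. PostgreSQL can do this server-side with pg_wal_lsn_diff(); the Python sketch below does the same arithmetic client-side, relying on the textual LSN format of two hexadecimal numbers separated by a slash (high and low 32 bits). The helper names and sample LSNs are assumptions for illustration.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_backlog_bytes(receive_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL that were received but not yet replayed (hypothetical helper)."""
    return parse_lsn(receive_lsn) - parse_lsn(replay_lsn)


print(replay_backlog_bytes("0/3000060", "0/3000060"))  # 0
print(replay_backlog_bytes("0/3000100", "0/3000060"))  # 160
```

A backlog of 0 only says the Secondary has replayed everything it received. It says nothing about WAL that never arrived, which is why the connection has to be checked separately.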

ARPABoy (Author) commented Nov 30, 2023

When you say: In your case you need to check the connection between the servers

Check the connection using node_exporter or any other external monitoring system? Or querying some PostgreSQL data?

IamLuksha (Contributor) commented

Check the connection using node_exporter or any other external monitoring system? Or querying some PostgreSQL data?

Any of these methods can be used.

I don't remember if Postgres has a method to check the connection.
I won't be able to check until next week.

ARPABoy (Author) commented Dec 4, 2023

Hello @IamLuksha, what do you think about monitoring lag and availability from the Primary server?

You can detect broken synchronization if you know the Secondary's IP/FQDN:
SELECT COUNT(*) FROM pg_stat_replication WHERE client_addr='SECONDARY_IP' AND state = 'streaming';

And you can get the lag using replay_lag this way:
SELECT COALESCE(EXTRACT(EPOCH FROM replay_lag)::bigint, 0) AS replay_lag FROM pg_stat_replication WHERE client_addr='SECONDARY_IP';

The only problem I can see with this approach is that when the Secondary server is unreachable from the Primary, pg_stat_replication returns no rows for it, so it simply disappears. The only solution I have thought of is saving the previously seen Secondary servers in a list file and triggering an alarm if one of them disappears.

Do you think it's a worthwhile approach? Any suggestions or alternative solutions?
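
The list-file idea could look roughly like the following Python sketch. It is only a sketch under stated assumptions: the JSON state file, the function names, and the sample addresses are hypothetical, and the list of currently streaming addresses is assumed to come from a query against pg_stat_replication on the Primary.

```python
import json
import os
import tempfile
from typing import Iterable, Set


def load_known(path: str) -> Set[str]:
    """Return the Secondaries seen on earlier checks (empty on the first run)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as fh:
        return set(json.load(fh))


def save_known(path: str, addrs: Iterable[str]) -> None:
    """Persist the set of known Secondary addresses as a sorted JSON list."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(sorted(addrs), fh)


def check_secondaries(path: str, streaming_now: Iterable[str]) -> Set[str]:
    """Return previously seen Secondaries that are absent from this check."""
    current = set(streaming_now)
    known = load_known(path)
    missing = known - current
    # Keep every address ever seen, so a vanished Secondary keeps being
    # reported until someone removes it from the file.
    save_known(path, known | current)
    return missing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        state = os.path.join(tmp, "known_secondaries.json")
        print(check_secondaries(state, ["10.0.0.2", "10.0.0.3"]))  # set()
        print(check_secondaries(state, ["10.0.0.2"]))  # {'10.0.0.3'}
```

One consequence of this design is that decommissioning a Secondary requires editing the state file, otherwise it would be reported as missing forever.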

bitfehler added a commit to bitfehler/postgres_exporter that referenced this pull request Nov 6, 2024
The exported replication lag does not handle all failure modes, and can
report 0 for replicas that are out of sync and incapable of recovery.

A proper replacement for that metric would require a different approach
(see e.g. prometheus-community#1007), but for a lot of folks, simply exporting the age of
the last replay can provide a pretty strong signal for something being
amiss.

I think this solution might be preferable to prometheus-community#977, though the lag
metric needs to be fixed or abandoned eventually.

Signed-off-by: Conrad Hoffmann <ch@bitfehler.net>
bitfehler added a commit to bitfehler/postgres_exporter that referenced this pull request Nov 12, 2024
The exported replication lag does not handle all failure modes, and can
report 0 for replicas that are out of sync and incapable of recovery.

A proper replacement for that metric would require a different approach
(see e.g. prometheus-community#1007), but for a lot of folks, simply exporting the age of
the last replay can provide a pretty strong signal for something being
amiss.

I think this solution might be preferable to prometheus-community#977, though the lag
metric needs to be fixed or abandoned eventually.

Signed-off-by: Conrad Hoffmann <ch@bitfehler.net>
sysadmind pushed a commit that referenced this pull request Feb 15, 2025
The exported replication lag does not handle all failure modes, and can
report 0 for replicas that are out of sync and incapable of recovery.

A proper replacement for that metric would require a different approach
(see e.g. #1007), but for a lot of folks, simply exporting the age of
the last replay can provide a pretty strong signal for something being
amiss.

I think this solution might be preferable to #977, though the lag
metric needs to be fixed or abandoned eventually.

Signed-off-by: Conrad Hoffmann <ch@bitfehler.net>
Sticksman added a commit to Sticksman/postgres_exporter that referenced this pull request Apr 24, 2025
* Update common Prometheus files (prometheus-community#913)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Add changelog for v0.14 (prometheus-community#906)

* Add changelog for v0.14

- Add changelog entries since v0.13.2
- Update README with new options
- Bump version file

Signed-off-by: Joe Adams <github@joeadams.io>

* Add changelog entry for prometheus-community#904

Signed-off-by: Joe Adams <github@joeadams.io>

---------

Signed-off-by: Joe Adams <github@joeadams.io>

* Adds 1kB and 2kB units (prometheus-community#915)

Signed-off-by: Eric tyrrell <eric.tyrrell18+github@gmail.com>

* Add error log when probe collector creation fails (prometheus-community#918)

Signed-off-by: Joe Adams <github@joeadams.io>

* Fix test build failures on 32-bit arch again (prometheus-community#919)

Another case of untyped integer overflows on 32-bit arch.

Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>

* Add 32-bit testing to CI (prometheus-community#920)

Run Go tests with 32-bit to validate value overflow.

Signed-off-by: SuperQ <superq@gmail.com>

* Bump github.com/prometheus/client_golang from 1.16.0 to 1.17.0 (prometheus-community#925)

* Bump github.com/prometheus/client_golang from 1.16.0 to 1.17.0

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.16.0 to 1.17.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.16.0...v1.17.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update tests for latest client_golang.

Signed-off-by: SuperQ <superq@gmail.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: SuperQ <superq@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: SuperQ <superq@gmail.com>

* Update common Prometheus files (prometheus-community#926)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Adjust collector to use separate connection per scrape (prometheus-community#931)

Fixes prometheus-community#921

Signed-off-by: Joe Adams <github@joeadams.io>

* Bump golang.org/x/net from 0.10.0 to 0.17.0 (prometheus-community#936)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.10.0 to 0.17.0.
- [Commits](golang/net@v0.10.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Release v0.15.0 (prometheus-community#944)

* [ENHANCEMENT] Add 1kB and 2kB units prometheus-community#915
* [BUGFIX] Add error log when probe collector creation fails prometheus-community#918
* [BUGFIX] Fix test build failures on 32-bit arch prometheus-community#919
* [BUGFIX] Adjust collector to use separate connection per scrape prometheus-community#936

Signed-off-by: SuperQ <superq@gmail.com>

* Update common Prometheus files (prometheus-community#951)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update common Prometheus files (prometheus-community#963)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* pg_replication_slot: add slot type label (prometheus-community#960)

Signed-off-by: Alex Simenduev <shamil.si@gmail.com>

* Bump github.com/prometheus/common from 0.44.0 to 0.45.0 (prometheus-community#948)

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.44.0 to 0.45.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](prometheus/common@v0.44.0...v0.45.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/prometheus/client_model (prometheus-community#949)

Bumps [github.com/prometheus/client_model](https://github.com/prometheus/client_model) from 0.4.1-0.20230718164431-9a2bf3000d16 to 0.5.0.
- [Release notes](https://github.com/prometheus/client_model/releases)
- [Commits](https://github.com/prometheus/client_model/commits/v0.5.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_model
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* pg_stat_database: added support for `active_time` counter (prometheus-community#961)

* feat(pg_stat_database): active time metric

---------

Signed-off-by: Jiri Sveceny <jiri.sveceny@icloud.com>

* Bump golang.org/x/crypto from 0.14.0 to 0.17.0 (prometheus-community#988)

Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.14.0 to 0.17.0.
- [Commits](golang/crypto@v0.14.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/prometheus/client_golang from 1.17.0 to 1.18.0 (prometheus-community#993)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.17.0 to 1.18.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.17.0...v1.18.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* use Info level for excluded databases log message (prometheus-community#1003)

This is the only log message which didn't specify a level in the
postgres_exporter. I am unsure if this log message should be info or
debug, but leaning towards the more important since previously it would
just always log.

The way I validated this was the only non-leveled logger was via grep.
Both of these only returned this callsite previously:

  git grep 'logger\.Log'
  git grep '\.Log(' | grep -v level

Signed-off-by: Keegan Carruthers-Smith <keegan.csmith@gmail.com>

* Add connection limits metrics for pg_roles and pg_database (prometheus-community#997)

* Add database connection limits metrics

Signed-off-by: Jocelyn Thode <jocelyn@thode.email>

* Add roles connection limits metrics

Signed-off-by: Jocelyn Thode <jocelyn@thode.email>

* Fix copyright year

Co-authored-by: Joe Adams <github@joeadams.io>
Signed-off-by: Jocelyn Thode <jocelynthode@users.noreply.github.com>

* Fix spacing in pgDatabaseQuery

Co-authored-by: Joe Adams <github@joeadams.io>
Signed-off-by: Jocelyn Thode <jocelynthode@users.noreply.github.com>

* Fix case on pgRolesConnectionLimitsQuery

Co-authored-by: Joe Adams <github@joeadams.io>
Signed-off-by: Jocelyn Thode <jocelynthode@users.noreply.github.com>

* Do not add roleMetrics when row is not valid

Signed-off-by: Jocelyn Thode <jocelyn@thode.email>

---------

Signed-off-by: Jocelyn Thode <jocelyn@thode.email>
Signed-off-by: Jocelyn Thode <jocelynthode@users.noreply.github.com>
Co-authored-by: Joe Adams <github@joeadams.io>

* Bump github.com/DATA-DOG/go-sqlmock from 1.5.0 to 1.5.2 (prometheus-community#1000)

Bumps [github.com/DATA-DOG/go-sqlmock](https://github.com/DATA-DOG/go-sqlmock) from 1.5.0 to 1.5.2.
- [Release notes](https://github.com/DATA-DOG/go-sqlmock/releases)
- [Commits](DATA-DOG/go-sqlmock@v1.5.0...v1.5.2)

---
updated-dependencies:
- dependency-name: github.com/DATA-DOG/go-sqlmock
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/prometheus/client_golang from 1.18.0 to 1.19.0 (prometheus-community#1011)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.18.0 to 1.19.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/v1.19.0/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.18.0...v1.19.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/prometheus/client_model from 0.5.0 to 0.6.0 (prometheus-community#1010)

Bumps [github.com/prometheus/client_model](https://github.com/prometheus/client_model) from 0.5.0 to 0.6.0.
- [Release notes](https://github.com/prometheus/client_model/releases)
- [Commits](prometheus/client_model@v0.5.0...v0.6.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_model
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump google.golang.org/protobuf from 1.32.0 to 1.33.0 (prometheus-community#1014)

Bumps google.golang.org/protobuf from 1.32.0 to 1.33.0.

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/prometheus/exporter-toolkit from 0.10.0 to 0.11.0 (prometheus-community#992)

Bumps [github.com/prometheus/exporter-toolkit](https://github.com/prometheus/exporter-toolkit) from 0.10.0 to 0.11.0.
- [Release notes](https://github.com/prometheus/exporter-toolkit/releases)
- [Changelog](https://github.com/prometheus/exporter-toolkit/blob/master/CHANGELOG.md)
- [Commits](prometheus/exporter-toolkit@v0.10.0...v0.11.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/exporter-toolkit
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump golang.org/x/net from 0.20.0 to 0.23.0 (prometheus-community#1021)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.20.0 to 0.23.0.
- [Commits](golang/net@v0.20.0...v0.23.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat: Add safe_wal_size and wal_status to replication_slot  (prometheus-community#1027)

* feat: Add safe_wal_size to replication_slot

Signed-off-by: MarcWort <113890636+MarcWort@users.noreply.github.com>

* feat: Add wal_status to replication_slot

Signed-off-by: MarcWort <113890636+MarcWort@users.noreply.github.com>

---------

Signed-off-by: MarcWort <113890636+MarcWort@users.noreply.github.com>

* fix: Only query active_time on pg>=14 (prometheus-community#1045)

Signed-off-by: MarcWort <113890636+MarcWort@users.noreply.github.com>

* Update README.md (prometheus-community#1038)

Better example for the quick start with prometheus config and avoiding deprecated env variables.

Signed-off-by: fhackenberger <florian@hackenberger.at>

* stop logging errors on replicas, fixes prometheus-community#547 (prometheus-community#1048)

Signed-off-by: Steffen Zieger <github@saz.sh>

* Update common Prometheus files (prometheus-community#983)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update common Prometheus files (prometheus-community#1076)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* chore!: adopt log/slog, drop go-kit/log (prometheus-community#1073)

* ci: update go to version 1.23

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

* build(deps): bump prometheus/{client_golang,common,exporter-toolkit}

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

* chore!: adopt log/slog, drop go-kit/log

The bulk of this change set was automated by the following script which
is being used to aid in converting the various exporters/projects to use
slog:

https://gist.github.com/tjhop/49f96fb7ebbe55b12deee0b0312d8434

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

---------

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>

* Bump github.com/prometheus/client_golang from 1.20.4 to 1.20.5 (prometheus-community#1079)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.20.4 to 1.20.5.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.20.4...v1.20.5)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/prometheus/common from 0.60.0 to 0.60.1 (prometheus-community#1080)

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.60.0 to 0.60.1.
- [Release notes](https://github.com/prometheus/common/releases)
- [Changelog](https://github.com/prometheus/common/blob/main/RELEASE.md)
- [Commits](prometheus/common@v0.60.0...v0.60.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Make walreceiver collector useful w/o repmgr (prometheus-community#1086)

In a streaming replication setup that was created without replication
manager (`repmgr`), the `stat_wal_receiver` collector does not return
any metrics, because one value it wants to export is not present.

This is rather overly opinionated. The missing metric is comparatively
uninteresting and does not justify discarding all the others. And
replication setups created without `repmgr` are not exactly rare.

This commit makes the one relevant metric optional and simply skips it
if the respective value cannot be determined.

Signed-off-by: Conrad Hoffmann <ch@bitfehler.net>

* Update common Prometheus files (prometheus-community#1083)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Bump github.com/prometheus/exporter-toolkit from 0.13.0 to 0.13.1 (prometheus-community#1081)

Bumps [github.com/prometheus/exporter-toolkit](https://github.com/prometheus/exporter-toolkit) from 0.13.0 to 0.13.1.
- [Release notes](https://github.com/prometheus/exporter-toolkit/releases)
- [Changelog](https://github.com/prometheus/exporter-toolkit/blob/master/CHANGELOG.md)
- [Commits](prometheus/exporter-toolkit@v0.13.0...v0.13.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/exporter-toolkit
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update common Prometheus files (prometheus-community#1087)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update changelog and version for a v0.16.0 release (prometheus-community#1088)

Signed-off-by: Joe Adams <github@joeadams.io>

* Fix version header in changelog (prometheus-community#1089)

Signed-off-by: Joe Adams <github@joeadams.io>

* Update pg_long_running_transactions.go (prometheus-community#1092)

To extract time in seconds for pg_long_running_transactions_oldest_timestamp_seconds query which currently return epoch time.

Signed-off-by: Jyothi Kiran Thammana <147131742+jyothikirant-sayukth@users.noreply.github.com>

* Fix to replace dashes with underscore in the metric names (prometheus-community#1103)

* Fix to replace dashes with underscore in the metric names

Signed-off-by: aagarwalla-fx <arpit.agarwalla@falconx.io>

* Code style fix

Signed-off-by: aagarwalla-fx <arpit.agarwalla@falconx.io>

---------

Signed-off-by: aagarwalla-fx <arpit.agarwalla@falconx.io>

* Bump github.com/prometheus/exporter-toolkit from 0.13.1 to 0.13.2 (prometheus-community#1108)

Bumps [github.com/prometheus/exporter-toolkit](https://github.com/prometheus/exporter-toolkit) from 0.13.1 to 0.13.2.
- [Release notes](https://github.com/prometheus/exporter-toolkit/releases)
- [Changelog](https://github.com/prometheus/exporter-toolkit/blob/master/CHANGELOG.md)
- [Commits](prometheus/exporter-toolkit@v0.13.1...v0.13.2)

---
updated-dependencies:
- dependency-name: github.com/prometheus/exporter-toolkit
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Checkpoint related columns in PG 17 have been moved from pg_stat_bgwriter to pg_stat_checkpointer (prometheus-community#1072)

* Checkpoint related columns in PG 17 have been moved from pg_stat_bgwriter to pg_stat_checkpointer

Fix prometheus-community#1060

See: https://www.dbi-services.com/blog/postgresql-17-new-catalog-view-pg_stat_checkpointer/
Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr>

* Add support for pg_stat_checkpointer

See: https://www.dbi-services.com/blog/postgresql-17-new-catalog-view-pg_stat_checkpointer/
Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr>

* Run integration tests with Postgres 17

Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr>

* Update date in file header

Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr>

---------

Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr>

* Update common Prometheus files (prometheus-community#1090)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update common Prometheus files (prometheus-community#1110)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Add Postgres 17 for CI test (prometheus-community#1105)

Signed-off-by: Khiem Doan <doankhiem.crazy@gmail.com>

* Bump github.com/prometheus/common from 0.61.0 to 0.62.0 (prometheus-community#1118)

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.61.0 to 0.62.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Changelog](https://github.com/prometheus/common/blob/main/RELEASE.md)
- [Commits](prometheus/common@v0.61.0...v0.62.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update common Prometheus files (prometheus-community#1124)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* pg_stat_statements PG17 (prometheus-community#1114)

Signed-off-by: Nevermind <79126473+NevermindZ4@users.noreply.github.com>

* fix: handle pg_replication_slots on pg<13 (prometheus-community#1098)

* fix: handle pg_replication_slots on pg<13

Signed-off-by: Michael Todorovic <michael.todorovic@outlook.com>

* fix: tests

Signed-off-by: Michael Todorovic <michael.todorovic@outlook.com>

---------

Signed-off-by: Michael Todorovic <michael.todorovic@outlook.com>

* feat: add wait/backend to pg_stat_activity (prometheus-community#1106)

Signed-off-by: Felipe Galindo Sanchez <felipe.galindo.sanchez@intel.com>

* Export last replay age in replication collector (prometheus-community#1085)

The exported replication lag does not handle all failure modes, and can
report 0 for replicas that are out of sync and incapable of recovery.

A proper replacement for that metric would require a different approach
(see e.g. prometheus-community#1007), but for a lot of folks, simply exporting the age of
the last replay can provide a pretty strong signal for something being
amiss.

I think this solution might be preferable to prometheus-community#977, though the lag
metric needs to be fixed or abandoned eventually.

Signed-off-by: Conrad Hoffmann <ch@bitfehler.net>
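
The failure mode described in the commit above can be illustrated with a small Python model. This is only a sketch, not the exporter's code: the function names, the integer stand-ins for LSNs, and the rule "report 0 when the received and replayed positions match" are assumptions chosen to mirror the behaviour discussed in this thread, where pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() stay equal once replication is cut off.

```python
from datetime import datetime, timedelta, timezone


def lag_seconds(receive_lsn, replay_lsn, now, last_replay):
    """Hypothetical lag rule: 0 whenever everything received has been replayed."""
    if receive_lsn == replay_lsn:
        return 0.0
    return max(0.0, (now - last_replay).total_seconds())


def last_replay_age_seconds(now, last_replay):
    """Age of the last replayed transaction, independent of LSN positions."""
    return max(0.0, (now - last_replay).total_seconds())


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
last_replay = now - timedelta(hours=2)

# Replication is blocked: nothing new arrives, so both positions stay equal.
stalled_lag = lag_seconds(1000, 1000, now, last_replay)
stalled_age = last_replay_age_seconds(now, last_replay)
print(stalled_lag, stalled_age)  # 0.0 7200.0
```

In this model a replica that has been cut off for two hours still reports a lag of 0, while the replay age keeps growing. The age also grows on a healthy replica whose primary is idle, so it is probably best treated as a signal to alert on with a threshold rather than as an exact lag measurement.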

* Bump github.com/prometheus/exporter-toolkit from 0.13.2 to 0.14.0 (prometheus-community#1126)

Bumps [github.com/prometheus/exporter-toolkit](https://github.com/prometheus/exporter-toolkit) from 0.13.2 to 0.14.0.
- [Release notes](https://github.com/prometheus/exporter-toolkit/releases)
- [Changelog](https://github.com/prometheus/exporter-toolkit/blob/master/CHANGELOG.md)
- [Commits](prometheus/exporter-toolkit@v0.13.2...v0.14.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/exporter-toolkit
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix missing dsn sanitization for logging (prometheus-community#1104)

This log line was not sanitized previously, which could result in logging sensitive information. I have scanned the rest of the files and I don't see anywhere else that a DSN is used in a log line without this filter.

Resolves prometheus-community#1042

Signed-off-by: Joe Adams <github@joeadams.io>
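
As an illustration of what sanitizing a DSN before logging can look like, here is a minimal Python sketch. It assumes a URL-style DSN; the function name `sanitize_dsn` and the placeholder text `PASSWORD_REMOVED` are hypothetical and not taken from the exporter's source.

```python
from urllib.parse import urlsplit, urlunsplit


def sanitize_dsn(dsn: str) -> str:
    """Return the DSN with any password replaced by a fixed placeholder."""
    parts = urlsplit(dsn)
    # Split "user:password@host:port" at the last "@" so the host is kept as-is.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if not at:
        return dsn  # no user information present
    user, colon, _password = userinfo.partition(":")
    if not colon:
        return dsn  # user name only, nothing to hide
    netloc = f"{user}:PASSWORD_REMOVED@{hostport}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


safe = sanitize_dsn("postgresql://monitor:s3cret@db.example.com:5432/postgres?sslmode=disable")
print(safe)
```

The sketch only covers URL-style DSNs; key/value connection strings such as `host=... password=...` would need separate handling.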

* Skip pg_stat_checkpointer collector if pg<17 (prometheus-community#1112)

* fix: skip collector if pg<17

Signed-off-by: Michael Todorovic <michael.todorovic@outlook.com>

* fix: better condition

Signed-off-by: Michael Todorovic <michael.todorovic@outlook.com>

* fix: fix PGStatCheckpointerCollector tests

Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr>

---------

Signed-off-by: Michael Todorovic <michael.todorovic@outlook.com>
Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr>
Co-authored-by: Michael Todorovic <michael.todorovic@outlook.com>
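
The version gate described above can be sketched in a few lines of Python. The view pg_stat_checkpointer is new in PostgreSQL 17, so a collector for it has nothing to query on older servers. The function name and the use of the integer `server_version_num` are assumptions for illustration; the exporter itself may compare versions differently.

```python
# server_version_num reported by PostgreSQL 17.0 (major * 10000 + minor).
PG17_VERSION_NUM = 170000


def should_collect_checkpointer(server_version_num: int) -> bool:
    """Collect pg_stat_checkpointer only on servers that provide the view."""
    return server_version_num >= PG17_VERSION_NUM


for version in (130012, 160004, 170000, 170002):
    state = "collect" if should_collect_checkpointer(version) else "skip"
    print(version, state)
```

Skipping the collector on older servers avoids a failed query on every scrape instead of reporting an error for a view that cannot exist there.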

* Prep for v0.17 (prometheus-community#1127)

Signed-off-by: Joe Adams <github@joeadams.io>

* Fix: Handle incoming labels with invalid UTF-8 (prometheus-community#1131)

Incoming labels can contain invalid UTF-8, which results in a panic. This fix sanitizes the label string so that only valid UTF-8 is included, replacing invalid sequences with � (U+FFFD REPLACEMENT CHARACTER).

Signed-off-by: Cooper Worobetz <cooper@worobetz.ca>
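
The replacement strategy described above can be shown with a short Python sketch. It is an analogue of the idea rather than the exporter's implementation, and the function name `sanitize_label` is hypothetical.

```python
def sanitize_label(raw: bytes) -> str:
    """Decode label bytes, substituting U+FFFD for anything that is not valid UTF-8."""
    return raw.decode("utf-8", errors="replace")


# 0xFF can never appear in valid UTF-8, so it is replaced.
clean = sanitize_label(b"db_\xff_name")
# Valid multi-byte text passes through unchanged.
unchanged = sanitize_label("naïve".encode("utf-8"))
print(clean, unchanged)
```

Replacing rather than dropping the invalid bytes keeps the label visibly different from a clean one, which can help when tracking down the source of the bad data.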

* Release v0.17.1 (prometheus-community#1132)

* [BUGFIX] Fix: Handle incoming labels with invalid UTF-8 prometheus-community#1131

Signed-off-by: SuperQ <superq@gmail.com>

* Bump github.com/prometheus/client_golang from 1.20.5 to 1.21.0 (prometheus-community#1133)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.20.5 to 1.21.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.20.5...v1.21.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update common Prometheus files (prometheus-community#1137)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Bump golang.org/x/net from 0.33.0 to 0.36.0 (prometheus-community#1138)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.33.0 to 0.36.0.
- [Commits](golang/net@v0.33.0...v0.36.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update common Prometheus files (prometheus-community#1140)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update common Prometheus files (prometheus-community#1142)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Adds pg_stat_progress_vacuum collector (prometheus-community#1141)

Signed-off-by: Ian Bibby <ian.bibby@reddit.com>
Co-authored-by: Ben Kochie <superq@gmail.com>

* Update Go (prometheus-community#1147)

* Update Go to 1.24.
* Update golangci-lint to v2.
* Fixup linting issues.

Signed-off-by: SuperQ <superq@gmail.com>

* Bump github.com/prometheus/client_golang from 1.21.0 to 1.21.1 (prometheus-community#1144)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.21.0 to 1.21.1.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.21.0...v1.21.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-version: 1.21.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/prometheus/common from 0.62.0 to 0.63.0 (prometheus-community#1143)

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.62.0 to 0.63.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Changelog](https://github.com/prometheus/common/blob/main/RELEASE.md)
- [Commits](prometheus/common@v0.62.0...v0.63.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-version: 0.63.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Record table only size bytes as well in addition to the total size bytes (prometheus-community#1149)

* Record table only size bytes as well in addition to the total size bytes

Signed-off-by: Felix Yuan <felix.yuan@reddit.com>

* Update collector/pg_stat_user_tables.go

Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: Felix Yuan <felix.yuan@reddit.com>

* Update collector/pg_stat_user_tables.go

Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: Felix Yuan <felix.yuan@reddit.com>

* Finish renaming elements to index and table size

Signed-off-by: Felix Yuan <felix.yuan@reddit.com>

---------

Signed-off-by: Felix Yuan <felix.yuan@reddit.com>
Co-authored-by: Ben Kochie <superq@gmail.com>

* Bump golang.org/x/net from 0.36.0 to 0.38.0 (prometheus-community#1152)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.36.0 to 0.38.0.
- [Commits](golang/net@v0.36.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-version: 0.38.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

---------

Signed-off-by: prombot <prometheus-team@googlegroups.com>
Signed-off-by: Joe Adams <github@joeadams.io>
Signed-off-by: Eric tyrrell <eric.tyrrell18+github@gmail.com>
Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>
Signed-off-by: SuperQ <superq@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Alex Simenduev <shamil.si@gmail.com>
Signed-off-by: Jiri Sveceny <jiri.sveceny@icloud.com>
Signed-off-by: Keegan Carruthers-Smith <keegan.csmith@gmail.com>
Signed-off-by: Jocelyn Thode <jocelyn@thode.email>
Signed-off-by: Jocelyn Thode <jocelynthode@users.noreply.github.com>
Signed-off-by: MarcWort <113890636+MarcWort@users.noreply.github.com>
Signed-off-by: fhackenberger <florian@hackenberger.at>
Signed-off-by: Steffen Zieger <github@saz.sh>
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Signed-off-by: Conrad Hoffmann <ch@bitfehler.net>
Signed-off-by: Jyothi Kiran Thammana <147131742+jyothikirant-sayukth@users.noreply.github.com>
Signed-off-by: aagarwalla-fx <arpit.agarwalla@falconx.io>
Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr>
Signed-off-by: Khiem Doan <doankhiem.crazy@gmail.com>
Signed-off-by: Nevermind <79126473+NevermindZ4@users.noreply.github.com>
Signed-off-by: Michael Todorovic <michael.todorovic@outlook.com>
Signed-off-by: Felipe Galindo Sanchez <felipe.galindo.sanchez@intel.com>
Signed-off-by: Cooper Worobetz <cooper@worobetz.ca>
Signed-off-by: Ian Bibby <ian.bibby@reddit.com>
Signed-off-by: Felix Yuan <felix.yuan@reddit.com>
Co-authored-by: PrometheusBot <prometheus-team@googlegroups.com>
Co-authored-by: Joe Adams <github@joeadams.io>
Co-authored-by: Eric Tyrrell <58529434+Eric-Tyrrell22@users.noreply.github.com>
Co-authored-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alex Simenduev <shamil.si@gmail.com>
Co-authored-by: Jiri Sveceny <jiri.sveceny@icloud.com>
Co-authored-by: Keegan Carruthers-Smith <keegan.csmith@gmail.com>
Co-authored-by: Jocelyn Thode <jocelynthode@users.noreply.github.com>
Co-authored-by: Marc W <113890636+MarcWort@users.noreply.github.com>
Co-authored-by: fhackenberger <florian@hackenberger.at>
Co-authored-by: Steffen Zieger <me@saz.sh>
Co-authored-by: TJ Hoplock <33664289+tjhop@users.noreply.github.com>
Co-authored-by: Conrad Hoffmann <1226676+bitfehler@users.noreply.github.com>
Co-authored-by: Jyothi Kiran Thammana <147131742+jyothikirant-sayukth@users.noreply.github.com>
Co-authored-by: aagarwalla-fx <arpit.agarwalla@falconx.io>
Co-authored-by: Nicolas Rodriguez <nico@nicoladmin.fr>
Co-authored-by: Khiem Doan <doankhiem.crazy@gmail.com>
Co-authored-by: Nevermind <79126473+NevermindZ4@users.noreply.github.com>
Co-authored-by: Michael Todorovic <michael.todorovic@outlook.com>
Co-authored-by: Felipe Galindo Sanchez <felipe.galindo.sanchez@intel.com>
Co-authored-by: vancwo <cooper@worobetz.ca>
Co-authored-by: Ian Bibby <470816+ianbibby@users.noreply.github.com>