From: Michael Paquier <michael@paquier.xyz>
To: Chao Li <li.evan.chao@gmail.com>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
Cc: Michael Paquier <michael.paquier@gmail.com>
Cc: Xuneng Zhou <xunengzhou@gmail.com>
Subject: Re: Fix pg_stat_wal_receiver to show CONNECTING status
Date: Tue, 19 May 2026 22:55:39 +0900
Message-ID: <agxr29Hsz7FjxzlN@paquier.xyz>
In-Reply-To: <EF91FF76-1E2B-4F3B-9162-290B4DC517FF@gmail.com>
References: <EF91FF76-1E2B-4F3B-9162-290B4DC517FF@gmail.com>
On Tue, May 19, 2026 at 01:55:14PM +0800, Chao Li wrote:
> I also tried restarting the standby server, and the result was the same.
>
> The problem is that pg_stat_wal_receiver is gated by
> WalRcv->ready_to_display, and when the status is CONNECTING,
> WalRcv->ready_to_display is false.
Initially, I was thinking that the walrcv_connect() delay would not be
that important to track in this context, but you are right that this
deserves improvement before the release.
@@ -1474,21 +1474,10 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
- if (pid == 0 || !ready_to_display)
+ /* No WAL receiver, just return a tuple with NULL values */
+ if (pid == 0)
PG_RETURN_NULL();
This suggestion weakens the SQL function, IMO, and hurts the
readability around ready_to_display, which we want to act as a gate
to the data provided in the view. This flag is important to check at
an early stage of the function call, and I don't really want to
change that. A better thing to do would be to split into two steps
how the WAL receiver data is filled around the walrcv_connect() call:
1) Before the call, reset all the connection-related fields because
they are not relevant before the connection to the remote is
completed, and set ready_to_display to true to make the connecting
state visible in the view. The connection information does not matter
anyway here: we cannot be sure which point we are connected to until
the connection is fully established.
2) After the call, fill in the connection-related fields.
This means taking the WAL receiver spinlock twice instead of once,
which is not going to matter in practice, as the latency of the
connection attempt is much larger than that.
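To make the two-step idea concrete outside of the backend, here is a
minimal standalone sketch. It is not the patch and does not use any
PostgreSQL API: DemoWalRcv, the demo_* functions and the DEMO_* sizes
are hypothetical names invented for illustration, and a C11
atomic_flag stands in for the real spinlock. It only shows the
ordering: reset and publish before the connection attempt, fill in
the details after it, with the reader still gated on the flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer sizes, chosen only for this sketch. */
#define DEMO_MAXCONNINFO 1024
#define DEMO_MAXHOST     1025

/* Hypothetical stand-in for the shared WAL receiver state. */
typedef struct DemoWalRcv
{
    atomic_flag lock;              /* stands in for the spinlock */
    int         pid;               /* 0 means "no WAL receiver" */
    bool        ready_to_display;  /* gate checked by the reader */
    char        conninfo[DEMO_MAXCONNINFO];
    char        sender_host[DEMO_MAXHOST];
    int         sender_port;
} DemoWalRcv;

void demo_lock(DemoWalRcv *w)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&w->lock, memory_order_acquire))
        ;
}

void demo_unlock(DemoWalRcv *w)
{
    atomic_flag_clear_explicit(&w->lock, memory_order_release);
}

void demo_init(DemoWalRcv *w, int pid)
{
    memset(w, 0, sizeof(*w));
    atomic_flag_clear(&w->lock);
    w->pid = pid;
}

/* Step 1: before the connection attempt, reset and make visible. */
void demo_before_connect(DemoWalRcv *w)
{
    demo_lock(w);
    memset(w->conninfo, 0, sizeof(w->conninfo));
    memset(w->sender_host, 0, sizeof(w->sender_host));
    w->sender_port = 0;
    w->ready_to_display = true;
    demo_unlock(w);
}

/* Step 2: after the connection attempt, fill the connection fields. */
void demo_after_connect(DemoWalRcv *w, const char *conninfo,
                        const char *host, int port)
{
    demo_lock(w);
    if (conninfo)
        snprintf(w->conninfo, sizeof(w->conninfo), "%s", conninfo);
    if (host)
        snprintf(w->sender_host, sizeof(w->sender_host), "%s", host);
    w->sender_port = port;
    demo_unlock(w);
}

/* Reader: keeps the early gate; false means "nothing to show". */
bool demo_read(DemoWalRcv *w, char *host_out, size_t host_len,
               int *port_out)
{
    bool visible;

    assert(host_out != NULL && host_len > 0 && port_out != NULL);

    demo_lock(w);
    visible = (w->pid != 0 && w->ready_to_display);
    if (visible)
    {
        snprintf(host_out, host_len, "%s", w->sender_host);
        *port_out = w->sender_port;
    }
    demo_unlock(w);
    return visible;
}
```

With this shape, a reader calling demo_read() between the two steps
sees the state as visible but with an empty host and a zero port,
which mirrors the view reporting the connecting state without any
connection details yet.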
What do you think about the attached, then?
--
Michael
From 3c381a90b1270fdd3f1b01e8eefb85f1ac4af3d8 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 19 May 2026 22:52:38 +0900
Subject: [PATCH v2] Improve pg_stat_wal_receiver for CONNECTING status
Commit a36164e7465 added a CONNECTING status for the WAL receiver, but
pg_stat_wal_receiver returned no information while the connection to the
primary was attempted, limiting the usability of the feature in
high-latency environments where the connection attempt to the primary
could take time.
This commit improves the report of the status by splitting the way the
shared memory state of the WAL receiver is filled before and after the
connection to the primary is attempted:
- Before the attempt, reset all the connection fields and switch
ready_to_display to true.
- After the attempt, fill in the connection fields.
This change means two spinlock acquisitions instead of one, but
monitoring tools can now know about the connection attempt before its
completion, improving the usability of the feature.
Reported-by: Chao Li <li.evan.chao@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/XXX
---
src/backend/replication/walreceiver.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 07eac07b9ce4..d19317703c1f 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -267,6 +267,20 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
/* Unblock signals (they were blocked when the postmaster forked us) */
sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);
+ /*
+ * Mark the WAL receiver state as ready for display before doing a
+ * connection attempt, so that its connecting state is visible before
+ * attempting to contact the primary server. Note that this resets the
+ * original conninfo, sender_port and sender_host, for security. These
+ * fields are filled once the connection is fully established.
+ */
+ SpinLockAcquire(&walrcv->mutex);
+ memset(walrcv->conninfo, 0, MAXCONNINFO);
+ memset(walrcv->sender_host, 0, NI_MAXHOST);
+ walrcv->sender_port = 0;
+ walrcv->ready_to_display = true;
+ SpinLockRelease(&walrcv->mutex);
+
/* Establish the connection to the primary for XLOG streaming */
appname = cluster_name[0] ? cluster_name : "walreceiver";
wrconn = walrcv_connect(conninfo, true, false, false, appname, &err);
@@ -277,23 +291,17 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
appname, err)));
/*
- * Save user-visible connection string. This clobbers the original
- * conninfo, for security. Also save host and port of the sender server
- * this walreceiver is connected to.
+ * Save user-visible connection string, now that the connection has been
+ * established.
*/
tmp_conninfo = walrcv_get_conninfo(wrconn);
walrcv_get_senderinfo(wrconn, &sender_host, &sender_port);
SpinLockAcquire(&walrcv->mutex);
- memset(walrcv->conninfo, 0, MAXCONNINFO);
if (tmp_conninfo)
strlcpy(walrcv->conninfo, tmp_conninfo, MAXCONNINFO);
-
- memset(walrcv->sender_host, 0, NI_MAXHOST);
if (sender_host)
strlcpy(walrcv->sender_host, sender_host, NI_MAXHOST);
-
walrcv->sender_port = sender_port;
- walrcv->ready_to_display = true;
SpinLockRelease(&walrcv->mutex);
if (tmp_conninfo)
--
2.54.0