Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wPW2u-000maP-09 for pgsql-hackers@arkaria.postgresql.org; Wed, 20 May 2026 01:48:28 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wPW2q-005ad0-0L for pgsql-hackers@arkaria.postgresql.org; Wed, 20 May 2026 01:48:25 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wPW2p-005acr-2X for pgsql-hackers@lists.postgresql.org; Wed, 20 May 2026 01:48:24 +0000 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wPW2o-00000000T9n-0BuI for pgsql-hackers@postgresql.org; Wed, 20 May 2026 01:48:24 +0000 Received: by mail-pf1-x434.google.com with SMTP id d2e1a72fcca58-834f1075805so3331097b3a.2 for ; Tue, 19 May 2026 18:48:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779241699; x=1779846499; darn=postgresql.org; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Rh5yxnjkDDA/WfHzpwizxPNzwSmy6zj2ZiyJZUNFmxw=; b=HpgSbL2GXl53fZbqchASLd3VR0l6Nepp7FELEmh7CLlyIlNBKzaRvDYXqhXeQuYm77 lFaMw290McO6CJWpMShJVM2IgWYoFib+8fegkid6eMZq4B4P/vJIvOyUZm2M1+8YLInN Nj+p+s4+Qfimt/fH3Je9Ey5iofpNTXZyBI6odURhp46BLXToESvPjw2oGcq8NyKT0SGH rINBS1YmD/WIZB0wPgCiBrGRihPmDkmxOUgwAwGNwmIeJA/36S5QWZYYnL91o/R0zyqx ymWxLNkM+SdfUF+9SYiUUd1/dCmlBlbY+aSIQ2IoH8H5DfzrvpeUsoWQcirE17bKAvDJ Stkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779241699; x=1779846499; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:x-gm-gg:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=Rh5yxnjkDDA/WfHzpwizxPNzwSmy6zj2ZiyJZUNFmxw=; b=A8g6P8qr+bn30zYoz/jhh+r8r7Il8IXwX8+aLWZqjfivIQgTMbFrGHpsoo9Hnqer4F d6uZakP2ayErsYeFp+ooyadyBXVPsL1Q228LJV8hKZpzdtzi5lTZgmpL06NW0qrJI/f2 Qv5EIMWE89RMqxagPOZ6a/bFDvcmYJzqyAaXLwKg7fPu0wEq6bd3/JgYILAVfPqhlDJC h/ah0iS0iCjkwTA6fbqUqmSCPj28obY272e8mOgFPIQ55nd+NiBkkiP+pv1cistNbLWH ezKxsT4kFyCuh1EsO8Usx10d+e7Ofx5qWBp9LubvKQXJUnqcYhTO/rYuA++kKDz9k22s 5mWA== X-Gm-Message-State: AOJu0YxA32ORhEXP0F5dcIOJ2oPvKIU37hclzbGCnupehsOLdq7B9Itd 1iLJ1ELqyrZMylNJUoAxf+Vdk2UV3le2Ofidr5K/4vLuXhydSbmaEWxc X-Gm-Gg: Acq92OHNzSCJhyJqO+4J8ThtHNTj5fIdMbQNKqfaIo3rilIuEGww6DukT1+2XF12Xmf l2QCT17woukorOJ6VgH6zxuTvWPvvs+w+ArQuLGpyKbYWJE/KRbMOM9H8FQOhMu7e8ZzQB8T089 3dH+sxyCsiG6GonaXKdFRolge7YMuKpsTa4A6XcY9l1IFJJmiss2TR4J6YaUQSIvc/8u8IHLyj0 6xDE/Og+jWSpt1Ykhx65cX9+APiPlXyzamx3Js5DU1CViUaQS0G4Wf42OPzFrsTWnjjcfFOZo3o uYx5pAQvfeotNs4+X43vyIVoeTWHOCdgmzYVbUvEkadjCufEZC1WXsXd3ljLQi/kIK3Q6PALyro vaJs1zTDAVNmDl0G1Ua3Kq45qvv+6XbgSbkej3e6EMnenH1CyLjWvKwBhUmN4i1O/PDO2t1Upln bE/Bsa6kWmqU5XCoYEepYJ50bTpyyZQaU= X-Received: by 2002:a05:6a00:ac86:b0:82a:805a:7e2 with SMTP id d2e1a72fcca58-83f33c4ed90mr21590051b3a.9.1779241699345; Tue, 19 May 2026 18:48:19 -0700 (PDT) Received: from smtpclient.apple ([45.32.121.103]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-83f8b9c2ea5sm5486195b3a.13.2026.05.19.18.48.17 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 19 May 2026 18:48:18 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3864.400.21\)) Subject: Re: Fix pg_stat_wal_receiver to show CONNECTING status From: Chao Li In-Reply-To: Date: Wed, 20 May 2026 09:47:40 +0800 Cc: PostgreSQL-development , Michael Paquier , Xuneng Zhou Content-Transfer-Encoding: quoted-printable Message-Id: <1F153E64-B791-42FA-A60A-64813B20B81E@gmail.com> References: To: Michael Paquier X-Mailer: Apple Mail (2.3864.400.21) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk > On May 19, 2026, at 21:55, Michael Paquier = wrote: >=20 > On Tue, May 19, 2026 at 01:55:14PM +0800, Chao Li wrote: >> I also tried restarting the standby server, and the result was the = same. >>=20 >> The problem is that pg_stat_wal_receiver is gated by >> WalRcv->ready_to_display, and when the status is CONNECTING, >> WalRcv->ready_to_display is false. >=20 > Initially, I was thinking that the walrcv_connect() delay would not be > that important to track in this context, but you are right that this > stands for improvement before the release. >=20 > @@ -1474,21 +1474,10 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS) > - if (pid =3D=3D 0 || !ready_to_display) > + /* No WAL receiver, just return a tuple with NULL values */ > + if (pid =3D=3D 0) > PG_RETURN_NULL(); >=20 > This suggestion is making the SQL function call feebler, IMO, > impacting the readability around ready_to_display that we want to act > as a gate to the data provided in the view. This flag is important to > check at an early state of the function call, and I don't really want > to change that. A better thing to do would be to split into two steps > how the WAL receiver data is filled between the walrcv_connect() call: > 1) Before the call, reset all the connection-related fields because > they are not relevant before the connection to the remote is > completed, set ready_for_display to true to make the connecting state > visible in the view. The connection information does not matter > anyway here: we cannot be sure which point we are connected to until > the connection is fully established. > 2) After the call, fill in the connection-related fields. >=20 > This means taking twice the WAL receiver spinlock instead of once, > which is not going to matter in practice as the latency of the > connection attempt is much larger than that. >=20 > What do you think about the attached, then? > -- > Michael > Hi Micheal, Thanks for your patch. I just read v2, and it is actually the first solution I tried. The = reason I gave up on that approach and switched to the implementation in = v1 is that it may wrongly report last_msg_send_time, = last_msg_receipt_time, and latest_end_time. See my test with v2: ``` evantest=3D# SELECT * FROM pg_stat_wal_receiver; pid | status | receive_start_lsn | receive_start_tli | = written_lsn | flushed_lsn | received_tli | last_msg_send_time = | last_msg_receipt_time | latest_end_lsn | = latest_end_time | slot_name | sender_host | sender_port | = conninfo = -------+------------+-------------------+-------------------+-------------= +-------------+--------------+-------------------------------+------------= -------------------+----------------+-------------------------------+-----= ------+-------------+-------------+---------- 83930 | connecting | 0/03000000 | 1 | 0/03000000 = | 0/03000000 | 1 | 2026-05-20 09:24:09.121679+08 | = 2026-05-20 09:24:09.121679+08 | | 2026-05-20 = 09:24:09.121679+08 | | | | (1 row) evantest=3D# \c You are now connected to database "evantest" as user "chaol". evantest=3D# SELECT * FROM pg_stat_wal_receiver; pid | status | receive_start_lsn | receive_start_tli | = written_lsn | flushed_lsn | received_tli | last_msg_send_time = | last_msg_receipt_time | latest_end_lsn | = latest_end_time | slot_name | sender_host | sender_port | = conninfo = -------+------------+-------------------+-------------------+-------------= +-------------+--------------+-------------------------------+------------= -------------------+----------------+-------------------------------+-----= ------+-------------+-------------+---------- 84709 | connecting | 0/03000000 | 1 | 0/03000000 = | 0/03000000 | 1 | 2026-05-20 09:27:37.407117+08 | = 2026-05-20 09:27:37.407117+08 | | 2026-05-20 = 09:27:37.407117+08 | | | | (1 row) evantest=3D# \c You are now connected to database "evantest" as user "chaol". evantest=3D# SELECT * FROM pg_stat_wal_receiver; pid | status | receive_start_lsn | receive_start_tli | = written_lsn | flushed_lsn | received_tli | last_msg_send_time = | last_msg_receipt_time | latest_end_lsn | = latest_end_time | slot_name | sender_host | sender_port | = conninfo = -------+------------+-------------------+-------------------+-------------= +-------------+--------------+-------------------------------+------------= -------------------+----------------+-------------------------------+-----= ------+-------------+-------------+---------- 84805 | connecting | 0/03000000 | 1 | 0/03000000 = | 0/03000000 | 1 | 2026-05-20 09:28:03.251298+08 | = 2026-05-20 09:28:03.251298+08 | | 2026-05-20 = 09:28:03.251298+08 | | | | (1 row) ``` As shown above, every time I restarted the standby server, = last_msg_send_time, last_msg_receipt_time, and latest_end_time were = updated to the standby server start time. But in this test, the standby = was connecting to a fake primary, so no WAL receiver message had been = sent or received. I tried to avoid more complicated changes, so I ended up with the v1 = approach. I think it's okay to leave the other columns NULL while the = receiver is still connecting, because at that point the only reliable = information available is the receiver process's PID and status. For v1, maybe we could clarify the meaning of ready_to_display with a = comment. It seems to be intended to indicate that the connection-related = information, such as LSNs and timestamps, is ready to display. In that = sense, pid and status don't need to be gated by it. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/