From: Radim Marek
Date: Thu, 21 May 2026 10:34:49 +0200
Subject: Re: BUG #19490: Streaming standby on 16.14 stops applying WAL on MultiXactOffsetSLRU when primary is 16.8
To: Andrey Borodin
Cc: Marko Tiikkaja, PostgreSQL mailing lists
References: <19490-9c59c6a583513b99@postgresql.org> <46FE61C9-F273-45FD-BED7-0F8CDA6EB992@yandex-team.ru> <46DB3CAB-EA1C-41A5-9D6D-5F913A2AAF66@yandex-team.ru>
In-Reply-To: <46DB3CAB-EA1C-41A5-9D6D-5F913A2AAF66@yandex-team.ru>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000d0c20506524fc986"

--000000000000d0c20506524fc986
Content-Type: text/plain; charset="UTF-8"

Thank you for the follow-up. In the meantime, I can confirm that commit 77dff5d937b1 might be the source of the originally reported issue.

Unfortunately, pinning the version down to 16.12 only avoids the MultiXactOffsetSLRU self-deadlock; the standby then fails recovery after 12+ hours:

FATAL: could not access status of transaction 24958976
DETAIL: Could not read from file "pg_multixact/offsets/017C" at offset 221184: read too few bytes.
CONTEXT: WAL redo at 14770/873268E8 for MultiXact/CREATE_ID: 24958975 offset 61500431 nmembers 2: 3058927188 (fornokeyupd) 3058927189 (keysh)

We are going to pin 16.13 and try that until we can safely upgrade the primary and are confident we have working PITR recovery available should we need it.

Radim

PS: Once I have some time, I will try to set up a Docker-based harness to replicate the original problem for later testing of the fix.

On Thu, 21 May 2026 at 09:25, Andrey Borodin wrote:

> > On 21 May 2026, at 00:12, Marko Tiikkaja wrote:
> >
> > #8  0x0000654c8ae2acba in SimpleLruWriteAll (ctl=0x654c8b63e400
>
> Thanks!
>
> This clearly points to SimpleLruWriteAll() added in 77dff5d937b1.
> If by chance you will have a backtrace of another deadlocking process -
> please post it.
>
> But it's not strictly necessary for analysis, I think we can figure out
> what happened from the backtrace you already posted.
>
>
> Best regards, Andrey Borodin.
--000000000000d0c20506524fc986--
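
As a rough cross-check of the FATAL/DETAIL lines in the message above, the sketch below maps a multixact ID to its segment file under pg_multixact/offsets. It assumes PostgreSQL 16 defaults that are not stated in the thread: 8192-byte blocks, 4-byte offset entries, and 32 pages per SLRU segment. The function name locate_multixact_offset is hypothetical and used only for illustration.

```python
# Assumptions (PostgreSQL 16 defaults, not stated in the thread):
BLCKSZ = 8192                  # default block size in bytes
MULTIXACT_OFFSET_SIZE = 4      # size of one offsets entry (uint32 in 16.x)
SLRU_PAGES_PER_SEGMENT = 32    # pages stored in one SLRU segment file

OFFSETS_PER_PAGE = BLCKSZ // MULTIXACT_OFFSET_SIZE  # entries per page


def locate_multixact_offset(multi: int) -> tuple[str, int, int]:
    """Hypothetical helper: map a multixact ID to its offsets-SLRU location.

    Returns (segment file name, byte offset of the page within the segment,
    index of the entry within that page).
    """
    page = multi // OFFSETS_PER_PAGE
    entry = multi % OFFSETS_PER_PAGE
    segment = page // SLRU_PAGES_PER_SEGMENT
    page_in_segment = page % SLRU_PAGES_PER_SEGMENT
    return f"{segment:04X}", page_in_segment * BLCKSZ, entry


if __name__ == "__main__":
    # 24958976 is the ID from the FATAL line, 24958975 the one from CONTEXT.
    for multi in (24958975, 24958976):
        print(multi, locate_multixact_offset(multi))
```

Under these assumptions, 24958976 maps to file 017C at byte offset 221184, matching the DETAIL line, and it is the first entry on its page, while 24958975 from the CONTEXT line is the last entry on the preceding page. That is only an arithmetic observation, not a diagnosis of the failure.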