Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w3eAF-001Q75-2K for pgsql-bugs@arkaria.postgresql.org; Fri, 20 Mar 2026 18:01:39 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w3eAD-007e7x-2w for pgsql-bugs@arkaria.postgresql.org; Fri, 20 Mar 2026 18:01:38 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w3eAD-007e7p-1x for pgsql-bugs@lists.postgresql.org; Fri, 20 Mar 2026 18:01:38 +0000 Received: from mail-ot1-x329.google.com ([2607:f8b0:4864:20::329]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w3eAA-00000000DeF-4Bbl for pgsql-bugs@lists.postgresql.org; Fri, 20 Mar 2026 18:01:37 +0000 Received: by mail-ot1-x329.google.com with SMTP id 46e09a7af769-7d75371d873so962224a34.3 for ; Fri, 20 Mar 2026 11:01:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1774029693; cv=none; d=google.com; s=arc-20240605; b=a8GmBI353nlNu+zQ1SreOlW9NclxKCa0d4xB89BeplCfNyhStCxD/mwQ4EPD9bkSL+ jo9t+4cneGF2AulddDa5NLQDhNIovNbvKW2GES+k8Jn5KZ9npmwRoQw6oj0EWSbuqxYA +Dk9kXvVkIKGEine1eUFRU9XS45LTyeKgxt+WX+Jeb24bfHtKIn+HD6FkoseFU6lADaX uvLcG0etEdzOeDMiSdHxpS5NmTFvhjCmJvdhHvIgqk2AGBgqFPCNoRdpV5OVrNgfWckb 3IQUwFR5g3wfcqzZyt1pSWCRUZruJN+oiF6dAv7oyUHpUNjlQK6CO35CRHoMPX7faB1U q36g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=ot6YQCyeort9hVl454eudjyAm8C860O48VFEmbZmRaE=; fh=IPl7NHR3wyIUJ5APvad9pI9D3BBPMlLIn7euR+G2d/U=; b=D6LcXDskgqZIuMBbl8JjdvkgMXW0aaVavY5ia8wqZiPchKLHRDaqV9G5s5zcs2qatA dpvOrfOBbN3xIV3thLEOPXfYUI9ea96DcpOckzOmPQDo6w0t7ZKeYBxf8V6DDqKjrlRK 7iICPa19sSrVlj/55m7Ug/XLDiHCx4vKw6CqNxJyG5cgb48EGblAVZzdq+2yYP2a2uRT 15bQz+28jQQk2Ss/8ky5d0SPKmD/BkE8zRKa9vaNiUoDx85a9OmMIxjVRevwhsdiT2pl e1StCZhkMCCmMl7o2KQkAAM14SKmRiTFIG9A994L3bfg8il4CVTEEdyA0TErWz5x02G9 3dmA==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1774029693; x=1774634493; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=ot6YQCyeort9hVl454eudjyAm8C860O48VFEmbZmRaE=; b=LzAWc+NqWypyM2jNPPnn5fq9UGph8QsEjM0BMeZN11MmerYMA0neaIfDrPCixOJut/ AWGL5+8HbbRDCBg+Rq/zrC3RhlyqE5EID6avnAHnSninq1vBHg3lEELIqR/H2wP9uui4 0RXaYqRCwTnUrm8L1Yuq9PkuVQ4YfyRyhlvmcz38QxeOcLXz0GdL1i/s4rlhKROqgA6h kn01Jgcy/bKOS5CcGsaH+lZwjt67FB95ddYRPrfMxal4gZhWve1vOn8TYlWnsUvgxa45 +fVnBAKdngSOzK9/UIFRotsmxcPcMbYF+Y0/yKGkcYTRRygm7FO8F3dUt1wyDT2rdO+v oM7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774029693; x=1774634493; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=ot6YQCyeort9hVl454eudjyAm8C860O48VFEmbZmRaE=; b=J++8s3bFsTHaqK5fWPtac2dIWOh7reO7EGlXU9dnco1dREEFZofOz1F2/8qGtYSb+8 Zhuj8Z5g0wsBOyXXK81ewZKMxSxUg5GvglSqp8I94N308o6Fd2k2C+b0o0qx7FQCAaSC yfTPPVHIEFHZQfDtN6Iau3+H2GESoYG8h/VejHrZDbnqHDKIEbvaw+jK4lgTuXPITiOD w2w4vqnZfybrHSTurv0IhsAfumD0M+cjwIt/TL/5TLiilH7BhZvhO+l2gpV5VNnIFsbf IxbN8HxgtBAuXFr9+vqeZoSdYwI1AsyKZ5/4SWlv/nUlNq7z2D1zIa0blKuWcpV2KPGR ixsQ== X-Forwarded-Encrypted: i=1; AJvYcCUSqSJfoOOadsbjinBUaHFPleLgul6TE7lktT5JDoFsi9y3MSxqVwS6Nud3RG7Om7yGukmPcrplImCS@lists.postgresql.org X-Gm-Message-State: AOJu0YzsY6BziFbVzvgHN8HpgSyNRMNiahKvbj59zjdJSaPY/yXIFSoY Exg0nDsSF8YEt/3IHjGBfHXSfCsAg6OTV9uEaHLyWmrAQM5e4R0DFQ6zh4G1vziXSiFR2E7Fq2T B6ExjCOHbbTlNpgECqZp7DUDpSIhVjRI= X-Gm-Gg: ATEYQzy93wzn6QPQTOO97BrJWAInfqboYAszbin+LVaHrvEqRAWWJxyCGMEfrW31bWa DOe3y0S6xkNflRVlk7vpcvIjWfQS0KwtMgUl/0RbpL72LIAARWVymcH1vpwkU3Q7fqCMbLDsySW nduSOQ+3B42O/VaWM0bSKsrV3Pq13MXcSN8SJnBrpfzAy9olktR5EbgZ7Pj75c6NS3O1Nn3FGKW rMkFBFxZfc93crw9Rg7kdaI3JNv/kl0zMOlT1YbMseBVIb28Hv3ZtSpj7Mq/+89vzvUhfDiAbpY OgqxhmCn6qC1XQ== X-Received: by 2002:a05:6830:6d11:b0:7d7:3b11:467f with SMTP id 46e09a7af769-7d7eb00b3c5mr2723364a34.32.1774029693145; Fri, 20 Mar 2026 11:01:33 -0700 (PDT) MIME-Version: 1.0 References: <40d694df-39fd-4a4a-9459-9d6489165f60@gogi.tv> In-Reply-To: From: surya poondla Date: Fri, 20 Mar 2026 11:01:22 -0700 X-Gm-Features: AaiRm50EOoR5P9i9euyHDi-9ej56lemb05r_JkOpZ05_QW7YPDKIc85kcpjjOjU Message-ID: Subject: Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY To: cca5507 Cc: Rafia Sabih , Giuliano Gagliardi , pgsql-bugs Content-Type: multipart/alternative; boundary="000000000000ae5714064d78799b" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000ae5714064d78799b Content-Type: text/plain; charset="UTF-8" Hi ChangAo, Thank you for the detailed review. For issue 1, my fix removes the IS NOT NULL guard from the pre-check so that *= can detect all duplicate rows, including those containing NULLs. (Note: The semantics of *= has always treated NULL as equal to NULL.) The reasoning is straightforward: the JOIN uses *= to match newdata rows against MV rows. If newdata contains two *=-equal rows, both would match the same MV row in the JOIN, producing a wrong diff. The pre-check must therefore use the same *= semantics to catch exactly those duplicates which is what my fix does by removing the IS NOT NULL guard. The IS NOT NULL guard was the bug as it was hiding real duplicates from detection. Your approach leaves the pre-check unchanged and instead replaces *= in the JOIN with record_image_eq_variant (NULL != NULL). I see two concerns: 1. record_image_eq_variant applies NULL != NULL globally to all rows in the JOIN, not just duplicate ones. This means any unchanged row containing any NULL in any column will never match its counterpart during the JOIN, causing a DELETE + INSERT for that row on every refresh even when the data has not changed. The original issue 2 was specifically about nullable indexed columns, your fix extends the performance problem to all nullable columns anywhere in the row, which makes the performance worse than issue 2. 2. The error surfaced becomes a unique_violation from the index rather than the explicit "contains duplicate rows" message, which is harder for users to diagnose. Regards, Surya Poondla --000000000000ae5714064d78799b Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi ChangAo,

Thank you for the detailed review.=

For issue 1, my fix removes the IS NOT NULL guard from the pre-chec= k so
that *=3D can detect all duplicate rows, including those containing= NULLs.
(Note: The semantics of *=3D =C2=A0has always treated NULL as eq= ual to NULL.)

The reasoning is straightforward: the JOIN uses *=3D t= o match newdata rows
against MV rows. If newdata contains two *=3D-equal= rows, both would match
the same MV row in the JOIN, producing a wrong d= iff. The pre-check must
therefore use the same *=3D semantics to catch e= xactly those duplicates=C2=A0
which is what my fix does by removing the = IS NOT NULL guard.
The IS NOT NULL guard was the bug as it was hiding re= al duplicates from detection.

Your approach leaves the pre-check unc= hanged and instead replaces *=3D in
the JOIN with record_image_eq_varian= t (NULL !=3D NULL). I see two concerns:
1. record_image_eq_variant appli= es NULL !=3D NULL globally to all rows in
=C2=A0 =C2=A0the JOIN, not jus= t duplicate ones. This means any unchanged row
=C2=A0 =C2=A0containing a= ny NULL in any column will never match its counterpart
=C2=A0 =C2=A0duri= ng the JOIN, causing a DELETE + INSERT for that row on every
=C2=A0 =C2= =A0refresh even when the data has not changed. The original issue 2 was
= =C2=A0 =C2=A0specifically about nullable indexed columns, your fix extends = the
=C2=A0 =C2=A0performance problem to all nullable columns anywhere in= the row,
=C2=A0 =C2=A0which makes the performance worse than issue=C2= =A02.
2. The error surfaced becomes a unique_violation from the index ra= ther
=C2=A0 =C2=A0than the explicit "contains duplicate rows" = message, which is harder
=C2=A0 =C2=A0for users to diagnose.

Rega= rds,
Surya Poondla
--000000000000ae5714064d78799b--