public inbox for pgsql-bugs@postgresql.org  
help / color / mirror / Atom feed
From: surya poondla <suryapoondla4@gmail.com>
To: cca5507 <cca5507@qq.com>
Cc: Rafia Sabih <rafia.pghackers@gmail.com>
Cc: Giuliano Gagliardi <gogi@gogi.tv>
Cc: pgsql-bugs <pgsql-bugs@lists.postgresql.org>
Subject: Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY
Date: Mon, 18 May 2026 09:48:43 -0700
Message-ID: <CAOVWO5r2nQZgUubJAQCjog91GTpq9PvNtoPMhFWGoYqu1Y45xw@mail.gmail.com> (raw)
In-Reply-To: <CAOVWO5o46gqX_=BkrzuTRyPijDBWaYAGFVqn7UXrww_4owhJ0w@mail.gmail.com>
References: <40d694df-39fd-4a4a-9459-9d6489165f60@gogi.tv>
	<CAOVWO5p7PieGTm3GJC8NAYgSEQLoBdZw9Se4u=EbPrr1mW7d5Q@mail.gmail.com>
	<CAOVWO5rWkGXx37tpsAuTwrY5muPePHG44ke0xwYcakWVfhkPaw@mail.gmail.com>
	<CAOVWO5qchazoL8H+-d4eZ_zm8jFUH4YF3WvWnXqcDEh8i7hbhw@mail.gmail.com>
	<CAOVWO5qF2_FV7M=USdSgjjPO124YMpw7TY+M4+mnkNPOA2Bstg@mail.gmail.com>
	<CA+FpmFe7pgJRi16zP8Liz5vuUkVdTnPy30Zr=mXzZA3shKgvCw@mail.gmail.com>
	<CAOVWO5rfJGpxRDn496gAHmAxW7L_72==KkOcz0q16YOvqm4-=g@mail.gmail.com>
	<CAOVWO5p2U8rMvh3pJ4WaNN7ES4SbiEKXLV_0pZ-UdYp0kACa7g@mail.gmail.com>
	<CA+FpmFeLJDWZSx1TSRVodFbQOLWa-SXBCiG0rspu--HtadaJ+w@mail.gmail.com>
	<CA+FpmFdCtzBO56sAg9hdJ79gALNTfBDc0OxLah2wLAimeORt0A@mail.gmail.com>
	<CAOVWO5ouwJTgGsXguP=pZn6wB2fL+xErK9+pdg98vCApq7F10g@mail.gmail.com>
	<tencent_175CBC8C83733BB7D09242AFDAD814749907@qq.com>
	<tencent_7FA4F0B1D63BE55A8608EFF600CD4E371405@qq.com>
	<CAOVWO5pQ4jyR1M6XYkjkg7c1KshrzaG-9mng7AGBGghZpYKR-Q@mail.gmail.com>
	<tencent_2DDB9D84961482D40AA55CC88CE0E51D740A@qq.com>
	<CAOVWO5o46gqX_=BkrzuTRyPijDBWaYAGFVqn7UXrww_4owhJ0w@mail.gmail.com>

Hi All,

I rebased the patches to the latest code.

Regards,
Surya Poondla

>


Attachments:

  [application/octet-stream] 0006-Fix-REFRESH-MATERIALIZED-VIEW-CONCURRENTLY-to-detect.patch (11.4K, 3-0006-Fix-REFRESH-MATERIALIZED-VIEW-CONCURRENTLY-to-detect.patch)
  download | inline diff:
From 19873f4a02509a6870623140ed5f63a8dd390b90 Mon Sep 17 00:00:00 2001
From: spoondla <s_poondla@apple.com>
Date: Mon, 2 Mar 2026 11:22:11 -0800
Subject: [PATCH v6] Fix REFRESH MATERIALIZED VIEW CONCURRENTLY to detect
 duplicate rows

REFRESH MATERIALIZED VIEW CONCURRENTLY was incorrectly skipping
duplicate detection for rows containing NULL values. The pre-check
query used "WHERE newdata.* IS NOT NULL" which caused rows with any
NULL column to bypass duplicate detection entirely.

The fix rebuilds the pre-check using the same per-column equality
operators used by the diff JOIN (one condition per unique-indexed
column).  These operators treat NULL as not equal to NULL, matching
unique index semantics where NULLs are considered distinct.  Only
rows whose indexed columns are non-null and equal are flagged as
duplicates --- the same rows that would cause join ambiguity.

This correctly handles two cases that the old approach got wrong:
  - (test, NULL) x2 with index on a: a='test' is non-null and
    duplicated, so the duplicate is correctly detected and an error
    is raised.
  - (NULL, NULL) x2 with index on a: a=NULL, and unique indexes
    allow multiple NULLs (each is treated as distinct), so the
    refresh correctly succeeds and updates the view to two rows.

Added regression tests covering both cases.
---
 src/backend/commands/matview.c        | 122 ++++++++++++++++++--------
 src/test/regress/expected/matview.out |  33 ++++++-
 src/test/regress/sql/matview.sql      |  22 +++++
 3 files changed, 137 insertions(+), 40 deletions(-)

diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index f7d8007f796..b806ab46647 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -606,6 +606,8 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 	ListCell   *indexoidscan;
 	int16		relnatts;
 	Oid		   *opUsedForQual;
+	StringInfoData precheck_cond_buf;
+	bool		precheck_has_cond;
 
 	initStringInfo(&querybuf);
 	matviewRel = table_open(matviewOid, NoLock);
@@ -634,45 +636,6 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
 
-	/*
-	 * We need to ensure that there are not duplicate rows without NULLs in
-	 * the new data set before we can count on the "diff" results.  Check for
-	 * that in a way that allows showing the first duplicated row found.  Even
-	 * after we pass this test, a unique index on the materialized view may
-	 * find a duplicate key problem.
-	 *
-	 * Note: here and below, we use "tablename.*::tablerowtype" as a hack to
-	 * keep ".*" from being expanded into multiple columns in a SELECT list.
-	 * Compare ruleutils.c's get_variable().
-	 */
-	resetStringInfo(&querybuf);
-	appendStringInfo(&querybuf,
-					 "SELECT newdata.*::%s FROM %s newdata "
-					 "WHERE newdata.* IS NOT NULL AND EXISTS "
-					 "(SELECT 1 FROM %s newdata2 WHERE newdata2.* IS NOT NULL "
-					 "AND newdata2.* OPERATOR(pg_catalog.*=) newdata.* "
-					 "AND newdata2.ctid OPERATOR(pg_catalog.<>) "
-					 "newdata.ctid)",
-					 tempname, tempname, tempname);
-	if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT)
-		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
-	if (SPI_processed > 0)
-	{
-		/*
-		 * Note that this ereport() is returning data to the user.  Generally,
-		 * we would want to make sure that the user has been granted access to
-		 * this data.  However, REFRESH MAT VIEW is only able to be run by the
-		 * owner of the mat view (or a superuser) and therefore there is no
-		 * need to check for access to data in the mat view.
-		 */
-		ereport(ERROR,
-				(errcode(ERRCODE_CARDINALITY_VIOLATION),
-				 errmsg("new data for materialized view \"%s\" contains duplicate rows without any null columns",
-						RelationGetRelationName(matviewRel)),
-				 errdetail("Row: %s",
-						   SPI_getvalue(SPI_tuptable->vals[0], SPI_tuptable->tupdesc, 1))));
-	}
-
 	/*
 	 * Create the temporary "diff" table.
 	 *
@@ -715,6 +678,8 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 	tupdesc = matviewRel->rd_att;
 	opUsedForQual = palloc0_array(Oid, relnatts);
 	foundUniqueIndex = false;
+	initStringInfo(&precheck_cond_buf);
+	precheck_has_cond = false;
 
 	indexoidlist = RelationGetIndexList(matviewRel);
 
@@ -802,6 +767,30 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 										 rightop, attrtype);
 
 				foundUniqueIndex = true;
+
+				/*
+				 * Also accumulate the same condition for the duplicate-row
+				 * pre-check, comparing newdata2 against newdata (instead of
+				 * newdata against mv).  This lets us detect rows that are
+				 * duplicates with respect to the unique index semantics ---
+				 * i.e. rows whose indexed columns are non-null and equal ---
+				 * without falsely flagging rows whose indexed columns are
+				 * NULL (which unique indexes treat as distinct).
+				 */
+				if (precheck_has_cond)
+					appendStringInfoString(&precheck_cond_buf, " AND ");
+
+				leftop = quote_qualified_identifier("newdata2",
+														NameStr(attr->attname));
+				rightop = quote_qualified_identifier("newdata",
+														 NameStr(attr->attname));
+
+				generate_operator_clause(&precheck_cond_buf,
+										 leftop, attrtype,
+										 op,
+										 rightop, attrtype);
+
+				precheck_has_cond = true;
 			}
 		}
 
@@ -831,6 +820,61 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 						   "WHERE newdata.* IS NULL OR mv.* IS NULL "
 						   "ORDER BY tid");
 
+	/*
+	 * Before populating the diff table, check for duplicate rows in the
+	 * new data set.  We look for rows that are equal in all unique index
+	 * columns (using the same operators as the join above, where
+	 * NULL = NULL is false) AND are also *=-equal overall.  Such rows
+	 * would both match the same materialized view row in the join,
+	 * producing an ambiguous diff.
+	 *
+	 * Using index column operators rather than *= alone is important: it
+	 * correctly excludes rows whose indexed columns are NULL, because
+	 * unique indexes treat NULLs as distinct so those rows do not cause
+	 * join ambiguity.
+	 *
+	 * Note: here and below, we use "tablename.*::tablerowtype" as a hack
+	 * to keep ".*" from being expanded into multiple columns in a SELECT
+	 * list.  Compare ruleutils.c's get_variable().
+	 *
+	 * Even after we pass this test, a unique index on the materialized
+	 * view may find a duplicate key problem.
+	 */
+	{
+		StringInfoData precheck_querybuf;
+
+		initStringInfo(&precheck_querybuf);
+		appendStringInfo(&precheck_querybuf,
+						 "SELECT newdata.*::%s FROM %s newdata "
+						 "WHERE EXISTS "
+						 "(SELECT 1 FROM %s newdata2 WHERE %s"
+						 " AND newdata2.* OPERATOR(pg_catalog.*=) newdata.*"
+						 " AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid)",
+						 tempname, tempname, tempname, precheck_cond_buf.data);
+
+		if (SPI_execute(precheck_querybuf.data, false, 1) != SPI_OK_SELECT)
+			elog(ERROR, "SPI_exec failed: %s", precheck_querybuf.data);
+
+		if (SPI_processed > 0)
+		{
+			/*
+			 * Note that this ereport() is returning data to the user.
+			 * Generally, we would want to make sure that the user has been
+			 * granted access to this data.  However, REFRESH MAT VIEW is
+			 * only able to be run by the owner of the mat view (or a
+			 * superuser) and therefore there is no need to check for access
+			 * to data in the mat view.
+			 */
+			ereport(ERROR,
+					(errcode(ERRCODE_CARDINALITY_VIOLATION),
+					 errmsg("new data for materialized view \"%s\" contains duplicate rows",
+							RelationGetRelationName(matviewRel)),
+					 errdetail("Row: %s",
+							   SPI_getvalue(SPI_tuptable->vals[0],
+										   SPI_tuptable->tupdesc, 1))));
+		}
+	}
+
 	/* Populate the temporary "diff" table. */
 	if (SPI_exec(querybuf.data, 0) != SPI_OK_INSERT)
 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
diff --git a/src/test/regress/expected/matview.out b/src/test/regress/expected/matview.out
index 0355720dfc6..1dcff26a2e4 100644
--- a/src/test/regress/expected/matview.out
+++ b/src/test/regress/expected/matview.out
@@ -396,8 +396,39 @@ REFRESH MATERIALIZED VIEW mvtest_mv;
 ERROR:  could not create unique index "mvtest_mv_a_idx"
 DETAIL:  Key (a)=(1) is duplicated.
 REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;
-ERROR:  new data for materialized view "mvtest_mv" contains duplicate rows without any null columns
+ERROR:  new data for materialized view "mvtest_mv" contains duplicate rows
 DETAIL:  Row: (1,10)
+DROP TABLE mvtest_foo CASCADE;
+NOTICE:  drop cascades to materialized view mvtest_mv
+-- test that duplicate rows containing NULLs in non-indexed columns are detected
+CREATE TABLE mvtest_foo(a text, b text);
+INSERT INTO mvtest_foo VALUES('test', NULL);
+CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo;
+CREATE UNIQUE INDEX ON mvtest_mv(a);
+INSERT INTO mvtest_foo VALUES('test', NULL);
+REFRESH MATERIALIZED VIEW mvtest_mv;
+ERROR:  could not create unique index "mvtest_mv_a_idx"
+DETAIL:  Key (a)=(test) is duplicated.
+REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;
+ERROR:  new data for materialized view "mvtest_mv" contains duplicate rows
+DETAIL:  Row: (test,)
+DROP TABLE mvtest_foo CASCADE;
+NOTICE:  drop cascades to materialized view mvtest_mv
+-- test that rows with NULLs in the indexed column are not false positives:
+-- unique indexes treat NULLs as distinct, so (NULL,NULL)x2 is a valid state
+-- and CONCURRENTLY should succeed and update the view to reflect both rows
+CREATE TABLE mvtest_foo(a int, b int);
+INSERT INTO mvtest_foo VALUES(NULL, NULL);
+CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo;
+CREATE UNIQUE INDEX ON mvtest_mv(a);
+INSERT INTO mvtest_foo VALUES(NULL, NULL);
+REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;
+SELECT COUNT(*) FROM mvtest_mv;
+ count 
+-------
+     2
+(1 row)
+
 DROP TABLE mvtest_foo CASCADE;
 NOTICE:  drop cascades to materialized view mvtest_mv
 -- make sure that all columns covered by unique indexes works
diff --git a/src/test/regress/sql/matview.sql b/src/test/regress/sql/matview.sql
index 934426b9ae8..e65cd4fa94b 100644
--- a/src/test/regress/sql/matview.sql
+++ b/src/test/regress/sql/matview.sql
@@ -135,6 +135,28 @@ REFRESH MATERIALIZED VIEW mvtest_mv;
 REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;
 DROP TABLE mvtest_foo CASCADE;
 
+-- test that duplicate rows containing NULLs in non-indexed columns are detected
+CREATE TABLE mvtest_foo(a text, b text);
+INSERT INTO mvtest_foo VALUES('test', NULL);
+CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo;
+CREATE UNIQUE INDEX ON mvtest_mv(a);
+INSERT INTO mvtest_foo VALUES('test', NULL);
+REFRESH MATERIALIZED VIEW mvtest_mv;
+REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;
+DROP TABLE mvtest_foo CASCADE;
+
+-- test that rows with NULLs in the indexed column are not false positives:
+-- unique indexes treat NULLs as distinct, so (NULL,NULL)x2 is a valid state
+-- and CONCURRENTLY should succeed and update the view to reflect both rows
+CREATE TABLE mvtest_foo(a int, b int);
+INSERT INTO mvtest_foo VALUES(NULL, NULL);
+CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo;
+CREATE UNIQUE INDEX ON mvtest_mv(a);
+INSERT INTO mvtest_foo VALUES(NULL, NULL);
+REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;
+SELECT COUNT(*) FROM mvtest_mv;
+DROP TABLE mvtest_foo CASCADE;
+
 -- make sure that all columns covered by unique indexes works
 CREATE TABLE mvtest_foo(a, b, c) AS VALUES(1, 2, 3);
 CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo;
-- 
2.39.5 (Apple Git-154)



  [application/octet-stream] 0006-Fix-REFRESH-MATERIALIZED-VIEW-CONCURRENTLY-performan.patch (10.0K, 4-0006-Fix-REFRESH-MATERIALIZED-VIEW-CONCURRENTLY-performan.patch)
  download | inline diff:
From 662da36edbe9b190d78478c4fe128a93728f952f Mon Sep 17 00:00:00 2001
From: spoondla <s_poondla@apple.com>
Date: Mon, 2 Mar 2026 11:23:17 -0800
Subject: [PATCH v6] Fix REFRESH MATERIALIZED VIEW CONCURRENTLY performance
 with nullable indexed columns

When a materialized view has a unique index on a nullable column, the join
condition used to detect changes included NULL = NULL comparisons which
evaluate to NULL (false), causing every row with a NULL indexed value to
appear as changed on every CONCURRENTLY refresh even when the data was
unchanged.

The fix skips nullable columns from the per-column join conditions when the
index has at least one NOT NULL column.  The NOT NULL column(s) are sufficient
to identify matching rows uniquely, so omitting the nullable ones avoids the
unnecessary NULL = NULL comparisons.  The record equality operator (*=),
appended to the join condition unconditionally, handles all columns including
nullable ones and ensures changed rows are still detected correctly.

When all columns of a unique index are nullable, no columns are skipped.
Relying on *= alone in that case would be incorrect: *= treats NULL as equal
to NULL, which can cause join ambiguity or silent data loss when rows contain
all-null values.  Falling back to including all columns preserves the original
(slower but correct) behavior for that edge case.

Added regression tests covering both the optimization path (index with at
least one NOT NULL column) and the fallback path (all-nullable index).
---
 src/backend/commands/matview.c        | 60 +++++++++++++++++++++++++--
 src/test/regress/expected/matview.out | 29 +++++++++++++
 src/test/regress/sql/matview.sql      | 24 +++++++++++
 3 files changed, 110 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index f7d8007f796..a47504992d6 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -602,6 +602,7 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 	char	   *nsp;
 	TupleDesc	tupdesc;
 	bool		foundUniqueIndex;
+	bool		addedAnyQuals;
 	List	   *indexoidlist;
 	ListCell   *indexoidscan;
 	int16		relnatts;
@@ -715,6 +716,7 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 	tupdesc = matviewRel->rd_att;
 	opUsedForQual = palloc0_array(Oid, relnatts);
 	foundUniqueIndex = false;
+	addedAnyQuals = false;
 
 	indexoidlist = RelationGetIndexList(matviewRel);
 
@@ -731,6 +733,9 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 			oidvector  *indclass;
 			Datum		indclassDatum;
 			int			i;
+			bool		index_has_nonnull;
+
+			foundUniqueIndex = true;
 
 			/* Must get indclass the hard way. */
 			indclassDatum = SysCacheGetAttrNotNull(INDEXRELID,
@@ -738,6 +743,29 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 												   Anum_pg_index_indclass);
 			indclass = (oidvector *) DatumGetPointer(indclassDatum);
 
+			/*
+			 * Pre-scan: check whether this index has at least one NOT NULL
+			 * column.  If it does, we can safely skip nullable columns in
+			 * the join condition below (the NOT NULL column(s) uniquely
+			 * identify rows, so omitting nullable columns does not introduce
+			 * join ambiguity).  If all columns are nullable we must include
+			 * all of them to preserve correctness: skipping everything and
+			 * relying on *= alone can cause join ambiguity or data loss
+			 * because *= treats NULL as equal to NULL.
+			 */
+			index_has_nonnull = false;
+			for (i = 0; i < indnkeyatts; i++)
+			{
+				Form_pg_attribute attr_check =
+					TupleDescAttr(tupdesc, indexStruct->indkey.values[i] - 1);
+
+				if (attr_check->attnotnull)
+				{
+					index_has_nonnull = true;
+					break;
+				}
+			}
+
 			/* Add quals for all columns from this index. */
 			for (i = 0; i < indnkeyatts; i++)
 			{
@@ -753,6 +781,24 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 				const char *leftop;
 				const char *rightop;
 
+				/*
+				 * Skip nullable columns when this index has at least one NOT
+				 * NULL column.  Nullable columns in unique indexes allow
+				 * multiple NULLs and NULL = NULL evaluates to NULL (false) in
+				 * the join, making unchanged NULL-containing rows appear as
+				 * changed on every refresh.  The NOT NULL column(s) are
+				 * sufficient to identify matching rows, so we can safely omit
+				 * nullable ones without introducing join ambiguity.
+				 *
+				 * When all columns are nullable we must not skip any of them:
+				 * with no per-column conditions only *= would remain, which
+				 * can cause join ambiguity or data loss (see pre-scan comment
+				 * above).  We fall back to including all columns so the join
+				 * behaves as in the original code.
+				 */
+				if (index_has_nonnull && !attr->attnotnull)
+					continue;
+
 				/*
 				 * Identify the equality operator associated with this index
 				 * column.  First we need to look up the column's opclass.
@@ -788,7 +834,7 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 				/*
 				 * Actually add the qual, ANDed with any others.
 				 */
-				if (foundUniqueIndex)
+				if (addedAnyQuals)
 					appendStringInfoString(&querybuf, " AND ");
 
 				leftop = quote_qualified_identifier("newdata",
@@ -801,7 +847,7 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 										 op,
 										 rightop, attrtype);
 
-				foundUniqueIndex = true;
+				addedAnyQuals = true;
 			}
 		}
 
@@ -826,8 +872,16 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 				errmsg("could not find suitable unique index on materialized view \"%s\"",
 					   RelationGetRelationName(matviewRel)));
 
+	/*
+	 * The record equality operator (*=) is always included in the join
+	 * predicate.  It handles all columns correctly including nullable ones
+	 * that were skipped above, since *=  treats NULL as equal to NULL.
+	 */
+	if (addedAnyQuals)
+		appendStringInfoString(&querybuf, " AND ");
+
 	appendStringInfoString(&querybuf,
-						   " AND newdata.* OPERATOR(pg_catalog.*=) mv.*) "
+						   "newdata.* OPERATOR(pg_catalog.*=) mv.*) "
 						   "WHERE newdata.* IS NULL OR mv.* IS NULL "
 						   "ORDER BY tid");
 
diff --git a/src/test/regress/expected/matview.out b/src/test/regress/expected/matview.out
index 0355720dfc6..98db1e08185 100644
--- a/src/test/regress/expected/matview.out
+++ b/src/test/regress/expected/matview.out
@@ -418,6 +418,35 @@ CREATE MATERIALIZED VIEW mvtest_mv2 AS SELECT * FROM mvtest_mv1
   WHERE col1 = (SELECT LEAST(col1) FROM mvtest_mv1) WITH NO DATA;
 DROP MATERIALIZED VIEW mvtest_mv1 CASCADE;
 NOTICE:  drop cascades to materialized view mvtest_mv2
+-- test that nullable indexed columns are skipped in the join condition when
+-- the index has at least one NOT NULL column.  The NOT NULL column alone
+-- identifies matching rows, so the nullable column does not need to be in
+-- the join and its NULL values do not cause unnecessary DELETE+INSERT churn.
+CREATE TABLE mvtest_foo(a int NOT NULL, b int);
+INSERT INTO mvtest_foo VALUES(1, NULL), (2, NULL), (3, NULL);
+CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo;
+CREATE UNIQUE INDEX ON mvtest_mv(a);  -- a is NOT NULL, b is nullable
+-- Add a second index on the nullable column alone to make it exercise the
+-- all-nullable fallback path through the pre-scan.
+CREATE UNIQUE INDEX ON mvtest_mv(b);
+REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;  -- must succeed
+DROP TABLE mvtest_foo CASCADE;
+NOTICE:  drop cascades to materialized view mvtest_mv
+-- test that CONCURRENTLY still works when all indexed columns are nullable
+-- (falls back to including all columns, which is slower but correct).
+CREATE TABLE mvtest_foo(a int, b int);
+INSERT INTO mvtest_foo VALUES(NULL, 1);
+CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo;
+CREATE UNIQUE INDEX ON mvtest_mv(a);  -- all-nullable index
+REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;  -- must succeed
+SELECT * FROM mvtest_mv;
+ a | b 
+---+---
+   | 1
+(1 row)
+
+DROP TABLE mvtest_foo CASCADE;
+NOTICE:  drop cascades to materialized view mvtest_mv
 -- make sure that types with unusual equality tests work
 CREATE TABLE mvtest_boxes (id serial primary key, b box);
 INSERT INTO mvtest_boxes (b) VALUES
diff --git a/src/test/regress/sql/matview.sql b/src/test/regress/sql/matview.sql
index 934426b9ae8..624056a4dc5 100644
--- a/src/test/regress/sql/matview.sql
+++ b/src/test/regress/sql/matview.sql
@@ -153,6 +153,30 @@ CREATE MATERIALIZED VIEW mvtest_mv2 AS SELECT * FROM mvtest_mv1
   WHERE col1 = (SELECT LEAST(col1) FROM mvtest_mv1) WITH NO DATA;
 DROP MATERIALIZED VIEW mvtest_mv1 CASCADE;
 
+-- test that nullable indexed columns are skipped in the join condition when
+-- the index has at least one NOT NULL column.  The NOT NULL column alone
+-- identifies matching rows, so the nullable column does not need to be in
+-- the join and its NULL values do not cause unnecessary DELETE+INSERT churn.
+CREATE TABLE mvtest_foo(a int NOT NULL, b int);
+INSERT INTO mvtest_foo VALUES(1, NULL), (2, NULL), (3, NULL);
+CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo;
+CREATE UNIQUE INDEX ON mvtest_mv(a);  -- a is NOT NULL, b is nullable
+-- Add a second index on the nullable column alone to make it exercise the
+-- all-nullable fallback path through the pre-scan.
+CREATE UNIQUE INDEX ON mvtest_mv(b);
+REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;  -- must succeed
+DROP TABLE mvtest_foo CASCADE;
+
+-- test that CONCURRENTLY still works when all indexed columns are nullable
+-- (falls back to including all columns, which is slower but correct).
+CREATE TABLE mvtest_foo(a int, b int);
+INSERT INTO mvtest_foo VALUES(NULL, 1);
+CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo;
+CREATE UNIQUE INDEX ON mvtest_mv(a);  -- all-nullable index
+REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_mv;  -- must succeed
+SELECT * FROM mvtest_mv;
+DROP TABLE mvtest_foo CASCADE;
+
 -- make sure that types with unusual equality tests work
 CREATE TABLE mvtest_boxes (id serial primary key, b box);
 INSERT INTO mvtest_boxes (b) VALUES
-- 
2.39.5 (Apple Git-154)



reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: pgsql-bugs@postgresql.org
  Cc: suryapoondla4@gmail.com, cca5507@qq.com, rafia.pghackers@gmail.com, gogi@gogi.tv, pgsql-bugs@lists.postgresql.org
  Subject: Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY
  In-Reply-To: <CAOVWO5r2nQZgUubJAQCjog91GTpq9PvNtoPMhFWGoYqu1Y45xw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox