[PATCH] Improving index selection for logical replication apply with replica identity full

public inbox for pgsql-hackers@postgresql.org  
help / color / mirror / Atom feed

From: Ethan Mertz <ethan.mertz@gmail.com>
To: pgsql-hackers@postgresql.org
Cc: kuroda.hayato@fujitsu.com <kuroda.hayato@fujitsu.com>
Cc: onderkalaci@gmail.com <onderkalaci@gmail.com>
Subject: [PATCH] Improving index selection for logical replication apply with replica identity full
Date: Fri, 22 May 2026 10:18:12 -0700
Message-ID: <CAA9pdKdMVBrcwORyLs_E-sw4zGzwumGfDn43Kj=FKWY9cASyJg@mail.gmail.com> (raw)

Hello hackers,


I'd like to reopen the discussion on index selection for logical
replication apply for replica identity full. Since PostgreSQL 14, replica
identity full is able to make use of existing indexes [1][2] (authors in
CC) when replicating UPDATE or DELETE operations.

Today, when identifying which index to use for the update or delete, the
first suitable index is chosen by OID order, which generally corresponds to
creation order. If the chosen index has low cardinality, the lookup may
perform no better than a sequential scan. While avoiding replica identity
full is generally recommended, some users need to maintain REPLICA IDENTITY
FULL to support downstream logical consumers that require full row images.
These users would also like performant PostgreSQL to PostgreSQL replication.

I propose improving the index selection heuristic to prefer unique indexes,
favoring those with fewer columns. Previous discussion in the linked
threads avoided invoking the planner for full index selection; the
heuristic I propose serves as a middle ground. A unique index guarantees
that each tuple match requires at most one index scan, and among unique
indexes, fewer columns means a narrower, more efficient lookup. I have
attached a patch implementing this check.


In addition, I've performed some simple performance testing of this patch:

```
CREATE TABLE cardinality_index_test (
    id          uuid NOT NULL DEFAULT gen_random_uuid(),
    is_active   boolean NOT NULL DEFAULT true,
    payload     text NOT NULL DEFAULT repeat('x', 200)
);

CREATE INDEX idx_bad_low_cardinality ON cardinality_index_test (is_active);

CREATE UNIQUE INDEX idx_good_unique ON cardinality_index_test (id);
```

With the patch, the time to replicate 1,000 point updates on a table size
of 1,000,000 dropped from 53 seconds to under 1 second.

Feedback welcome.

Thanks,

Ethan Mertz
SDE, Amazon Web Services

[1]
https://www.postgresql.org/message-id/flat/CACawEhVLqmAAyPXdHEPv1ssU2c%3DdqOniiGz7G73HfyS7%2BnGV4w%4...

[2]
https://www.postgresql.org/message-id/flat/TYAPR01MB58669D7414E59664E17A5827F522A%40TYAPR01MB5866.jp...


Attachments:

  [application/x-patch] v1-0001-Improve-index-selection-for-REPLICA-IDENTITY-FULL.patch (6.3K, 3-v1-0001-Improve-index-selection-for-REPLICA-IDENTITY-FULL.patch)
  download | inline diff:
From 3bfacc0428307f18f753bab11cf1400781caf09e Mon Sep 17 00:00:00 2001
From: Ethan Mertz <ethmertz@amazon.com>
Date: Thu, 21 May 2026 22:41:13 +0000
Subject: [PATCH v1] Improve index selection for REPLICA IDENTITY FULL

When multiple usable indexes exist for a relation with REPLICA IDENTITY
FULL, the subscriber now prefers unique indexes over non-unique ones
(since a unique index guarantees at most one tuple per index scan), and
among indexes of the same uniqueness, prefers those with fewer key
columns for a narrower, more efficient lookup.

Previously, the first eligible index found was returned without
considering whether a better candidate existed.  This could lead to
suboptimal index usage during logical replication apply, particularly
when both unique and non-unique indexes were available.

Regression tests are included to verify the selection logic.
---
 src/backend/replication/logical/relation.c    | 34 ++++++++--
 .../subscription/t/032_subscribe_use_index.pl | 64 +++++++++++++++++++
 2 files changed, 94 insertions(+), 4 deletions(-)

diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 0b1d80b5b0f..2334e0c049f 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -784,28 +784,54 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
  * We expect to call this function when REPLICA IDENTITY FULL is defined for
  * the remote relation.
  *
+ * If multiple usable indexes exist, preference is given to unique indexes
+ * (since they guarantee at most one tuple per index scan), and among indexes
+ * of the same uniqueness, those with fewer key columns are preferred for a
+ * narrower, more efficient lookup.
+ *
  * If no suitable index is found, returns InvalidOid.
  */
 static Oid
 FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 {
 	List	   *idxlist = RelationGetIndexList(localrel);
+	Oid			bestIdx = InvalidOid;
+	bool		bestIsUnique = false;
+	int			bestNKeyAtts = PG_INT32_MAX;
 
 	foreach_oid(idxoid, idxlist)
 	{
 		bool		isUsableIdx;
+		bool		isUnique;
+		int			nKeyAtts;
 		Relation	idxRel;
 
 		idxRel = index_open(idxoid, AccessShareLock);
 		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxRel, attrmap);
+		isUnique = idxRel->rd_index->indisunique;
+		nKeyAtts = idxRel->rd_index->indnkeyatts;
 		index_close(idxRel, AccessShareLock);
 
-		/* Return the first eligible index found */
-		if (isUsableIdx)
-			return idxoid;
+		if (!isUsableIdx)
+			continue;
+
+		/*
+		 * Prefer unique indexes over non-unique ones, since a unique index
+		 * guarantees at most one index scan per tuple match.  Among indexes
+		 * of the same uniqueness, prefer fewer key columns for a narrower,
+		 * more efficient lookup.
+		 */
+		if (bestIdx == InvalidOid ||
+			(isUnique && !bestIsUnique) ||
+			(isUnique == bestIsUnique && nKeyAtts < bestNKeyAtts))
+		{
+			bestIdx = idxoid;
+			bestIsUnique = isUnique;
+			bestNKeyAtts = nKeyAtts;
+		}
 	}
 
-	return InvalidOid;
+	return bestIdx;
 }
 
 /*
diff --git a/src/test/subscription/t/032_subscribe_use_index.pl b/src/test/subscription/t/032_subscribe_use_index.pl
index c755c1a7518..3637b0e9aca 100644
--- a/src/test/subscription/t/032_subscribe_use_index.pl
+++ b/src/test/subscription/t/032_subscribe_use_index.pl
@@ -547,6 +547,70 @@ $node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
 # Testcase end: Subscription can use hash index
 # =============================================================================
 
+# =============================================================================
+# Testcase start: Index selection prefers unique indexes and fewer key columns
+#
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE test_idx_select (a int, b int, c int)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE test_idx_select REPLICA IDENTITY FULL");
+$node_subscriber->safe_psql('postgres',
+	"CREATE TABLE test_idx_select (a int, b int, c int)");
+
+# create a non-unique index on (a, b, c) and a unique index on (a, b)
+# the unique index with fewer columns should be preferred
+$node_subscriber->safe_psql('postgres',
+	"CREATE INDEX test_idx_select_nonuniq ON test_idx_select(a, b, c)");
+$node_subscriber->safe_psql('postgres',
+	"CREATE UNIQUE INDEX test_idx_select_uniq ON test_idx_select(a, b)");
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION tap_pub_idx_select FOR TABLE test_idx_select");
+$node_subscriber->safe_psql('postgres',
+	"CREATE SUBSCRIPTION tap_sub_idx_select CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_idx_select"
+);
+
+# wait for initial table synchronization to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# insert and update a row
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO test_idx_select VALUES (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"UPDATE test_idx_select SET c = 4 WHERE a = 1");
+
+# wait for catchup and verify the unique index was used
+$node_publisher->wait_for_catchup($appname);
+$node_subscriber->poll_query_until('postgres',
+	q{select (idx_scan = 1) from pg_stat_all_indexes where indexrelname = 'test_idx_select_uniq';}
+  )
+  or die
+  "Timed out while waiting for unique index to be used";
+
+# verify the non-unique index was NOT used
+$result = $node_subscriber->safe_psql('postgres',
+	"select idx_scan from pg_stat_all_indexes where indexrelname = 'test_idx_select_nonuniq'");
+is($result, qq(0),
+	'non-unique index should not be used when unique index is available');
+
+# verify correct data
+$result = $node_subscriber->safe_psql('postgres',
+	"select c from test_idx_select where a = 1");
+is($result, qq(4),
+	'subscriber has correct data after update via preferred unique index');
+
+# cleanup
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_idx_select");
+$node_publisher->safe_psql('postgres', "DROP TABLE test_idx_select");
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_idx_select");
+$node_subscriber->safe_psql('postgres', "DROP TABLE test_idx_select");
+
+# Testcase end: Index selection prefers unique indexes and fewer key columns
+# =============================================================================
+
 $node_subscriber->stop('fast');
 $node_publisher->stop('fast');
 
-- 
2.47.3

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: pgsql-hackers@postgresql.org
  Cc: ethan.mertz@gmail.com, kuroda.hayato@fujitsu.com, onderkalaci@gmail.com
  Subject: Re: [PATCH] Improving index selection for logical replication apply with replica identity full
  In-Reply-To: <CAA9pdKdMVBrcwORyLs_E-sw4zGzwumGfDn43Kj=FKWY9cASyJg@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox