public inbox for pgsql-docs@postgresql.org
From: Robert Treat <rob@xzilla.net>
To: Bruce Momjian <bruce@momjian.us>
To: Paul A Jungwirth <pj@illuminatedcomputing.com>
To: Laurenz Albe <laurenz.albe@cybertec.at>
Cc: pgsql-docs@lists.postgresql.org
Subject: Re: Streaming Replication vs Logical
Date: Mon, 24 Nov 2025 23:28:22 -0500
Message-ID: <CAJSLCQ2C8U2E=xK2fnmpcX2uLw1fAR_p7ef+4YdoaBmUf1GDVQ@mail.gmail.com> (raw)
In-Reply-To: <CAJSLCQ1MiUw6S982GuJ+FH6b7=vR68T+RrUMdqYs7Wp+At6E_A@mail.gmail.com>
References: <CA+renyULt3VBS1cRFKUfT2=5dr61xBOZdAZ-CqX3XLGXqY-aTQ@mail.gmail.com>
<2c392993640661b817c5c779f6aaf44c103510bf.camel@cybertec.at>
<Zwqlw6RGs4mCnHz9@momjian.us>
<CAJSLCQ1MiUw6S982GuJ+FH6b7=vR68T+RrUMdqYs7Wp+At6E_A@mail.gmail.com>
On Wed, Feb 19, 2025 at 11:15 PM Robert Treat <rob@xzilla.net> wrote:
> On Wed, Feb 19, 2025 at 10:07 PM Bruce Momjian <bruce@momjian.us> wrote:
>> On Sat, Oct 12, 2024 at 07:01:31AM +0200, Laurenz Albe wrote:
>> > On Fri, 2024-10-11 at 15:53 -0700, Paul A Jungwirth wrote:
>> > > Our docs seem to contrast "streaming replication" to logical, but
>> > > these are not really opposites. Sometimes when they say "streaming"
>> > > they mean "physical".
>> > >
>> > > Probably this is historical: at first physical replication was the
>> > > only kind of streaming we had.
>> > >
>> > > Personally this has caused me a lot of confusion. For example,
>> > > recently when I read "Synchronous replication (see Section 26.2.8) is
>> > > only supported on replication slots used over the streaming
>> > > replication interface," I took it to mean synchronous replication only
>> > > worked for physical replication, not logical.
>> >
>> > What you are saying makes a lot of sense, and improving some of this
>> > is a good thing.
>> >
>> > Our current terminology is a mess. There are some places in the documentation
>> > that speak of physical vs. logical replication, while most places use the
>> > term "streaming replication" for physical replication. I myself consequently
>> > speak of "streaming replication" vs. "logical replication", even though both
>> > stream data. The protocol section of the documentation describes the
>> > "streaming replication protocol" and the "logical streaming replication protocol".
>> >
>> > This is confusing, and I am also sometimes confused in the way you described
>> > above.
>> >
>> > I think the mess is too well established to be really cleaned up. But adding
>> > some clarity is a good thing, so +1.
>>
>
> The attached patch expands on Paul's original patch, further consolidating around the terms "streaming physical replication" and "streaming logical replication" in places where it makes sense. I would note that there are places where "streaming replication" makes sense (when it applies to both types), and potentially places where "physical replication" makes sense (when we could be talking about either streaming or WAL shipping), so I don't think we can completely eliminate those terms, but hopefully this improves what we have.
>
>>
>> I don't think our current setup is sustainable so I think it does need
>> to be cleaned up. Also, physical/logical replication slots need
>> help, I think.
>>
>
> I took a look through some of the replication slot stuff and ISTM that it basically gets the streaming logical/physical replication distinctions correct, and I *think* it gets the slot distinctions correct as well, but to the degree there might be some issue there, I think it could be addressed separately.
>
Hey Bruce,
Your recent commit on this topic [1] reminded me of the patch from
earlier this year meant to address some other areas where we are
blurry about using streaming vs physical vs logical replication. I
think (I might possibly still be jet lagged) I have updated the
previous version of that patch against HEAD; it is attached, and I am
bumping it up for review.
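
For anyone skimming the thread, here is a rough sketch of why "streaming" and "logical" are not opposites: both kinds of replication use replication slots and both are served by walsenders. The slot names below are made up for illustration, and the sketch assumes a scratch server with wal_level = logical, spare replication slots, and the contrib test_decoding plugin installed. It is not part of the patch.

```sql
-- Illustrative sketch only; slot names are hypothetical.
-- Assumes wal_level = logical and the test_decoding contrib plugin.

-- A physical slot, the kind a streaming physical replication standby uses.
SELECT pg_create_physical_replication_slot('example_physical_slot');

-- A logical slot, the kind streaming logical replication (or any other
-- logical decoding consumer) uses.
SELECT pg_create_logical_replication_slot('example_logical_slot', 'test_decoding');

-- Both kinds of slot appear in the same view; slot_type tells them apart,
-- and plugin is only set for logical slots.
SELECT slot_name, slot_type, plugin, active
FROM pg_replication_slots
ORDER BY slot_name;

-- Connected standbys and logical subscribers are both served by
-- walsenders, so both are listed here while they are streaming.
SELECT application_name, state, sync_state
FROM pg_stat_replication;

-- Clean up so the unused slots do not retain WAL.
SELECT pg_drop_replication_slot('example_physical_slot');
SELECT pg_drop_replication_slot('example_logical_slot');
```

Since nothing is connected to the two example slots, they will show up as inactive, and pg_stat_replication will only list whatever consumers the server already has.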
[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a5b69e30731fb623715ecf4c8073c0f2d...)
Robert Treat
https://xzilla.net
Attachments:
[application/octet-stream] v3-0001-Clarify-usage-of-the-term-streaming-replication.patch (10.7K, 2-v3-0001-Clarify-usage-of-the-term-streaming-replication.patch)
From b32c6f190c3e5bf066f172b42beb3b689402a4fb Mon Sep 17 00:00:00 2001
From: Robert Treat <rob@xzilla.net>
Date: Mon, 24 Nov 2025 21:09:26 -0500
Subject: [PATCH v3] Clarify usage of the term streaming replication.
Content-Type: text/plain; charset="utf-8"
The documentation uses this term where it is meant to refer specifically to
physical replication rather than logical replication, which sets up a false
dichotomy between logical and streaming replication. Original patch by
Paul A. Jungwirth, with additional changes and updating by me.
---
doc/src/sgml/config.sgml | 29 ++++++++++++++-------------
doc/src/sgml/high-availability.sgml | 12 +++++++++--
doc/src/sgml/logical-replication.sgml | 6 +++---
doc/src/sgml/logicaldecoding.sgml | 6 +++---
4 files changed, 31 insertions(+), 22 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 023b3f03ba9..e9dca34ffee 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2069,7 +2069,7 @@ include_dir 'conf.d'
<para>
Specifies the maximum amount of memory to be used by logical decoding,
before some of the decoded changes are written to local disk. This
- limits the amount of memory used by logical streaming replication
+ limits the amount of memory used by streaming logical replication
connections. It defaults to 64 megabytes (<literal>64MB</literal>).
Since each replication connection only uses a single buffer of this size,
and an installation normally doesn't have many such connections
@@ -3794,7 +3794,7 @@ include_dir 'conf.d'
difference between the two modes, but when set to <literal>always</literal>
the WAL archiver is enabled also during archive recovery or standby
mode. In <literal>always</literal> mode, all files restored from the archive
- or streamed with streaming replication will be archived (again). See
+ or streamed with streaming physical replication will be archived (again). See
<xref linkend="continuous-archiving-in-standby"/> for details.
</para>
<para>
@@ -3900,7 +3900,7 @@ include_dir 'conf.d'
full files. Therefore, it is unwise to use a very short
<varname>archive_timeout</varname> — it will bloat your archive
storage. <varname>archive_timeout</varname> settings of a minute or so are
- usually reasonable. You should consider using streaming replication,
+ usually reasonable. You should consider using streaming physical replication,
instead of archiving, if you want data to be copied off the primary
server more quickly than that.
If this value is specified without units, it is taken as seconds.
@@ -3925,7 +3925,7 @@ include_dir 'conf.d'
<para>
This section describes the settings that apply to recovery in general,
- affecting crash recovery, streaming replication and archive-based
+ affecting crash recovery, streaming physical replication and archive-based
replication.
</para>
@@ -4036,7 +4036,7 @@ include_dir 'conf.d'
<para>
The local shell command to execute to retrieve an archived segment of
the WAL file series. This parameter is required for archive recovery,
- but optional for streaming replication.
+ but optional for streaming physical replication.
Any <literal>%f</literal> in the string is
replaced by the name of the file to retrieve from the archive,
and any <literal>%p</literal> is replaced by the copy destination path name
@@ -4462,15 +4462,16 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
<title>Replication</title>
<para>
- These settings control the behavior of the built-in
- <firstterm>streaming replication</firstterm> feature (see
- <xref linkend="streaming-replication"/>), and the built-in
- <firstterm>logical replication</firstterm> feature (see
+ These settings control the behavior of
+ <firstterm>streaming replication</firstterm>,
+ both <firstterm>physical replication</firstterm>
+ (see <xref linkend="streaming-replication"/>) and
+ <firstterm>logical replication</firstterm> (see
<xref linkend="logical-replication"/>).
</para>
<para>
- For <emphasis>streaming replication</emphasis>, servers will be either a
+ For <emphasis>physical replication</emphasis>, servers will be either a
primary or a standby server. Primaries can send data, while standbys
are always receivers of replicated data. When cascading replication
(see <xref linkend="cascading-replication"/>) is used, standby servers
@@ -4901,7 +4902,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
These settings control the behavior of a
<link linkend="standby-server-operation">standby server</link>
that is
- to receive replication data. Their values on the primary server
+ to receive physical replication data. Their values on the primary server
are irrelevant.
</para>
@@ -5041,7 +5042,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
conflict with about-to-be-applied WAL entries, as described in
<xref linkend="hot-standby-conflict"/>.
<varname>max_standby_streaming_delay</varname> applies when WAL data is
- being received via streaming replication.
+ being received via streaming physical replication.
If this value is specified without units, it is taken as milliseconds.
The default is 30 seconds.
A value of -1 allows the standby to wait forever for conflicting
@@ -5177,7 +5178,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
<listitem>
<para>
Specifies how long the standby server should wait when WAL data is not
- available from any sources (streaming replication,
+ available from any sources (streaming physical replication,
local <filename>pg_wal</filename> or WAL archive) before trying
again to retrieve WAL data.
If this value is specified without units, it is taken as milliseconds.
@@ -5254,7 +5255,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
<filename>pg_wal</filename> directory.
</para>
<para>
- This parameter is intended for use with streaming replication deployments;
+ This parameter is intended for use with streaming physical replication deployments;
however, if the parameter is specified it will be honored in all cases
except crash recovery.
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 81eeadd6c47..33ca3f0286c 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -151,7 +151,7 @@ protocol to make nodes agree on a serializable transactional order.
</para>
<para>
A standby server can be implemented using file-based log shipping
- (<xref linkend="warm-standby"/>) or streaming replication (see
+ (<xref linkend="warm-standby"/>) or streaming physical replication (see
<xref linkend="streaming-replication"/>), or a combination of both. For
information on hot standby, see <xref linkend="hot-standby"/>.
</para>
@@ -628,7 +628,7 @@ protocol to make nodes agree on a serializable transactional order.
In standby mode, the server continuously applies WAL received from the
primary server. The standby server can read WAL from a WAL archive
(see <xref linkend="guc-restore-command"/>) or directly from the primary
- over a TCP connection (streaming replication). The standby server will
+ over a TCP connection (streaming physical replication). The standby server will
also attempt to restore any WAL found in the standby cluster's
<filename>pg_wal</filename> directory. That typically happens after a server
restart, when the standby replays again WAL that was streamed from the
@@ -772,6 +772,14 @@ archive_cleanup_command = 'pg_archivecleanup /path/to/archive "%r"'
generated, without waiting for the WAL file to be filled.
</para>
+ <note>
+ <para>
+ This discussion of streaming replication assumes physical replication.
+ Although you could treat a logical replication subscriber as a warm standby,
+ it would require some differences to what is described here.
+ </para>
+ </note>
+
<para>
Streaming replication is asynchronous by default
(see <xref linkend="synchronous-replication"/>), in which case there is
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index aa013f348d4..b3faaa675ef 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -6,7 +6,7 @@
<para>
Logical replication is a method of replicating data objects and their
changes, based upon their replication identity (usually a primary key). We
- use the term logical in contrast to physical replication, which uses exact
+ use the term logical replication in contrast to physical replication, which uses exact
block addresses and byte-by-byte replication. PostgreSQL supports both
mechanisms concurrently, see <xref linkend="high-availability"/>. Logical
replication allows fine-grained control over both data replication and
@@ -2496,8 +2496,8 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
<title>Monitoring</title>
<para>
- Because logical replication is based on a similar architecture as
- <link linkend="streaming-replication">physical streaming replication</link>,
+ Because streaming logical replication is based on a similar architecture as
+ <link linkend="streaming-replication">streaming physical replication</link>,
the monitoring on a publication node is similar to monitoring of a
physical replication primary
(see <xref linkend="streaming-replication-monitoring"/>).
diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml
index d5a5e22fe2c..902f8ce9702 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -275,9 +275,9 @@ postgres=# SELECT * from pg_logical_slot_get_changes('regression_slot', NULL, NU
</para>
<note>
- <para><productname>PostgreSQL</productname> also has streaming replication slots
- (see <xref linkend="streaming-replication"/>), but they are used somewhat
- differently there.
+ <para><productname>PostgreSQL</productname> can also use streaming replication slots
+ to maintain a standby server (see <xref linkend="streaming-replication"/>), but
+ typically those use physical replication, not logical.
</para>
</note>
--
2.24.3 (Apple Git-128)