Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wHbzl-007Lo8-1f for pgsql-hackers@arkaria.postgresql.org; Tue, 28 Apr 2026 06:32:34 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wHbzk-00GSvu-1U for pgsql-hackers@arkaria.postgresql.org; Tue, 28 Apr 2026 06:32:32 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wHbzk-00GSvm-0R for pgsql-hackers@lists.postgresql.org; Tue, 28 Apr 2026 06:32:32 +0000 Received: from mail-lj1-x229.google.com ([2a00:1450:4864:20::229]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wHbzh-00000003TrB-40s4 for pgsql-hackers@lists.postgresql.org; Tue, 28 Apr 2026 06:32:31 +0000 Received: by mail-lj1-x229.google.com with SMTP id 38308e7fff4ca-38e7d984096so121507241fa.2 for ; Mon, 27 Apr 2026 23:32:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1777357949; cv=none; d=google.com; s=arc-20240605; b=TVT91QbFvKTXfzehBD76Jzrx7w3KO72J23dKr05u0RJxpcSYObENnMUntc3gglqRmP +DK9bu1JWf6BrHeIprLTASmIfoNfgV73jGBN+gRnTOD9cjXFzyu2mUIKrQdMyugTXl5x OF7NbTaAgkJD0GvzIAZpy7+Q5f7df5EpIhh+CoW+dXmFuBFjnImCnXKHf/rm362NZwXy /Sk2w1V3VzM5ksBncHHtqrWvnU4JFtJBkGuGq7lL6qJ31WG7Ckl+tOMc6AQopG8khmZh FN3K2OnSSDYWd0H/mUqjT1idvDLz/TulnqGEDqtrfBgfqerdlU6iEui4DKC6zxnerZEc fntQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=7PpgAS5tb2brLYOYEUzLcsBf9mi5p5QwCP0q9vC4064=; fh=CV2QEp0EDcQTZJQ6dFAupgquvTFtkIsEnlCxoNQHadA=; b=QtVZluTC3f4akayfBAlZsu+TjU3lqv3IRLblDm+hv3kE3aoC/tPpOCmguBmQIFSfmc uVcJG0b1SwTUU1YEnB2Gnu/E7d9A4WUsf5/4FjNeHjjZtE/rwIUVH0XOB8Z8aJtnviGk g6skcOBZEjxDrUC5hXzMcitTqQpJv8kfCDSaLMMbINl7mbnJG7f03I4rx/P9NT5VlIYX 6nJ0Jt+yExw/qbZK4LJIPHaiC2puqaAtKxzbjQhQMbNZsOM9jza1gp1rEld01aPbriPW ZaCCOVteOKy0ap4fhMVIiPwP7a3sYKr94eNdXjBx/rBRrbIWok7s18eLM37CNcMfJc8i hqQg==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1777357949; x=1777962749; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=7PpgAS5tb2brLYOYEUzLcsBf9mi5p5QwCP0q9vC4064=; b=LwnzorOk3vaBlCFtxW6I1cv9cDW1IvTIwPBpKYTtm2qKOqpYc1AQ6kWcdOkvqtDziK iLB0vAUBlhJR6DGF8RmRCBI5WBYsZ2UaO9jS5DSfnQXpUQpfG1W+FiQa3KxnSKDKX0lP MTNskcNNXL5g1Jnl4hWSyvI319rDCxx72XUe9v8a360sV1SZXuTQcNZPg94r2mANMrqz uo85syq2syCOgSKYQfwwyankWVPvW9p4tJrsDQxEUopuCvrZ/wo9luKy8NYI6+nn17iV XLuNpejynR4jL8Ut8KSORaaFa3pjfy0RXUH7pz+QEp2VYdwhLwwfIqXhXsEIjmr3Y33t ySSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777357949; x=1777962749; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=7PpgAS5tb2brLYOYEUzLcsBf9mi5p5QwCP0q9vC4064=; b=aj+zDuWJzstiJ6+pQGnO6tX1MSOPuMw1HR0MvsCud/jr3RMTUtUEm/APMNracSJk23 hZRgn2sV3ESrqT19fTPBtimv5GkCxlM9NBHKnPxFA024wN65llsQfWj3YmfV1OgjnXhM CMVfZuewyIybNO9UhSVLMVw8s6ePHjMtutwfEAMBkcwhN8wa9LPUoCapJnowQxa9WqV1 uhO3MBSE67zlr9Z0raj+H8bJUmSqqgqw6JYFqZ96ZGHq61SPovP8lQRrcAaTjTg8oDaB 2uLpXujJuo1LwBA/1+4qXjNkU6UWMGzedwlqzln5+mtZjlPGyyGR+1Z7wEviAQpcYcg3 6A6w== X-Forwarded-Encrypted: i=1; AFNElJ9sJnJewqnChwFxEcyRlBYx4t+i6nnWrxFfvDUL1k+JO9rRIaqxKuYtVZ6435JvpWseA8uLzl+rwztI2phZ@lists.postgresql.org X-Gm-Message-State: AOJu0YyEjPdtIllcd+fpsmMkvLXwAsKCOjgYI8KyJMe+6kVts4S4UgRN H3X9YKB0k92ZDM6KUSbv466EYN4yAkDIuzZKAFXmr3wNRBDI1/ePOtmsSBtzbeXIpMJblrBii8n iRZADUxQTJndFhF8dgEDEuiBo1blXom8= X-Gm-Gg: AeBDiesEPxAhWZJqRiO06qyNPs++4QHWt9U+B5k7ThgsSA6jg8iX7P2LPlAvQWmXAbd 2YjyvUnMdfjflmcVLzwUTODg5zIQnT1+rlenvuI55uQTs7AEdvDA303Et7YRuA5g+5IX+dRB//v YKC69loq3oLKwD6KcD+/XAmU8AF9pUnKpXKNrcrVcW0qTcePoWFD8QFdWQrMMoXzAvROutIkT2e QcnHgTvrgHHGVxxsWYRrnfOAVIH1vB7u7u4p70MtyuI0dpy5UiK+Y8OGDxuKpt60UFs3OC03oZk DFWQh0jHzbhZLiYtQgH5WRMM2Qd/HgiNUGUaUVLwR/Ztkj2TzeA6y6SxpecwsHKs2eI7ISZs9sp heea3Aic= X-Received: by 2002:a2e:8a94:0:b0:38f:20cc:2bb9 with SMTP id 38308e7fff4ca-39240cc9b49mr3822811fa.5.1777357948423; Mon, 27 Apr 2026 23:32:28 -0700 (PDT) MIME-Version: 1.0 References: <080f9394-c127-4cef-865e-10f2f997125c@postgrespro.ru> <38690b0e-f91b-46fa-b72a-57775612e463@postgrespro.ru> In-Reply-To: From: Amit Kapila Date: Tue, 28 Apr 2026 12:02:17 +0530 X-Gm-Features: AVHnY4LucH_qOhILsUAeTLwdv2lGzv2A-r1eRxVVwl40AnhUzLArNL66rrB3hds Message-ID: Subject: Re: Support logical replication of DDLs, take2 To: Masahiko Sawada Cc: Vitaly Davydov , Ashutosh Bapat , PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Apr 21, 2026 at 4:45=E2=80=AFAM Masahiko Sawada wrote: > > On Mon, Feb 23, 2026 at 5:21=E2=80=AFPM Masahiko Sawada wrote: > > > > > > One idea I'm experimenting with is that we define an abstract data > > type that can represent a DDL (like CollectedCommand) and write it to > > a new WAL record so that logical decoding processes it. For CREATE > > DDLs, we can use pg_get_xxx_def() function while using a historical > > snapshot to get the DDLs. We would need to implement the codes to > > generate DROP and ALTER DDLs from the data. I believe DROP DDLs would > > not be hard. For ALTER DDLs, we would incur the initial implementation > > costs, but we would not change these codes often. > > > > DDL support for logical replication is one of the biggest missing > pieces in logical replication. I'd like to resume this work for PG20. > > We made a lot of effort on this feature through 2022 and 2023, but the > development is currently inactive. The last patch was submitted on Jul > 18, 2023. I've reviewed the previous patches and discussions, and I > would like to summarize how DDL replication was implemented, the main > reasons it stalled, and propose an alternative design to address those > problems. > > The overall idea of the previous patch set was to implement DDL > deparsing and utilize it for DDL replication. It converted a parse > tree into a JSON string. For instance, if a user executes "DROP TABLE > t1", the deparser generates from its parse tree: > > {DROPSTMT :objects (("t1")) :removeType 41 :behavior 0 :missing_ok > false :concurrent false} > > to: > > {"fmt": "DROP TABLE %{objidentity}s", "objidentity": "public.t1"} > > This JSON string is self-documenting, meaning someone who gets it can > easily reconstruct the original DDL with schema-qualified object > names. In a dedicated event trigger for logical replication, we > deparsed the parse tree of a DDL, wrote it into a WAL record, and then > the logical decoding processed it similarly to DML changes. > > While there are several benefits to the JSON data approach mentioned > in the wiki [1] -- most notably the flexibility to easily remap > schemas (e.g., mapping "schema A" on the publisher to "schema B" on > the subscriber) -- there was a major concern: the huge maintenance > burden. > Yes, there will be a maintenance cost of JSON-based deparsing approach. But note that multiple senior people (Alvaro Herrera, Robert Haas) [1] seems to favor that approach. So, I am not sure we can conclude to abandon that approach without those people or some other senior people agreeing to abandon it. To be clear, I am not against considering a new/different approach for DDL replication but just that it is not clear that old/existing approach can be ruled out without more discussion on it, We would need to maintain the JSON serialization code whenever > creating or modifying parse nodes, regardless of whether the changes > were related to DDL replication. IIUC, this was the primary reason the > feature didn't cross the finish line. > > Additionally, I think there is another design issue: it is not > output-plugin agnostic. Since the deparsed DDL was written by a > logical-replication-specific event trigger, third-party logical > decoding plugins cannot easily detect DDL events. > Why RmgrId like RM_LOGICALDDLMSG_ID and XLOG_LOGICAL_DDL_MESSAGE wal info is not sufficient for this? Decoder will add a message like REORDER_BUFFER_CHANGE_DDL which can be used to detect DDL message, no? > Ideally, we should > write DDL information into a WAL record natively when > wal_level=3D'logical' (or additionally when a GUC enables DDL events > WAL-logging) so that all decoding plugins can detect them. This also > allows us to test DDL logical decoding with test_decoding without > setting up a full logical replication subscription. > > To address these two points, I'd like to propose an alternative > approach: we introduce a new data type, say DDLCommand, that is > self-contained to represent a DDL (like CollectedCommand), and don't > rely on event triggers. It would have the command type (and subtype if > required), the OIDs of the target object and its namespace, and the > OID of the user who executed the DDL. We write it to a new WAL record > at appropriate places during DDL execution, and the logical decoding > layer passes the data to output plugins. That way, any logical > decoding plugin can detect DDL changes, and it's up to the plugins how > to decode the DDL information. > > In pgoutput, for CREATE DDLs, we can use the pg_get_xxx_ddl() > functions while using a historical snapshot to get the DDLs, saving > maintenance costs. We would still need to implement the code to > generate DROP and ALTER DDLs from the data. I believe DROP DDLs would > not be hard. For ALTER DDLs, we would incur an initial implementation > cost, but we would not need to change this code often. We can > implement the DDL generation code in a way that improves ddlutils.c. > > Also, because DDLCommand is separated from parse nodes, we only need > to change the DDL deparse/replication code when it is actually needed. > Additionally, this approach would eliminate the code around the > two-step process (using DCT_TableDropStart and DCT_TableDropEnd) for > DROP TABLE. While it would miss the flexibility benefits that the JSON > deparsing approach has, I guess it would not be very hard to implement > the mapping in the deparse layer even without the JSON data. > Possible but the point was flexibility and ease with which users can implement mapping with JSON approach. > > FYI I've experimented with auto-generation approaches too. For > instance, gen_node_support.pl generates C code that converts parse > nodes to the corresponding text representations. Or > gen_node_support.pl generates C code that makes all objects in the > given SQL query text fully-schema qualified. While these ideas are > promising they didn't help reduce the maintenance burden much as the > parse node definitions are already complex and vary on nodes much. > Yeah, this is my recollection of a previous attempt for auto_generating the deparsing code. [1]: https://www.postgresql.org/message-id/CA%2BTgmoauXRQ3yDZNGTzXv_m1kdUnH= 1Ww%2BhwKmKUSjtyBh0Em2Q%40mail.gmail.com --=20 With Regards, Amit Kapila.