Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1o1Gg5-0002bS-8c for pgsql-hackers@arkaria.postgresql.org; Wed, 15 Jun 2022 00:14:33 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1o1Gg4-0003uN-5N for pgsql-hackers@arkaria.postgresql.org; Wed, 15 Jun 2022 00:14:32 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1o1Gg3-0003uE-PN for pgsql-hackers@lists.postgresql.org; Wed, 15 Jun 2022 00:14:31 +0000 Received: from mail-oi1-x232.google.com ([2607:f8b0:4864:20::232]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1o1Gg0-0002iG-GO for pgsql-hackers@lists.postgresql.org; Wed, 15 Jun 2022 00:14:31 +0000 Received: by mail-oi1-x232.google.com with SMTP id q11so13569178oih.10 for ; Tue, 14 Jun 2022 17:14:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=pQF6fJp58Iqm4QvP91J65LqtJ09dxQAEL+oLdA0WuWQ=; b=kMFDCRCgtk8rRlgiFxiPzf7XFlmbfwfMqWBkBES9I7lQI2XJALJHAlVWpNubWmmXuK +C9kD4bazALrOvPGyqwhq85SIeua1e2/qLIBN7ksR1ej/dJpdwSQoD9aOnNdxwt0mfbp nmctG02OY//QntT5VWBkZYCcBTMSfkWZQEfvstvHfx+mfcu0WxdDxpRt8kTMSZW/iKh9 dpTPW0fHkKxmjqtUM38OjwM5VJwrOeD+m3KHwCYVFGK9qec/aDmeBVJePyqR79z/DklH 7ItJkvuB+IqWbSzm2FsHrTZ644XC4KstY/KxECzP+GOvQrR1N4/81MqTxf1V5YoNjkdL 7qYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=pQF6fJp58Iqm4QvP91J65LqtJ09dxQAEL+oLdA0WuWQ=; b=rOzy7fLugH+kf+2DamIsszG/8UAIB1/t37nAMrK70oQnAtsW18ulHtosMjjeGvgDvr dhrsSy5qmv+UWO/V8MHzIMfvAoUOjI6S0TNWnQATjo8+OwDCJjATVmMSpy5e2rSzYpUW uJeXyLkg0GwWas1YIiF1ve8R9WfqVN6XSQITFrecRliUKyqtDi0Z48SVyzKiZdOveWVD w0cX/8Q3+nU1kYxU1VtMtLPuHKbX7zIX6QRYOUmgoXCK/S3fZIwjrYbN0yJ348QjjnAB UZ4uJZeaM48ZOLCi2PV5cqtp9Bx/tovme+uqFQt5xTP8iFVG5+TrN9NhP+vC6Zp9vRxT UYxw== X-Gm-Message-State: AOAM5313ZVJ7E3EQsrfuNN4MtscoN6OqUZhKeG94JHYIg/q4p+yVeiYi GX5o/M0wcSw5UeMCnBUIyokgxPTlqg53eTy38F+M7Mu29/AS0A== X-Google-Smtp-Source: ABdhPJxRN6Xk+pCxgImZO7PeD4dq5en160BeTgsE9sx74wrS11AQhk14TJRqw4Kuw8Xk+IbK+QQzBwSDtjR8ILax0PM= X-Received: by 2002:a05:6808:2d6:b0:32f:7da:60b9 with SMTP id a22-20020a05680802d600b0032f07da60b9mr3556524oid.249.1655252065793; Tue, 14 Jun 2022 17:14:25 -0700 (PDT) MIME-Version: 1.0 References: <202203162206.7spggyktx63e@alvherre.pgsql> In-Reply-To: From: Zheng Li Date: Tue, 14 Jun 2022 20:14:14 -0400 Message-ID: Subject: Re: Support logical replication of DDLs To: "houzj.fnst@fujitsu.com" Cc: Masahiko Sawada , Amit Kapila , Japin Li , Alvaro Herrera , Dilip Kumar , rajesh singarapu , PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk > Thanks for providing this idea. > > I looked at the string that is used for replication: > > """ > {ALTERTABLESTMT :relation {RANGEVAR :schemaname public :relname foo > :inh true :relpersistence p :alias <> :location 12} :cmds ({ALTERTABLECMD > :subtype 0 :name <> :num 0 :newowner <> :def {COLUMNDEF :colname b > :typeName {TYPENAME :names ("public" "timestamptz") :typeOid 0 :setof > false :pct_type false :typmods <> :typemod -1 :arrayBounds <> :location > 29} :compression <> :inhcount 0 :is_local true :is_not_null false > :is_from_type false :storage <> :raw_default <> :cooked_default <> > :identity <> :identitySequence <> :generated <> :collClause <> :collOid 0 > :constraints <> :fdwoptions <> :location 27} :behavior 0 :missing_ok > false}) :objtype 41 :missing_ok false} > """ > > I think the converted parsetree string includes lots of internal > objects(e.g typeOid/pct_type/objtype/collOid/location/...). These are > unnecessary stuff for replication and we cannot make sure all the internal > stuff are consistent among pub/sub. So I am not sure whether replicating > this string is better. > > Besides, replicating the string from nodetostring() means we would need to > deal with the structure difference between the publisher and the > subscriber if any related structure has been changed which seems not good. Yeah, this existing format is not designed to be portable between different major versions. So it can't directly be used for replication without serious modification. > IMO, The advantages of the deparsing approach(as implemented in the POC > patch set[1]) are: > > 1) We can generate a command representation that can be > parsed/processed/transformed arbitrarily by the subscriber using generic > rules it(for example: user can easily replace the schema name in it) while > the results of nodetostring() seems not a standard json string, so I am > not sure can user reuse it without traversing the parsetree again. > > 2) With event_trigger + deparser, we can filter the unpublished objects > easier. For example: "DROP TABLE table_pub, table_unpub;". We can deparse > it into two commands "DROP TABLE table_pub" and "DROP TABLE table_pub" and > only publish the first one. > > 3) With deparser, we are able to query the catalog in the deparser to > build a complete command(filled with schemaname...) which user don't need > to do any other work for it. We don't need to force the subscriber to set > the same search_path as the publisher which give user more flexibility. > > 4) For CREATE TABLE AS, we can separate out the CREATE TABLE part with the > help of deparser and event trigger. This can avoid executing the subquery > on subcriber. > > 5) For ALTER TABLE command. We might want to filter out the DDL which use > volatile function as discussed in [2]. We can achieve this easier by > extending the deparser to check the functions used. We can even rebuild a > command without unsupported functions to replicate by using deparser. > > There may be more cases I am missing as we are still analyzing other DDLs. How does the deparser deparses CREATE FUNCTION STATEMENT? Will it schema qualify objects inside the function definition? While I agree that the deparser is needed to handle the potential syntax differences between the pub/sub, I think it's only relevant for the use cases where only a subset of tables in the database are replicated. For other use cases where all tables, functions and other objects need to be replicated, (for example, creating a logical replica for major version upgrade) there won't be any syntax difference to handle and the schemas are supposed to match exactly between the pub/sub. In other words the user seeks to create an identical replica of the source database and the DDLs should be replicated as is in this case. So I think it's an overkill to use deparser for such use cases. It also costs more space and time using deparsing. For example, the following simple ALTER TABLE command incurs 11 times more space in the WAL record if we were to use the format from the deparser, there will also be time and CPU overhead from the deparser. ALTER TABLE t1 ADD c INT; serach_path: "$user", public VS {\"fmt\": \"ALTER TABLE %{identity}D %{subcmds:, }s\", \"subcmds\": [{\"fmt\": \"ADD COLUMN %{definition}s\", \"type\": \"add column\", \"definiti on\": {\"fmt\": \"%{name}I %{coltype}T %{default}s %{not_null}s %{collation}s\", \"name\": \"c\", \"type\": \"column\", \"coltype\": {\"typmod\": \"\", \"typarray \": false, \"typename\": \"int4\", \"schemaname\": \"pg_catalog\"}, \"default\": {\"fmt\": \"DEFAULT %{default}s\", \"present\": false}, \"not_null\": \"\", \"col lation\": {\"fmt\": \"COLLATE %{name}D\", \"present\": false}}}], \"identity\": {\"objname\": \"t1\", \"schemaname\": \"public\"}} So I think it's better to define DDL replication levels [1] to tailor for the two different use cases. We can use different logging format based on the DDL replication level. For example, we can simply log the DDL query string and the search_path for database level DDL replication. But for table level DDL replication we need to use the deparser format in order to handle the potential syntax differences and schema mapping requests. [1] https://www.postgresql.org/message-id/CAAD30U%2BpVmfKwUKy8cbZOnUXyguJ-uBNejwD75Kyo%3DOjdQGJ9g%40mail.gmail.com Thoughts? With Regards, Zheng