public inbox for pgsql-general@postgresql.org  
help / color / mirror / Atom feed
From: Masahiko Sawada <sawada.mshk@gmail.com>
To: Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
Cc: shveta malik <shveta.malik@gmail.com>
Cc: Amit Kapila <amit.kapila16@gmail.com>
Cc: Michael Paquier <michael@paquier.xyz>
Cc: Wei Wang (Fujitsu) <wangw.fnst@fujitsu.com>
Cc: Yu Shi (Fujitsu) <shiy.fnst@fujitsu.com>
Cc: vignesh C <vignesh21@gmail.com>
Cc: Ajin Cherian <itsajin@gmail.com>
Cc: Runqi Tian <runqidev@gmail.com>
Cc: Peter Smith <smithpb2250@gmail.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>
Cc: li jie <ggysxcq@gmail.com>
Cc: Dilip Kumar <dilipbalaut@gmail.com>
Cc: Alvaro Herrera <alvherre@alvh.no-ip.org>
Cc: Japin Li <japinli@hotmail.com>
Cc: rajesh singarapu <rajesh.rs0541@gmail.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Cc: Zheng Li <zhengli10@gmail.com>
Subject: Re: Support logical replication of DDLs
Date: Tue, 18 Jul 2023 14:28:08 +0900
Message-ID: <CAD21AoCXCAQ5QyXu9-xs30ViUHtUxQMmf-818d8GX--5pTmZ7g@mail.gmail.com> (raw)
In-Reply-To: <OS0PR01MB5716A47D23EFAF988475D4A99431A@OS0PR01MB5716.jpnprd01.prod.outlook.com>
References: <OSZPR01MB63102C42A24D59FACF6D9CD6FD4A9@OSZPR01MB6310.jpnprd01.prod.outlook.com>
	<CAJpy0uCbwqWj+p_yj1AHyiufzAUv_H29qOaztAXxFoTqZ9WcAw@mail.gmail.com>
	<OSZPR01MB6310E0509F229AF2A9414BB1FD499@OSZPR01MB6310.jpnprd01.prod.outlook.com>
	<CAJpy0uBwCZCniPR6vh26L+wpSf4xzUN8omUa9DzF-x1CAud_xA@mail.gmail.com>
	<CAA4eK1LOr+2O+_pWKTaa0y9vbW6whfm-8-fuBvnS6OBiaR+7TA@mail.gmail.com>
	<CAA4eK1LF3EaCSj5iqO0oT1k3ew7YnQbbKEgbzORDAdvdtd+r7w@mail.gmail.com>
	<CAJpy0uDtrxPo127LH3FP-TffynrspPFqhhC7o_GFOMP+2mPtWQ@mail.gmail.com>
	<OS3PR01MB6275328379FBE5734B4585619E54A@OS3PR01MB6275.jpnprd01.prod.outlook.com>
	<ZIgf3qBvFoTsMhIc@paquier.xyz>
	<CAA4eK1J8WOK9VZ9RpQRNRhRYhFOKQCu=GLCu+iBOePNO3JwLbQ@mail.gmail.com>
	<ZIjmGhPWdoJUfjjE@paquier.xyz>
	<CAJpy0uDaubBHyqPc1k0OysuBYDOVdoUgTWG4jXDCYj-OVSU8hg@mail.gmail.com>
	<CAA4eK1+K8KMsB=+jJO6wDUSt7wF1RiXKtF-HN48nCOEOv-J-3Q@mail.gmail.com>
	<CAJpy0uDLLBYAOzCePYObZ51k1epBU0hef4vbfcujKJprJVsEcQ@mail.gmail.com>
	<CAJpy0uAhLjQZ0Dh0KWDFP8mrnG0rbx99_heavwn8Ke8ZuD-Umg@mail.gmail.com>
	<OS0PR01MB5716A47D23EFAF988475D4A99431A@OS0PR01MB5716.jpnprd01.prod.outlook.com>

On Tue, Jul 11, 2023 at 8:01 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> Hi,
>
> We have been researching how to create a test that detects failures resulting
> from future syntax changes, where the deparser fails to update properly.
>
> The basic idea comes from what Robert Haas suggested in [1]: when running the
> regression test(tests in parallel_schedule), we replace the executing ddl
> statement with the its deparsed version and execute the deparsed statement, so
> that we can run all the regression with the deparsed statement and can expect
> the output to be the same as the existing expected/*.out. As developers
> typically add new regression tests to test new syntax, so we expect this test
> can automatically identify most of the new syntax changes.
>
> One thing to note is that when entering the event trigger(where the deparsing
> happen), the ddl have already been executed. So we need to get the deparsed
> statement before the execution and replace the current statement with it. To
> achieve this, the current approach is to create another database with deparser
> trigger and in the ProcessUtility hook(e.g. tdeparser_ProcessUtility in the
> patch) we redirect the local statement to that remote database, then the
> statment will be deparsed and stored somewhere, we can query the remote
> database to get the deparsed statement and use it to replace the original
> statment.
>
> The process looks like:
> 1) create another database and deparser event trigger in it before running the regression test.
> 2) run the regression test (the statement will be redirect to the remote database and get the deparsed statement)
> 3) compare the output file with the expected ones.
>
> Attach the POC patch(0004) for this approach. Note that, the new test module only
> test test_setup.sql, create_index.sql, create_table.sql and alter_table.sql as
> we only support deparsing TABLE related command for now. And the current test
> cannot pass because of some bugs in deparser, we will fix these bugs soon.
>
>
> TO IMPROVE:
>
> 1. The current approach needs to handle the ERRORs, WARNINGs, and NOTICEs from
> the remote database. Currently it will directly rethrow any ERRORs encountered
> in the remote database. However, for WARNINGs and NOTICEs, we will only rethrow
> them along with the ERRORs. This is done to prevent duplicate messages in the
> output file during local statement execution, which would be inconsistent with
> the existing expected output. Note that this approach may potentially miss some
> bugs, as there could be additional WARNINGs or NOTICEs caused by the deparser
> in the remote database.
>
> 2. The variable reference and assignment (xxx /gset and select :var_name) will
> not be sent to the server(only the qualified value will be sent), but it's
> possible the variable in another session should be set to a different value, so
> this can cause inconsistent output.
>
> 3 .CREATE INDEX CONCURRENTLY will create an invalid index internally even if it
> reports an ERROR later. But since we will directly rethrow the remote ERROR in
> the main database, we won't execute the "CREATE INDEX CONCURRENTLY" in the main
> database. This means that we cannot see the invalid index in the main database.
>
> To improve the above points, another variety is: run the regression test twice.
> The first run is solely intended to collect all the deparsed statements. We can
> dump these statements from the database and then reload them in the second
> regression run. This allows us to utilize the deparsed statements to replace
> the local statements in the second regression run. This approach does not need
> to handle any remote messages and client variable stuff during execution,
> although it could take more time to finsh the test.
>

I've considered some alternative approaches but I prefer the second
approach. A long test time could not be a big problem unless we run it
by default. We can prepare a buildfarm animal that is configured to
run the DDL deparse tests.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com





reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: pgsql-general@postgresql.org
  Cc: sawada.mshk@gmail.com, houzj.fnst@fujitsu.com, shveta.malik@gmail.com, amit.kapila16@gmail.com, michael@paquier.xyz, wangw.fnst@fujitsu.com, shiy.fnst@fujitsu.com, vignesh21@gmail.com, itsajin@gmail.com, runqidev@gmail.com, smithpb2250@gmail.com, tgl@sss.pgh.pa.us, ggysxcq@gmail.com, dilipbalaut@gmail.com, alvherre@alvh.no-ip.org, japinli@hotmail.com, rajesh.rs0541@gmail.com, pgsql-hackers@lists.postgresql.org, zhengli10@gmail.com
  Subject: Re: Support logical replication of DDLs
  In-Reply-To: <CAD21AoCXCAQ5QyXu9-xs30ViUHtUxQMmf-818d8GX--5pTmZ7g@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox