Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qLdGX-000Itx-SH for pgsql-hackers@arkaria.postgresql.org; Tue, 18 Jul 2023 05:28:53 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1qLdGV-004ARO-0h for pgsql-hackers@arkaria.postgresql.org; Tue, 18 Jul 2023 05:28:51 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qLdGU-004ARF-Jr for pgsql-hackers@lists.postgresql.org; Tue, 18 Jul 2023 05:28:50 +0000 Received: from mail-lj1-x232.google.com ([2a00:1450:4864:20::232]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1qLdGR-00040E-Um for pgsql-hackers@lists.postgresql.org; Tue, 18 Jul 2023 05:28:49 +0000 Received: by mail-lj1-x232.google.com with SMTP id 38308e7fff4ca-2b70357ca12so16782971fa.1 for ; Mon, 17 Jul 2023 22:28:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1689658125; x=1690262925; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=wJwUMjQQlWpITaEeDWnPvXD3xMkgoSKK5s6Zy2hdw+8=; b=i15ti8jbYNafK/042SQmFu+2JfGlnHtINMRduuJxZzjo5QdhwAyQshBCO+5lxDQh8/ YJgRIF+m4m2sEpVB7ZR4W4YxpNbcDC24hmTXniGUGBo4DZoyURNSjfW1ZZNU9KjHZ4hO CFoqByB0TQiaRmf61DZijLgpAm+DnSjArc0V8eiQZ9bvK90RbCKjIcLKYZT0JJKUrYZ5 2Rg8MH9CkJB12r/+PFKO2cW/Q8fq8TA8ftuVG2c7IPRPq9yZ5colhrMGdPdo+tXBg85t gk+f8lc4Sl6CO0ECKPO0YNiPnFeJ+3dQ70v21+vB/8VxLb6JUr/GXTUqnND504YG0v/9 sI9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689658125; x=1690262925; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wJwUMjQQlWpITaEeDWnPvXD3xMkgoSKK5s6Zy2hdw+8=; b=DEhhVwmck7qzFKoa0TTeUruwaurtTakPmxk3/Kd1jf3pdrI5yrNZP/20A9rkBnZaHE 07kUU0oOlLGEnCbuyEH3u087WlJINUcVII0scrLoUUbOwNun8I3w0zYa2DeYb0Gq0y3a kdywmoNL61avkVzlwkMS6o+pNGttmXCn6G0LBIRb05Nvi/ymxdY2PDIami6/0xdqJioB UGOCpIkTYqSEFrhAgVvAjp7VMHxxT3Zl/nUGNIjpGhmPMDewk0+a2lBVu1NWbqstwCFq f7Y0IseNQuCHZ878WU7bYvbm/IseNtyiDsJEDS9Lja4/JddncFsFS58EhMMxRvjN0Mnn rw+A== X-Gm-Message-State: ABy/qLYaFTBNfPOpqtWBfIGQex6BwejryBkeAlK54wDWA5sI8J/7Tu4t tzfHpsOc0wzdnYztA5N2Ncz7vRx0tnerbSc4T6M= X-Google-Smtp-Source: APBJJlGISOft3IhaRSTTbWFl5EW/djZdrjiYJ5PiR3Dy10jc+hCY/mKHKQhy55cFhMDEbpJDznX2qN3HMjCo1AGj3CA= X-Received: by 2002:a05:651c:b1e:b0:2b6:7c3a:6adc with SMTP id b30-20020a05651c0b1e00b002b67c3a6adcmr6325639ljr.5.1689658125285; Mon, 17 Jul 2023 22:28:45 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Masahiko Sawada Date: Tue, 18 Jul 2023 14:28:08 +0900 Message-ID: Subject: Re: Support logical replication of DDLs To: "Zhijie Hou (Fujitsu)" Cc: shveta malik , Amit Kapila , Michael Paquier , "Wei Wang (Fujitsu)" , "Yu Shi (Fujitsu)" , vignesh C , Ajin Cherian , Runqi Tian , Peter Smith , Tom Lane , li jie , Dilip Kumar , Alvaro Herrera , Japin Li , rajesh singarapu , PostgreSQL Hackers , Zheng Li Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Jul 11, 2023 at 8:01=E2=80=AFPM Zhijie Hou (Fujitsu) wrote: > > Hi, > > We have been researching how to create a test that detects failures resul= ting > from future syntax changes, where the deparser fails to update properly. > > The basic idea comes from what Robert Haas suggested in [1]: when running= the > regression test(tests in parallel_schedule), we replace the executing ddl > statement with the its deparsed version and execute the deparsed statemen= t, so > that we can run all the regression with the deparsed statement and can ex= pect > the output to be the same as the existing expected/*.out. As developers > typically add new regression tests to test new syntax, so we expect this = test > can automatically identify most of the new syntax changes. > > One thing to note is that when entering the event trigger(where the depar= sing > happen), the ddl have already been executed. So we need to get the depars= ed > statement before the execution and replace the current statement with it.= To > achieve this, the current approach is to create another database with dep= arser > trigger and in the ProcessUtility hook(e.g. tdeparser_ProcessUtility in t= he > patch) we redirect the local statement to that remote database, then the > statment will be deparsed and stored somewhere, we can query the remote > database to get the deparsed statement and use it to replace the original > statment. > > The process looks like: > 1) create another database and deparser event trigger in it before runnin= g the regression test. > 2) run the regression test (the statement will be redirect to the remote = database and get the deparsed statement) > 3) compare the output file with the expected ones. > > Attach the POC patch(0004) for this approach. Note that, the new test mod= ule only > test test_setup.sql, create_index.sql, create_table.sql and alter_table.s= ql as > we only support deparsing TABLE related command for now. And the current = test > cannot pass because of some bugs in deparser, we will fix these bugs soon= . > > > TO IMPROVE: > > 1. The current approach needs to handle the ERRORs, WARNINGs, and NOTICEs= from > the remote database. Currently it will directly rethrow any ERRORs encoun= tered > in the remote database. However, for WARNINGs and NOTICEs, we will only r= ethrow > them along with the ERRORs. This is done to prevent duplicate messages in= the > output file during local statement execution, which would be inconsistent= with > the existing expected output. Note that this approach may potentially mis= s some > bugs, as there could be additional WARNINGs or NOTICEs caused by the depa= rser > in the remote database. > > 2. The variable reference and assignment (xxx /gset and select :var_name)= will > not be sent to the server(only the qualified value will be sent), but it'= s > possible the variable in another session should be set to a different val= ue, so > this can cause inconsistent output. > > 3 .CREATE INDEX CONCURRENTLY will create an invalid index internally even= if it > reports an ERROR later. But since we will directly rethrow the remote ERR= OR in > the main database, we won't execute the "CREATE INDEX CONCURRENTLY" in th= e main > database. This means that we cannot see the invalid index in the main dat= abase. > > To improve the above points, another variety is: run the regression test = twice. > The first run is solely intended to collect all the deparsed statements. = We can > dump these statements from the database and then reload them in the secon= d > regression run. This allows us to utilize the deparsed statements to repl= ace > the local statements in the second regression run. This approach does not= need > to handle any remote messages and client variable stuff during execution, > although it could take more time to finsh the test. > I've considered some alternative approaches but I prefer the second approach. A long test time could not be a big problem unless we run it by default. We can prepare a buildfarm animal that is configured to run the DDL deparse tests. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com