Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pggUg-00078B-Ib for pgsql-hackers@arkaria.postgresql.org; Mon, 27 Mar 2023 06:38:14 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1pggUf-0001PT-Dm for pgsql-hackers@arkaria.postgresql.org; Mon, 27 Mar 2023 06:38:13 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pggUf-0001PJ-1f for pgsql-hackers@lists.postgresql.org; Mon, 27 Mar 2023 06:38:13 +0000 Received: from mail-oi1-x22c.google.com ([2607:f8b0:4864:20::22c]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1pggUb-0007JU-Po for pgsql-hackers@lists.postgresql.org; Mon, 27 Mar 2023 06:38:12 +0000 Received: by mail-oi1-x22c.google.com with SMTP id bk5so5531580oib.6 for ; Sun, 26 Mar 2023 23:38:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; t=1679899087; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=dtXDrONNoUH1LYaw28uYWKL+C8oRgSppxzEm6SyxfKA=; b=PCtA0EMzheiMpSwu9svf5snPrn2mDsZy420wegyZnTo0Wf103JPCq7poemb4uAJlX4 iPOrguVG5R8plq7EuH+hy/UnNzZU3KM8ye7JLUtXY3C0yv6l9yKc7UposrSvFz4FcA76 7mWHJIeClbdCo9/oyLjecEMijjI+pJjh8KkMLtAn2UsDZqY/zSKkEQ+V5//sa1RChniX xZ/sy6O8Czdc7fsWz22DGockcwdSeSpxsVUUjwl0WthOaWN+imY0Qq647On6a55zV73X YuJSwaFETf5hh1iTEz/kNDlCL/C91aOTAQfmm8h9agHEtUNUGZnIMaEeEM6LgSt7/aN0 q0tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679899087; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dtXDrONNoUH1LYaw28uYWKL+C8oRgSppxzEm6SyxfKA=; b=hW+NmOb9aflrnrN6HimeH/QFUY+B53KqwT+Xcr1kQ5L0oFH6i0Bvz3v8l2FpMh+mez 1DAF2wuShsSTFl2CnTNU7VZrbkwlDZP/q2ON4jlVYSwW0uyeiOgAyxAlh+ZtQbnsjwvS vUk1u4UADJXjzvnajWotSR13t/0Y9ONGMZFQPLc4yjb+Impg36v6f0Oqd5rgmWNFyzNQ 8trSKFdl/QYHNlYiGyvo3JSCH/RcbF8Eg3NbWdNu5iB5x9aRYOgFyXcvIJWPxxa9wM2U wuroNQ0MJ7P6x7M93w3+fEw0YRu11ENm2b/jtn6UgFYgN3WuFbMxALG2r19mPC2DTXCV WaNA== X-Gm-Message-State: AO0yUKXWgX68ui3fhnjv89xDuqyvRlsYLmkFYgrmWvDd4BpEUtcSmKOP m+PzoLelyn/4Je3VekUPWSDTGux4SHkjjB6ROuY= X-Google-Smtp-Source: AK7set/WrrioOR+HKYD/OduYJegmZWSiP1WvtdZuQuz75yBqEXZc7pNwNWQIcAL+slcHl1fc4z10fZIbmZkof70uyOU= X-Received: by 2002:aca:e0c5:0:b0:386:a6a7:a640 with SMTP id x188-20020acae0c5000000b00386a6a7a640mr2402729oig.2.1679899087366; Sun, 26 Mar 2023 23:38:07 -0700 (PDT) MIME-Version: 1.0 References: <3032112.1679865718@sss.pgh.pa.us> In-Reply-To: <3032112.1679865718@sss.pgh.pa.us> From: Amit Kapila Date: Mon, 27 Mar 2023 12:07:55 +0530 Message-ID: Subject: Re: Support logical replication of DDLs To: Tom Lane Cc: vignesh C , Ajin Cherian , "houzj.fnst@fujitsu.com" , "wangw.fnst@fujitsu.com" , Runqi Tian , Peter Smith , li jie , Dilip Kumar , Alvaro Herrera , Masahiko Sawada , Japin Li , rajesh singarapu , PostgreSQL Hackers , Zheng Li Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Mon, Mar 27, 2023 at 2:52=E2=80=AFAM Tom Lane wrote: > > vignesh C writes: > > [ YA patch set ] > ... > > I'm also less than sold on the technical details, specifically > the notion of "let's translate utility parse trees into JSON and > send that down the wire". You can probably make that work for now, > but I wonder if it will be any more robust against cross-version > changes than just shipping the outfuncs.c representation. (Perhaps > it can be made more robust than the raw parse trees, but I see no > evidence that anyone's thought much about how.) > AFAIR, we have discussed this aspect. For example, in email [1] and other follow-on emails, there is some discussion on the benefits of using JSON over outfuncs.c. Then also various senior members seem to be in favor of using JSON format because of the flexibility it brings [2]. The few points that I could gather from the discussion are as follows: (a) it is convenient to transform the JSON format, for example, if one wants to change the schema in the command before applying it on the downstream node; (b) parse-tree representation would be less portable across versions as compared to JSON format, say if the node name or some other field is changed in the parsetree; (c) a JSON format string would be easier to understand for logical replication consumers which don't understand the original parsetree; (d) as mentioned in [1], we sometimes need to transform the command into multiple sub-commands or filter part of it which I think will be difficult to achieve with parsetree and outfuncs.c. > And TBH, I don't think that I quite believe the premise in the > first place. The whole point of using logical rather than physical > replication is that the subscriber installation(s) aren't exactly like > the publisher. Given that, how can we expect that automated DDL > replication is going to do the right thing often enough to be a useful > tool rather than a disastrous foot-gun? > One of the major use cases as mentioned in the initial email was for online version upgrades. And also, people would be happy to automatically sync the schema for cases where the logical replication is set up to get a subset of the data via features like row filters. Having said that, I agree with you that it is very important to define the scope of this feature if we want to see it becoming reality. The more you expand the scope > of what gets replicated, the worse that problem becomes --- for > example, I don't buy for one second that "let's replicate roles" > is a credible solution for the problems that come from the roles > not being the same on publisher and subscriber. > > I'm not sure how we get from here to a committable and useful feature, > but I don't think we're close to that yet, and I'm not sure that minor > iterations on a 2MB patchset will accomplish much. I'm afraid that > a whole lot of work is going to end up going down the drain, which > would be a shame because surely there are use-cases here. > I think the idea was to build a POC to see what kind of difficulties we may face down the road. I also don't think we can get all of this in one version or rather some of this may not be required at all but OTOH it gives us a good idea of problems we may need to solve and allow us to evaluate if the base design is extendable enough. > I suggest taking a couple of steps back from the minutiae of the > patch, and spending some hard effort thinking about how the thing > would be controlled in a useful fashion (that is, a real design for > the filtering that was mentioned at the very outset), and about the > security issues, and about how we could get to a committable patch. > Agreed. I'll try to summarize the discussion we have till now on this and share my thoughts on the same in a separate email. Thanks for paying attention to this work! [1] - https://www.postgresql.org/message-id/OS0PR01MB571684CBF660D05B63B441= 2C94AB9%40OS0PR01MB5716.jpnprd01.prod.outlook.com [2] - https://www.postgresql.org/message-id/CA%2BTgmoauXRQ3yDZNGTzXv_m1kdUn= H1Ww%2BhwKmKUSjtyBh0Em2Q%40mail.gmail.com --=20 With Regards, Amit Kapila.