Date: Wed, 22 Jan 2020 00:41:41 +0100
From: Tomas Vondra <tomas.vondra@2ndquadrant.com>
To: Pavel Stehule <pavel.stehule@gmail.com>
Cc: Philippe BEAUDOIN <phb07@apra.asso.fr>,
	PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: proposal: schema variables
Message-ID: <20200121234141.se4v2hygsiy3lffx@development>
References: 
 <CAFj8pRDY+m9OOxfO10R7J0PAkCCauM-TweaTrdsrsLGMb1VbEQ@mail.gmail.com>
 <CAFj8pRAcb+PLKJJOGn7POtcTuH8DaKqZNid-7UvB9_BtgshHEQ@mail.gmail.com>
 <157772317031.1198.14690129684698137065.pgcf@coridan.postgresql.org>
 <CAFj8pRA+SYbegM6f5OTamwhMdsVXB22mbmkscaErrhYge8ji3A@mail.gmail.com>
 <CAFj8pRCAGXK9Wd8FGboFkFyUTWwh0qJUKW0muVAMkAUq6feCdA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Disposition: inline
In-Reply-To: 
 <CAFj8pRCAGXK9Wd8FGboFkFyUTWwh0qJUKW0muVAMkAUq6feCdA@mail.gmail.com>
Precedence: bulk

Hi,

I did a quick review of this patch today, so let me share a couple of
comments.

Firstly, the patch is pretty large - it has ~270kB. Not the largest
patch ever, but large. I think breaking the patch into smaller pieces
would significantly improve the chnce of it getting committed.

Is it possible to break the patch into smaller pieces that could be
applied incrementally? For example, we could move the "transactional"
behavior into a separate patch, but it's not clear to me how much code
would that actually move to that second patch. Any other parts that
could be moved to a separate patch?

I see one of the main contention points was a rather long discussion
about transactional vs. non-transactional behavior. I agree with Pavel's
position that the non-transactional behavior should be the default,
simply because it's better aligned with what the other databases are
doing (and supporting migrations seems like one of the main use cases
for this feature).

I do understand it may not be suitable for some other use cases,
mentioned by Fabien, but IMHO it's fine to require explicit
specification of transactional behavior. Well, we can't have both as
default, and I don't think there's an obvious reason why it should be
the other way around.

Now, a bunch of comments about the code (some of that nitpicking):


1) This hunk in doc/src/sgml/catalogs.sgml still talks about "table
creation" instead of schema creation:

      <row>
       <entry><structfield>vartypmod</structfield></entry>
       <entry><type>int4</type></entry>
       <entry></entry>
       <entry>
        <structfield>vartypmod</structfield> records type-specific data
        supplied at table creation time (for example, the maximum
        length of a <type>varchar</type> column).  It is passed to
        type-specific input functions and length coercion functions.
        The value will generally be -1 for types that do not need <structfield>vartypmod</structfield>.
       </entry>
      </row>


2) This hunk in doc/src/sgml/ref/alter_default_privileges.sgml uses
"role_name" instead of "variable_name"

     GRANT { READ | WRITE | ALL [ PRIVILEGES ] }
     ON VARIABLES
     TO { [ GROUP ] <replaceable class="parameter">role_name</replaceable> | PUBLIC } [, ...] [ WITH GRANT OPTION ]


3) I find the syntax in create_variable.sgml a bit too over-complicated:

<synopsis>
CREATE [ { TEMPORARY | TEMP } ] [ { TRANSACTIONAL | TRANSACTION } ] VARIABLE [ IF NOT EXISTS ] <replaceable class="parameter">name</replaceable> [ AS ] <replaceable class="parameter">data_type</replaceable> ] [ COLLATE <replaceable class="parameter">collation</replaceable> ]
     [ NOT NULL ] [ DEFAULT <replaceable class="parameter">default_expr</replaceable> ] [ { ON COMMIT DROP | ON { TRANSACTIONAL | TRANSACTION } END RESET } ]
</synopsis>

Do we really need both TRANSACTION and TRANSACTIONAL? Why not just one
that we already have in the grammar (i.e. TRANSACTION)?


4) I think we should rename schemavariable.h to schema_variable.h.


5) objectaddress.c has extra line after 'break;' in one switch.


6) The comment is wrong:

     /*
      * Find the ObjectAddress for a type or domain
      */
     static ObjectAddress
     get_object_address_variable(List *object, bool missing_ok)


7) I think the code/comments are really inconsistent in talking about
"variables" and "schema variables". For example in objectaddress.c we do
these two things:

         case OCLASS_VARIABLE:
             appendStringInfoString(&buffer, "schema variable");
             break;

vs.

         case DEFACLOBJ_VARIABLE:
             appendStringInfoString(&buffer,
                                    " on variables");
             break;

That's going to be confusing for people.


8) I'm rather confused by CMD_PLAN_UTILITY, which seems to be defined
merely to support LET. I'm not sure why that's necessary (Why wouldn't
CMD_UTILITY be sufficient?).

Having to add conditions checking for CMD_PLAN_UTILITY to various places
in planner.c is rather annoying, and I wonder how likely it's this will
unnecessarily break external code in extensions etc.


9) This comment in planner.c seems obsolete (not updated to reflect
addition of the CMD_PLAN_UTILITY check):

   /*
    * If this is an INSERT/UPDATE/DELETE, and we're not being called from
    * inheritance_planner, add the ModifyTable node.
    */
   if (parse->commandType != CMD_SELECT && parse->commandType != CMD_PLAN_UTILITY && !inheritance_update)


10) I kinda wonder what happens when a function is used in a WHERE
condition, but it depends on a variable and alsu mutates it on each
call ...


11) I think Query->hasSchemaVariable (in analyze.c) should be renamed to
hasSchemaVariables (which reflects the other fields referring to things
like window functions etc.)


12) I find it rather suspicious that we make decisions in utility.c
solely based on commandType (whether it's CMD_UTILITY or not). IMO
it's pretty strange/ugly that T_LetStmt can be both CMD_UTILITY and
CMD_PLAN_UTILITY:

     case T_LetStmt:
         {
             if (pstmt->commandType == CMD_UTILITY)
                 doLetStmtReset(pstmt);
             else
             {
                 Assert(pstmt->commandType == CMD_PLAN_UTILITY);
                 doLetStmtEval(pstmt, params, queryEnv, queryString);
             }

             if (completionTag)
                 strcpy(completionTag, "LET");
         }
         break;


13) Not sure why we moved DO_TABLE in addBoundaryDependencies
(pg_dump.c), seems unnecessary:

      case DO_CONVERSION:
-    case DO_TABLE:
+    case DO_VARIABLE:
      case DO_ATTRDEF:
+    case DO_TABLE:
      case DO_PROCLANG:


14) namespace.c defines VariableIsVisible twice:

     extern bool VariableIsVisible(Oid relid);
     ...
     extern bool VariableIsVisible(Oid varid);


15) I'd say lookup_variable and identify_variable should use camelcase
just like the other functions in the same file.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services