public inbox for pgsql-bugs@postgresql.org  
help / color / mirror / Atom feed
BUG #19483: pg_upgrade fails with orphan records in pg_init_priv catalog table
2+ messages / 2 participants
[nested] [flat]

* BUG #19483: pg_upgrade fails with orphan records in pg_init_priv catalog table
@ 2026-05-18 05:06 PG Bug reporting form <noreply@postgresql.org>
  2026-05-20 13:06 ` Re: BUG #19483: pg_upgrade fails with orphan records in pg_init_priv catalog table Greg Sabino Mullane <htamfids@gmail.com>
  0 siblings, 1 reply; 2+ messages in thread

From: PG Bug reporting form @ 2026-05-18 05:06 UTC (permalink / raw)
  To: pgsql-bugs@lists.postgresql.org; +Cc: huseyin.d3r@gmail.com

The following bug has been logged on the website:

Bug reference:      19483
Logged by:          Hüseyin Demir
Email address:      huseyin.d3r@gmail.com
PostgreSQL version: 18.3
Operating system:   CentOS 9
Description:        

Hi,

I encountered a problem during the pg_upgrade.

pg_upgrade --check passes all consistency checks but then fails during
the actual upgrade when pg_restore attempts to GRANT privileges to a
role OID (e.g. "16424") that no longer exists. The root cause is orphan
records in pg_init_privs whose ACL entries reference a dropped role by
OID. pg_upgrade does not detect these dangling references during its
pre-flight checks.

Steps to Reproduce

1. Create a role and database with that role as owner:

   CREATE ROLE benchmark_owner SUPERUSER;
   CREATE DATABASE my_db OWNER benchmark_owner;

2. Connect to the database and install pg_wait_sampling as that role:

   \c my_db
   SET ROLE benchmark_owner;
   CREATE EXTENSION pg_wait_sampling;
   RESET ROLE;

3. Verify pg_init_privs now contains rows with benchmark_owner in the
   ACL (privtype = 'e', OIDs 16429, 16434, 16439, 16443, etc.)

4. Reassign owned objects and drop the role:

   \c postgres
   SELECT pg_terminate_backend(pid)
     FROM pg_stat_activity WHERE datname = 'my_db';
   ALTER DATABASE my_db RENAME TO my_db_v2;
   \c my_db_v2
   REASSIGN OWNED BY benchmark_owner TO postgres;
   DROP OWNED BY benchmark_owner;
   \c postgres
   DROP ROLE benchmark_owner;

5. Verify orphan records remain in pg_init_privs:

   SELECT pip.objoid
   FROM pg_init_privs pip
   CROSS JOIN LATERAL aclexplode(pip.initprivs) ace
   LEFT JOIN pg_authid a ON a.oid = ace.grantee
   WHERE a.oid IS NULL AND ace.grantee <> 0;

   -- Returns 22 rows referencing OID 16461 (the dropped role)

6. Run pg_upgrade --check:

   /usr/pgsql-18/bin/pg_upgrade \
     --old-datadir=/var/lib/pgsql/14/data \
     --new-datadir=/var/lib/pgsql/18/data \
     --old-bindir=/usr/pgsql-14/bin/ \
     --new-bindir=/usr/pgsql-18/bin/ \
     --check --jobs=4

   Result: "Clusters are compatible" -- all checks pass.

[postgres@localhost data]$ /usr/pgsql-18/bin/pg_upgrade
--old-datadir=/var/lib/pgsql/14/data --new-datadir=/var/lib/pgsql/18/data
--old-bindir=/usr/pgsql-14/bin/ --new-bindir=/usr/pgsql-18/bin/ --check
--jobs=4
Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking cluster versions                                     ok
Checking database connection settings                         ok
Checking database user is the install user                    ok
Checking for prepared transactions                            ok
Checking for contrib/isn with bigint-passing mismatch         ok
Checking data type usage                                      ok
Checking for not-null constraint inconsistencies              ok
Checking for presence of required libraries                   ok
Checking database user is the install user                    ok
Checking for prepared transactions                            ok
Checking for new cluster tablespace directories               ok

*Clusters are compatible*

7. Run pg_upgrade (without --check):

[postgres@localhost data]$ /usr/pgsql-18/bin/pg_upgrade
--old-datadir=/var/lib/pgsql/14/data --new-datadir=/var/lib/pgsql/18/data
--old-bindir=/usr/pgsql-14/bin/ --new-bindir=/usr/pgsql-18/bin/ --jobs=4
Performing Consistency Checks
-----------------------------
Checking cluster versions                                     ok
Checking database connection settings                         ok
Checking database user is the install user                    ok
Checking for prepared transactions                            ok
Checking for contrib/isn with bigint-passing mismatch         ok
Checking data type usage                                      ok
Checking for not-null constraint inconsistencies              ok
Creating dump of global objects                               ok
Creating dump of database schemas
                                                              ok
Checking for presence of required libraries                   ok
Checking database user is the install user                    ok
Checking for prepared transactions                            ok
Checking for new cluster tablespace directories               ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Setting locale and encoding for new cluster                   ok
Analyzing all rows in the new cluster                         ok
Freezing all rows in the new cluster                          ok
Deleting files from new pg_xact                               ok
Copying old pg_xact to new server                             ok
Setting oldest XID for new cluster                            ok
Setting next transaction ID and epoch for new cluster         ok
Deleting files from new pg_multixact/offsets                  ok
Copying old pg_multixact/offsets to new server                ok
Deleting files from new pg_multixact/members                  ok
Copying old pg_multixact/members to new server                ok
Setting next multixact ID and offset for new cluster          ok
Resetting WAL archives                                        ok
Setting the default char signedness for new cluster           ok
Setting frozenxid and minmxid counters in new cluster         ok
Restoring global objects in the new cluster                   ok
Restoring database schemas in the new cluster
  my_db_v2
*failure*

Consult the last few lines of
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log"
for
the probable cause of the failure.
Failure, exiting

*failure*

Consult the last few lines of
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16384.log"
for
the probable cause of the failure.
Failure, exiting

child process exited abnormally: status 256
Failure, exiting
[postgres@localhost data]$

The pg_upgrade log as follow
[postgres@localhost data]$ cat
/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log
command: "/usr/pgsql-18/bin/pg_dump" --host /var/lib/pgsql/18/data --port
50432 --username postgres --no-data --sequence-data  --quote-all-identifiers
--binary-upgrade --format=custom --statistics --no-sync
--file="/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/dump/pg_upgrade_dump_16425.custom"
'dbname=my_db_v2' >>
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log"
2>&1


command: "/usr/pgsql-18/bin/pg_restore" --host /var/lib/pgsql/18/data --port
50432 --username postgres --create --exit-on-error --verbose
--transaction-size=250 --dbname template1
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/dump/pg_upgrade_dump_16425.custom"
>>
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log"
2>&1
pg_restore: connecting to database for restore
pg_restore: executing SELECT pg_catalog.set_config('search_path', '',
false);
pg_restore: creating DATABASE "my_db_v2"
pg_restore: connecting to new database "my_db_v2"
pg_restore: executing SELECT pg_catalog.set_config('search_path', '',
false);
pg_restore: creating DATABASE PROPERTIES "my_db_v2"
pg_restore: connecting to new database "my_db_v2"
pg_restore: executing SELECT pg_catalog.set_config('search_path', '',
false);
pg_restore: creating pg_largeobject "pg_largeobject"
pg_restore: creating SCHEMA "public"
pg_restore: creating COMMENT "SCHEMA "public""
pg_restore: creating EXTENSION "pg_wait_sampling"
pg_restore: creating COMMENT "EXTENSION "pg_wait_sampling""
pg_restore: creating FUNCTION "public.pg_wait_sampling_get_current(integer)"
pg_restore: creating FUNCTION "public.pg_wait_sampling_get_history()"
pg_restore: creating FUNCTION "public.pg_wait_sampling_get_profile()"
pg_restore: creating FUNCTION "public.pg_wait_sampling_reset_profile()"
pg_restore: creating VIEW "public.pg_wait_sampling_current"
pg_restore: creating VIEW "public.pg_wait_sampling_history"
pg_restore: creating VIEW "public.pg_wait_sampling_profile"
pg_restore: creating ACL "SCHEMA "public""
pg_restore: creating ACL "public.FUNCTION
"pg_wait_sampling_reset_profile"()"
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 4338; 0 0 ACL FUNCTION
"pg_wait_sampling_reset_profile"() postgres
pg_restore: error: could not execute query: ERROR:  role "16424" does not
exist
Command was: SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);
REVOKE ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() FROM
PUBLIC;
REVOKE ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() FROM
"postgres";
SET SESSION AUTHORIZATION "16424";
GRANT ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() TO
"16424";
RESET SESSION AUTHORIZATION;
SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);
REVOKE ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() FROM
"16424";
GRANT ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() TO
"postgres";

Root Cause

When a role is dropped, DROP OWNED BY and REASSIGN OWNED BY do NOT
cascade to pg_init_privs. This leaves ACL entries in pg_init_privs
referencing the dropped role by its numeric OID.

pg_dump (run internally by pg_upgrade) faithfully dumps these orphan
ACL entries and emits SET SESSION AUTHORIZATION "<OID>" commands.
pg_restore then fails because the numeric OID is not a valid role name
in the new cluster.

pg_upgrade's --check phase does not include a check for orphan role
references inside pg_init_privs ACLs, so the problem is not caught
before the destructive upgrade steps begin.

Expected Behavior

pg_upgrade --check should detect orphan ACL entries in pg_init_privs
(where ace.grantee does not match any pg_authid.oid and is not 0) and
report an error before declaring the clusters compatible.

Possible Fix

Add a new pre-flight check in pg_upgrade (src/bin/pg_upgrade/check.c)
that runs the following query against each user database in the old
cluster:

   SELECT DISTINCT pip.objoid
   FROM pg_init_privs pip
   CROSS JOIN LATERAL aclexplode(pip.initprivs) ace
   LEFT JOIN pg_authid a ON a.oid = ace.grantee
   WHERE a.oid IS NULL AND ace.grantee <> 0;

If any rows are returned, pg_upgrade should abort with a descriptive
error message telling the user to either drop or clean up the orphan
ACL entries before upgrading.

Workaround

Before running pg_upgrade, manually clean up orphan pg_init_privs
entries in all user databases:

   -- Identify affected objects
   SELECT pip.objoid
   FROM pg_init_privs pip
   CROSS JOIN LATERAL aclexplode(pip.initprivs) ace
   LEFT JOIN pg_authid a ON a.oid = ace.grantee
   WHERE a.oid IS NULL AND ace.grantee <> 0;

   -- Remove the orphan rows (adjust WHERE clause as needed)
   DELETE FROM pg_init_privs
   WHERE objoid IN (
     SELECT pip.objoid
     FROM pg_init_privs pip
     CROSS JOIN LATERAL aclexplode(pip.initprivs) ace
     LEFT JOIN pg_authid a ON a.oid = ace.grantee
     WHERE a.oid IS NULL AND ace.grantee <> 0
   );

Note: Direct modification of system catalogs requires
allow_system_table_mods = on.


How should we proceed or solve this case ?








^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: BUG #19483: pg_upgrade fails with orphan records in pg_init_priv catalog table
  2026-05-18 05:06 BUG #19483: pg_upgrade fails with orphan records in pg_init_priv catalog table PG Bug reporting form <noreply@postgresql.org>
@ 2026-05-20 13:06 ` Greg Sabino Mullane <htamfids@gmail.com>
  0 siblings, 0 replies; 2+ messages in thread

From: Greg Sabino Mullane @ 2026-05-20 13:06 UTC (permalink / raw)
  To: huseyin.d3r@gmail.com; pgsql-bugs@lists.postgresql.org

>
> PostgreSQL version: 18.3
> ...

5. Verify orphan records remain in pg_init_privs:erprise Postgres Software
> Products & Tech Support
>

Thanks for providing a failing use case. I ran this on a 18.3 server and
found no orphaned rows - but I used the pg_stat_statements extension
instead of pg_wait_sampling. Could you try your experiment using
pg_stat_statements? And could you also show us the contents of the errant
rows in pg_init_privs for the failing case?


^ permalink  raw  reply  [nested|flat] 2+ messages in thread


end of thread, other threads:[~2026-05-20 13:06 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-05-18 05:06 BUG #19483: pg_upgrade fails with orphan records in pg_init_priv catalog table PG Bug reporting form <noreply@postgresql.org>
2026-05-20 13:06 ` Greg Sabino Mullane <htamfids@gmail.com>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox