Re: Request For Feature: pg_dump

public inbox for pgsql-admin@postgresql.org  

From: Holger Jakobs <holger@jakobs.com>
To: pgsql-admin@lists.postgresql.org
Subject: Re: Request For Feature: pg_dump
Date: Fri, 22 May 2026 20:09:32 +0200
Message-ID: <7eb3ea21-5bf9-42bd-ac11-0fdc6f48866d@jakobs.com>
In-Reply-To: <CANzqJaCJGS9S=ibJ_rOmP=acA7n0XCA5R4FbMH+AH3WURhey8Q@mail.gmail.com>
References: <CANzqJaCNaqhQ9duPP+eKU2OR1wjGYW-FBQ2FZXqnBRb6XAnKQA@mail.gmail.com>
	<26493.1779468779@sss.pgh.pa.us>
	<CANzqJaCJGS9S=ibJ_rOmP=acA7n0XCA5R4FbMH+AH3WURhey8Q@mail.gmail.com>

On 22.05.26 at 19:20, Ron Johnson wrote:
> On Fri, May 22, 2026 at 12:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>     Ron Johnson <ronljohnsonjr@gmail.com> writes:
>     > In --format=directory mode, remove .dat files with zero data records, and
>     > mark that table's toc.dat entry that it's an empty table.
>
>     > Justification: *lots* of empty tables means *lots* of teeny-tiny files in
>     > the DB's dump directory.  That unnecessarily bloats the fs, and makes "du
>     > -c" really really slow.
>
>     Evidence please?  Most file systems that I've looked at optimize
>     zero-size files pretty well.
>
>
> They aren't zero bytes.
> It's those pesky 5 (or 14 or whatever size that gzip and lz4 produces) 
> byte files.  66 thousand tiny files plus 8 thousand files with data in 
> them makes for a 2.4MB directory.  That's big and slow.
>
> $ find . -size 14c | wc
>   66180   66180 1191240
>
> $ zstd -dk 2115841.dat.zst
> 2115841.dat.zst     : 5 bytes
>
> $ cat 2115841.dat
> \.
>
> $ dir | grep " 14 " | head -n20
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115841.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115842.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115843.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115844.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115845.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115851.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115899.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115901.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115902.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115903.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115905.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115907.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115909.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115913.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115915.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115917.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115919.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115923.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115926.dat.zst
> -rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115931.dat.zst
> -- 
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!

Maybe simply not compressing empty files would already do the job. I
think any file below a certain size isn't worth compressing.
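
To illustrate the idea, here is a rough sketch in Python. It is not how
pg_dump works internally. The function name and the 64-byte cutoff are
made up for the example, gzip from the standard library stands in for
whichever compressor the dump uses, and the 5-byte payload is assumed to
be the end-of-data marker followed by newlines, as the output quoted
above suggests.

```python
import gzip

# Hypothetical cutoff; a sensible value would have to be measured.
MIN_COMPRESS_SIZE = 64

# Assumed content of an empty table's data file: "\." plus newlines (5 bytes).
EMPTY_TABLE_DATA = b"\\.\n\n\n"


def maybe_compress(data: bytes, threshold: int = MIN_COMPRESS_SIZE) -> tuple[bytes, bool]:
    """Return (payload, was_compressed).

    Payloads shorter than `threshold` are passed through unchanged,
    because the compressor's fixed header and trailer would make them
    larger rather than smaller.
    """
    if len(data) < threshold:
        return data, False
    return gzip.compress(data), True


if __name__ == "__main__":
    raw_size = len(EMPTY_TABLE_DATA)
    gz_size = len(gzip.compress(EMPTY_TABLE_DATA))
    print(f"raw: {raw_size} bytes, gzip: {gz_size} bytes")

    payload, compressed = maybe_compress(EMPTY_TABLE_DATA)
    print(f"stored {len(payload)} bytes, compressed={compressed}")
```

With a cutoff like this, the empty-table payload would be stored as its
5 raw bytes instead of a larger compressed file.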

Regards,

Holger

-- 

Holger Jakobs
