From: Ron Johnson
Date: Fri, 22 May 2026 14:41:42 -0400
Subject: Re: Request For Feature: pg_dump
To: Pgsql-admin
References: <26493.1779468779@sss.pgh.pa.us> <7eb3ea21-5bf9-42bd-ac11-0fdc6f48866d@jakobs.com>
In-Reply-To: <7eb3ea21-5bf9-42bd-ac11-0fdc6f48866d@jakobs.com>

On Fri, May 22, 2026 at 2:09 PM Holger Jakobs <holger@jakobs.com> wrote:

On 22.05.26 at 19:20, Ron Johnson wrote:

On Fri, May 22, 2026 at 12:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ron Johnson <ronljohnsonjr@gmail.com> writes:
> In --format=directory mode, remove .dat files with zero data records, and
> mark that table's toc.dat entry that it's an empty table.

> Justification: *lots* of empty tables means *lots* of teeny-tiny files in
> the DB's dump directory.  That unnecessarily bloats the fs, and makes "du
> -c" really really slow.

Evidence please?  Most file systems that I've looked at optimize
zero-size files pretty well.

They aren't zero bytes.
It's those pesky 5-byte files (14 bytes once zstd-compressed, or whatever size gzip and lz4 produce).  66 thousand tiny files plus 8 thousand files with data in them make for a 2.4 MB directory.  That's big and slow.

$ find . -size 14c | wc
  66180   66180 1191240

$ zstd -dk 2115841.dat.zst
2115841.dat.zst     : 5 bytes

$ cat 2115841.dat
\.

$ dir | grep " 14 " | head -n20
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115841.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115842.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115843.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115844.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115845.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115851.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115899.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115901.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115902.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115903.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115905.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115907.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115909.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115913.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115915.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115917.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115919.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115923.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115926.dat.zst
-rw-r--r-- 1 postgres postgres         14 2026-05-22 00:50:30 2115931.dat.zst
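
For anyone who wants to reproduce this kind of count on their own dump directory, here is a rough sketch, assuming a POSIX shell and find. The function name count_tiny_files is made up for illustration and is not part of pg_dump; the 14-byte limit in the usage line simply matches the zstd files listed above.

```shell
#!/bin/sh
# count_tiny_files DIR MAX_BYTES
# Hypothetical helper (not part of pg_dump): prints how many regular
# files under DIR are at most MAX_BYTES bytes long.
count_tiny_files() {
    dir=$1
    max_bytes=$2
    # find's "-size -Nc" matches files strictly smaller than N bytes,
    # so add 1 to make the limit inclusive.
    # Counting lines assumes file names contain no newlines, which holds
    # for the numeric names in a dump directory.
    find "$dir" -type f -size "-$((max_bytes + 1))c" | wc -l | tr -d ' '
}
```

Usage would look like `count_tiny_files . 14`, run from inside the dump directory.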
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

Maybe just not compressing empty files would already do the job.


The files aren't empty, though, since they have the terminating "\."

I think any file below a certain size isn't worth compressing.
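
To make the size-threshold idea concrete, here is a minimal sketch done outside pg_dump, as a post-processing step on plain files. The helper name compress_if_worthwhile and the 64-byte default are assumptions chosen for illustration. Nothing in this thread says pg_dump has such an option, and whether pg_restore accepts a directory that mixes compressed and uncompressed data files would need to be checked first.

```shell
#!/bin/sh
# compress_if_worthwhile FILE [MIN_BYTES]
# Hypothetical helper: compresses FILE with zstd only when it is at
# least MIN_BYTES long (default 64, an arbitrary illustrative value).
compress_if_worthwhile() {
    file=$1
    min_bytes=${2:-64}
    # wc -c reports the size in bytes; strip the padding some wc
    # implementations add.
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -lt "$min_bytes" ]; then
        # Too small to benefit: leave the file uncompressed.
        echo "skip $file ($size bytes)"
        return 0
    fi
    # -q: quiet; --rm: remove the source file after successful compression.
    zstd -q --rm "$file"
}
```

With the 5-byte file shown earlier, `compress_if_worthwhile 2115841.dat` would leave the file as it is instead of growing it to 14 bytes.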


--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!