public inbox for pgsql-hackers@postgresql.org  
help / color / mirror / Atom feed
From: Jelte Fennema-Nio <postgres@jeltef.nl>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Andres Freund <andres@anarazel.de>
Cc: torikoshia <torikoshia@oss.nttdata.com>
Cc: pgsql-hackers@postgresql.org
Cc: rjuju123@gmail.com
Cc: Bruce Momjian <bruce@momjian.us>
Subject: Re: RFC: Allow EXPLAIN to Output Page Fault Information
Date: Sun, 9 Feb 2025 21:06:02 +0100
Message-ID: <CAGECzQTm6oEEY4yO_FO0ZBtUuJX+pYiXhw2GPPRMzq_5DP5_fQ@mail.gmail.com> (raw)
In-Reply-To: <2035079.1739124342@sss.pgh.pa.us>
References: <D6MJOHS7HZ80.3B17NDGUV6T47@jeltef.nl>
	<Z3LII9YEgp3kAZG-@momjian.us>
	<3861403.1735576764@sss.pgh.pa.us>
	<D6PENGW1PL79.266GQNPSMRIGA@jeltef.nl>
	<1f22794321b745549d54359d343e37b8@oss.nttdata.com>
	<D6V9GJ2LOLC5.1ZF3BWKKEBZWC@jeltef.nl>
	<CAM6-o=BE=oewSsdNKrbbNjR2muzQR49STXsuta+Wpq6CSyxTag@mail.gmail.com>
	<2c9d6eaf26df17bec13bb03bf1e9bcbb@oss.nttdata.com>
	<CAGECzQRvcLx44N3zd_DGCjY02XX4AqXX8mq4BiS8C9Froy+Jhg@mail.gmail.com>
	<CAGECzQQdc-k=M2fMCKa98kVZntc=6d3rpd6edt8Qs45cayfUeQ@mail.gmail.com>
	<myw76agwrlmisvamwbfl6ibxgwh5glzitydiwnfmtb5aui232i@274yxxtnbnsp>
	<2035079.1739124342@sss.pgh.pa.us>

On Sun, 9 Feb 2025 at 19:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andres Freund <andres@anarazel.de> writes:
> > I'm somewhat against this patch, as it's fairly fundamentally incompatible
> > with AIO. There's no real way to get information in this manner if the IO
> > isn't executed synchronously in process context...

Hmm, I had not considered how this would interact with your AIO work.
I agree that getting this info would be hard/impossible to do
efficiently, when IOs are done by background IO processes that
interleave IOs from different queries. But I'd expect that AIOs that
are done using iouring would be tracked correctly without having to
change this code at all (because I assume those are done from the
query backend process).

One other thought: I think the primary benefit of this feature is
being able to see how many read IOs actually hit the disk, as opposed
to hitting OS page cache. That benefit disappears when using Direct
IO, because then there's no OS page cache.

How many years away do you think that widespread general use of
AIO+Direct IO is, though? I think that for the N years from now until
then, it would be very nice to have this feature to help debug query
performance problems. Then once the numbers become too
inaccurate/useless at some point, we could simply remove them again.

> Even without looking ahead to AIO, there's bgwriter, walwriter, and
> checkpointer processes that all take I/O load away from foreground
> processes.  I don't really believe that this will produce useful
> numbers.

The bgwriter, walwriter, and checkpointer should only take away
*write* IOs. For read IOs the numbers should be very accurate and as
explained above read IOs is where I think the primary benefit of this
feature is.

But even for write IOs I think the numbers would be useful when
looking at them with the goal of finding out why a particular query is
slow: If the bgwriter or checkpointer do the writes, then the query
should be roughly as fast as if no writes to the disk had taken place
at all, but if the query process does the writes then those writes are
probably blocking further execution of the query and thus slowing it
down.






reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: pgsql-hackers@postgresql.org
  Cc: postgres@jeltef.nl, tgl@sss.pgh.pa.us, andres@anarazel.de, torikoshia@oss.nttdata.com, rjuju123@gmail.com, bruce@momjian.us
  Subject: Re: RFC: Allow EXPLAIN to Output Page Fault Information
  In-Reply-To: <CAGECzQTm6oEEY4yO_FO0ZBtUuJX+pYiXhw2GPPRMzq_5DP5_fQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox