Re: RFC: Allow EXPLAIN to Output Page Fault Information

public inbox for pgsql-hackers@postgresql.org  
help / color / mirror / Atom feed

From: Jelte Fennema-Nio <postgres@jeltef.nl>
To: Tom Lane <tgl@sss.pgh.pa.us>
To: Bruce Momjian <bruce@momjian.us>
Cc: torikoshia <torikoshia@oss.nttdata.com>
Cc: Pgsql Hackers <pgsql-hackers@postgresql.org>
Subject: Re: RFC: Allow EXPLAIN to Output Page Fault Information
Date: Mon, 30 Dec 2024 23:57:14 +0100
Message-ID: <D6PENGW1PL79.266GQNPSMRIGA@jeltef.nl> (raw)
In-Reply-To: <3861403.1735576764@sss.pgh.pa.us>
References: <c20f6340eb26f3b736abc59471bfada8@oss.nttdata.com>
	<1614321.1735055528@sss.pgh.pa.us>
	<D6MJOHS7HZ80.3B17NDGUV6T47@jeltef.nl>
	<Z3LII9YEgp3kAZG-@momjian.us>
	<3861403.1735576764@sss.pgh.pa.us>

On Mon Dec 30, 2024 at 5:39 PM CET, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>> I certainly would love to see storage I/O numbers as distinct from
>> kernel read I/O numbers.
>
> Me too, but I think it is 100% wishful thinking to imagine that
> page fault counts match up with that.  

Okay I played around with this patch a bit, in hopes of proving you
wrong. But I now agree with you. I cannot seem to get any numbers out of
this that make sense.

The major page fault numbers are always zero, even after running:

echo 1 > /proc/sys/vm/drop_caches

If Takahori has a way to get some more useful insights from this patch,
I'm quite interested in the steps he took (I might very well have missed
something obvious).

**However, I think the general direction has merit**: Changing this patch to
use `ru_inblock`/`ru_oublock` gives very useful insights. `ru_inblock`
is 0 when everything is in page cache, and it is very high when stuff is
not. I was only hacking around and basically did this:

s/ru_minflt/ru_inblock/g
s/ru_majflt/ru_oublock/g

Obviously more is needed. We'd probably want to show these numbers in
useful units like MB or something. Also, maybe there's some better way
of getting read/write numbers for the current process than
ru_inblock/ru_oublock (but this one seems to work at least reasonably
well).

One other thing that I noticed when playing around with this, which
would need to be addressed: Parallel workers need to pass these values
to the main process somehow, otherwise the IO from those processes gets lost.

For the record, the queries I used to test this patch were:

create table t_big(a int, b text);
insert into t_big SELECT i, repeat(i::text, 200) FROM generate_series(1, 3000000) i;
explain (ANALYZE, PAGEFAULTS) select max(a), max(b) from t_big;
explain (analyze, PAGEFAULTS) insert into t_big SELECT i, repeat(i::text, 200) FROM generate_series(1, 3000000) i;

And then seeing if there was any difference in the explain analyze
output after running the following (as root):

echo 1 > /proc/sys/vm/drop_caches

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: pgsql-hackers@postgresql.org
  Cc: postgres@jeltef.nl, tgl@sss.pgh.pa.us, bruce@momjian.us, torikoshia@oss.nttdata.com
  Subject: Re: RFC: Allow EXPLAIN to Output Page Fault Information
  In-Reply-To: <D6PENGW1PL79.266GQNPSMRIGA@jeltef.nl>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox