public inbox for pgsql-performance@postgresql.org
help / color / mirror / Atom feedFrom: Justin Pryzby <pryzby@telsasoft.com>
To: Samir Magar <samirmagar8@gmail.com>
Cc: pgsql-performance@postgresql.org
Cc: Pavel Stehule <pavel.stehule@gmail.com>
Subject: Re: query performance issue
Date: Wed, 15 Nov 2017 08:16:21 -0600
Message-ID: <20171115141621.GV2167@telsasoft.com> (raw)
In-Reply-To: <CAA=to3gnNReZ62m3K3qbN_KL8Hnmrg4wfb4LefZYqUa9v4OOSQ@mail.gmail.com> <CAA=to3gXpr94TqSAYXT9zwxYwY_xKrLFnu8ZJNgJKP-kWza2CA@mail.gmail.com>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-performance>
On Wed, Nov 15, 2017 at 03:03:39PM +0530, Samir Magar wrote:
> I am having performance issues with one of the query.
> The query is taking 39 min to fetch 3.5 mil records.
>
> I want to reduce that time to 15 mins.
> could you please suggest something to its performance?
> "HashAggregate (cost=4459.68..4459.69 rows=1 width=27) (actual time=2890035.403..2892173.601 rows=3489861 loops=1)"
Looks to me like the problem is here:
> " -> Index Only Scan using idxdq7 on dlr_qlfy (cost=0.43..4.45 ROWS=1 width=16) (actual time=0.009..0.066 ROWS=121 loops=103987)"
> " Index Cond: ((qlfy_grp_id = dlr_grp.dlr_grp_id) AND (qlf_flg = 'N'::bpchar) AND (cog_grp_id = dlr_grp_dlr_xref_1.dlr_grp_id))"
> " Heap Fetches: 0"
Returning 100x more rows than expected and bubbling up through a cascade of
nested loops.
Are those 3 conditions independent ? Or, perhaps, are rows for which
"qlfy_grp_id=dlr_grp.dlr_grp_id" is true always going to have
"cog_grp_id = dlr_grp_dlr_xref_1.dlr_grp_id" ?
Even if it's not "always" true, if rows which pass the one condition are more
likely to pass the other condition, this will cause an underestimate, as
obvserved.
You can do an experiment SELECTing just from those two tables joined and see if
you can reproduce the problem with poor rowcount estimate (hopefully in much
less than 15min).
If you can't drop one of the two conditions, you can make PG treat it as a
single condition for purpose of determining expected selectivity, using a ROW()
comparison like:
ROW(qlfy_grp_id, cog_grp_id) = ROW(dlr_grp.dlr_grp_id, dlr_grp_dlr_xref_1.dlr_grp_id)
If you're running PG96+ you may also be able to work around this by adding FKs.
Justin
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: pgsql-performance@postgresql.org
Cc: pryzby@telsasoft.com, samirmagar8@gmail.com, pavel.stehule@gmail.com
Subject: Re: query performance issue
In-Reply-To: <20171115141621.GV2167@telsasoft.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox