Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eF49l-0007SV-4q for pgsql-performance@arkaria.postgresql.org; Wed, 15 Nov 2017 20:19:33 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1eF49k-0005MN-EP for pgsql-performance@arkaria.postgresql.org; Wed, 15 Nov 2017 20:19:32 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1eF49Z-00057Q-Ez for pgsql-performance@postgresql.org; Wed, 15 Nov 2017 20:19:21 +0000 Received: from mail-io0-x22b.google.com ([2607:f8b0:4001:c06::22b]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1eF49W-00013v-5E for pgsql-performance@postgresql.org; Wed, 15 Nov 2017 20:19:19 +0000 Received: by mail-io0-x22b.google.com with SMTP id v21so2877676ioi.4 for ; Wed, 15 Nov 2017 12:19:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=telsasoft-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:subject:message-id:mime-version:content-disposition :user-agent; bh=5F8svIh0tWRDWMBHSIfnhdrV8hQWNn2ZUaLRSZBKst8=; b=K80m8SC7CQpm5upnSrEfhT88ToOewGmmQBu2irbPB5yVirQSjhWFBKqJF7k9LAj6bx wXqiTMpXHD+IWw+eIxjNxoJQbkOVDRtFhvo2X/Hh/C70rMLPKQ8RYL/5tNWriFpjpZ7r TyrNfYuCDUic5j9bTRoaRwdddxKMZIplB3PQZtybr6H//QXwjRf2LetryOOOlyRDvNKm VlIdUVb1vJ5wLDVSwqkycGk8TXIW4h98t9KSlCz051ALHbx3Q/sx0MksXoLIE5Nmq0xb 7YZgUSnlBYAofI0mUjjs6FVI7mw4WURpJPe0I1AKt5tFnO90lp6SlPvybwFRrz3L8myw TF5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:subject:message-id:mime-version :content-disposition:user-agent; bh=5F8svIh0tWRDWMBHSIfnhdrV8hQWNn2ZUaLRSZBKst8=; b=QBlwGEx0F69zHOBofpVc6YBOPs+KobEj+UjRcqB8YiPLqOT1OfKg1o7nncZI+qFndm a9WkAC3EH3a/V/Wf8TroC2AnkgcXHUEuatRs9hclWJjhvFlekrfpGwnnBtJ/blQYytfF 4aGv6Xwl0XHMxzkSg28q1mmQu54ZUFLBN69t/9Z2Rq0sBHB9bx1XFD46uV1K8yDTICRo bFIYMPFJ/aJfiNVBWoGz0yHH7yHQ+wgt/8slXzcvRzsHdBsW+HFMVunXSOkWUdQoeRCZ nLneL8ap1P1dCXJcBtxDvRf7l9UyoUPQmnBAMO4smJgYTBTxooXnsOrPc4gYaJQ8r2tT JKKQ== X-Gm-Message-State: AJaThX4nEOrwff+56rlqlHhMgkQ0RXVCpr8sUskaKw1P5MsEQsfWvLlB T44Y3UGoOI8Cx1PrBZDFM67a8Km0pJc= X-Google-Smtp-Source: AGs4zMYHrIrLrF5y3mPCaOM6SVghBxOz3ZwafxslecvdlGP9NJ5POrSb0Bdh9F2ykZda2TUZW+oL+g== X-Received: by 10.107.195.14 with SMTP id t14mr12740198iof.238.1510777154968; Wed, 15 Nov 2017 12:19:14 -0800 (PST) Received: from pryzbyj (charmander.telsasoft.com. [50.244.222.1]) by smtp.gmail.com with ESMTPSA id y198sm10065885iod.50.2017.11.15.12.19.14 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 15 Nov 2017 12:19:14 -0800 (PST) Received: by pryzbyj (Postfix, from userid 1000) id 96FED801040; Wed, 15 Nov 2017 14:19:13 -0600 (CST) Date: Wed, 15 Nov 2017 14:19:13 -0600 From: Justin Pryzby To: pgsql-performance@postgresql.org Subject: CREATE STATISTICS and join selectivity Message-ID: <20171115201913.GX2167@telsasoft.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org I just noticed that PG10 CREATE STATISTICS (dependencies) doesn't seem to work for joins on multiple columns; is that right? With statistics on table for 20171111 but not 20171110: ts=# CREATE STATISTICS x ON site_id,sect_id FROM eric_enodeb_cell_20171111; ts=# ANALYZE VERBOSE eric_enodeb_cell_20171111; ts=# explain ANALYZE SELECT FROM eric_enodeb_cell_20171110 a JOIN eric_enodeb_cell_20171110 b USING(start_time, sect_id) WHERE a.site_id=318 AND sect_id=1489; Nested Loop (cost=0.83..4565.09 rows=1 width=0) (actual time=23.595..69.541 rows=96 loops=1) => bad estimate on redundant WHERE WITHOUT multivar statistics ts=# explain ANALYZE SELECT FROM eric_enodeb_cell_20171111 a JOIN eric_enodeb_cell_20171111 b USING(start_time, sect_id) WHERE a.site_id=318 AND sect_id=1489; Nested Loop (cost=0.83..4862.41 rows=96 width=0) (actual time=0.034..3.882 rows=96 loops=1) => good estimate on redundant WHERE WITH multivar statistics ts=# explain ANALYZE SELECT FROM eric_enodeb_cell_20171110 a JOIN eric_enodeb_cell_20171110 b USING(start_time, sect_id); Merge Join (cost=18249.85..19624.18 rows=54858 width=0) (actual time=157.252..236.945 rows=55050 loops=1) => good estimate on JOIN on SECT_id without stats ts=# explain ANALYZE SELECT FROM eric_enodeb_cell_20171110 a JOIN eric_enodeb_cell_20171110 b USING(start_time, site_id); Merge Join (cost=0.83..14431.81 rows=261499 width=0) (actual time=0.031..259.382 rows=262638 loops=1) => good estimate on JOIN on SITE_id without stats ts=# explain ANALYZE SELECT FROM eric_enodeb_cell_20171111 a JOIN eric_enodeb_cell_20171111 b USING(start_time, site_id); Merge Join (cost=0.83..14706.29 rows=268057 width=0) (actual time=37.360..331.276 rows=268092 loops=1) => good estimate on JOIN on SITE_id with stats ts=# explain ANALYZE SELECT FROM eric_enodeb_cell_20171111 a JOIN eric_enodeb_cell_20171111 b USING(start_time, sect_id); Merge Join (cost=18560.89..19959.67 rows=55944 width=0) (actual time=130.865..198.439 rows=55956 loops=1) => good estimate on JOIN on SECT_id with stats ts=# explain ANALYZE SELECT FROM eric_enodeb_cell_20171111 a JOIN eric_enodeb_cell_20171111 b USING(start_time, sect_id, site_id); Gather (cost=1000.83..12222.06 rows=460 width=0) (actual time=1.686..149.707 rows=55956 loops=1) => poor estimate on redundant JOIN WITH stats (??) I've already fixed our reports to avoid this kind of thing and support our PG95 customers, but I tentatively would've expected PG10 MV stats to "know" that USING(site_id, sect_id) is no more selective than USING(sect_id), same as it knows that's true for WHERE site... AND sect.... Justin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance