Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eEMfg-0005V7-Rx for pgsql-performance@arkaria.postgresql.org; Mon, 13 Nov 2017 21:53:36 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1eEMfg-0006in-Aw for pgsql-performance@arkaria.postgresql.org; Mon, 13 Nov 2017 21:53:36 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1eEMdo-0003aB-Hu for pgsql-performance@postgresql.org; Mon, 13 Nov 2017 21:51:40 +0000 Received: from mail-io0-x22a.google.com ([2607:f8b0:4001:c06::22a]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1eEMdl-00049L-1T for pgsql-performance@postgresql.org; Mon, 13 Nov 2017 21:51:40 +0000 Received: by mail-io0-x22a.google.com with SMTP id x63so10071482ioe.6 for ; Mon, 13 Nov 2017 13:51:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=FNT8YP4nxknV5up2Z/yF4Pm1YwxGtdkKSiD2zyS3WSU=; b=vKfGDVXRGqgQeJIQuKz9WTbPU6AZ33YAr8MkqbWB3cHoxW/R9qR0D04omdmFgmNmOg o0pEbc9ZO6fnzZl3wzl+fl2OWTE0fAJd/41/srwT0DZ8NkFm/D0oaUc/azINkiakQOC1 ZQn1oZtaieNWyjLhb4M0o7UAyGvCbJbw5dpsWyQUVIqFLv6R+gqkxudRQiS72PHa8ZVa XrDpeDHkfyAgEr/XnaydFyQpVc8/tLHxmXYwkZCtY9PxAQCsxhboB8nVI5JbvKWIjYMd atfkjLIe2Uvir5FQMhsfGNySYQyWXPZcKYrroafRTbuldo843t5HViWLpIkl0Cxxd8IQ nUHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=FNT8YP4nxknV5up2Z/yF4Pm1YwxGtdkKSiD2zyS3WSU=; b=KIRB3llGAMSDUhGd0ZxMWEUO73gKIeJVdJco8avZyRqwywM5Z1vJ7vJJnywd6Fbyai zJvUcpNP5M5nd5NzghuyddBITRFXtBnHvk1Iza1YTAUk03z4C16IFmas/sl4lTCCL7M1 m0RH8bBh60elt7l6cF91UN7BdCmn4IFSMwSV7sgQ8cjGPx/n6zXxTv4Q4bSvc6ysvcey g5HUzKD0LJxfT7wQnuj9eTNvhs5nqguuzpznL1TdMIKe21uNM5xEZZXcm1s1HFZsN471 L/wxuoGE6DFmu2mq1lX+k8vFUpxmnjPubnJ10bqwTHdZrk75zgfWfRcLyYRkVXVHx9YH NDrw== X-Gm-Message-State: AJaThX79Q5+O4WMMCoXFg/Jt6PMlhOxknx0thK8PYnwBTErXlIY2NmWt H+5c9F1UrOff2FEbjsVaOFdDGw== X-Google-Smtp-Source: AGs4zMa+oQYC/mnE3KtUtMcx1HKBMyB2GJ1MjYR4wMJgZ1NOhFF/EbvWITyGYX4kvkdQoGXmSE2qJQ== X-Received: by 10.107.174.32 with SMTP id x32mr11998897ioe.44.1510609894508; Mon, 13 Nov 2017 13:51:34 -0800 (PST) Received: from mail-it0-f50.google.com (mail-it0-f50.google.com. [209.85.214.50]) by smtp.gmail.com with ESMTPSA id d1sm4604713iti.35.2017.11.13.13.51.33 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 13 Nov 2017 13:51:33 -0800 (PST) Received: by mail-it0-f50.google.com with SMTP id n134so10646807itg.0 for ; Mon, 13 Nov 2017 13:51:33 -0800 (PST) X-Received: by 10.36.218.69 with SMTP id z66mr1422570itg.131.1510609892813; Mon, 13 Nov 2017 13:51:32 -0800 (PST) MIME-Version: 1.0 Received: by 10.36.209.7 with HTTP; Mon, 13 Nov 2017 13:51:32 -0800 (PST) X-Originating-IP: [88.211.72.10] In-Reply-To: <252cab3895894337a3a88275f423fe0b@index.de> References: <252cab3895894337a3a88275f423fe0b@index.de> From: Oliver Mattos Date: Mon, 13 Nov 2017 21:51:32 +0000 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: Query planner gaining the ability to replanning after start of query execution. To: Arne Roland Cc: "pgsql-performance@postgresql.org" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org > Can you be more elaborate how you'd want to go about it? My initial approach would be to try to identify places in the plan where selectivity is seriously over or underestimated. I would reuse the instrumentation infrastructure's counts of filtered and returned tuples for each execnode, and periodically call back into the planner (for example at every power of 2 tuples processed). The planner would have a wrapper to clauselist_selectivity which somehow combines the existing estimate with the filtered and returned tuples so far. Exactly how to merge them isn't clear, but I could imagine using a poisson distribution to calculate the probability that the selectivity estimate is representative of the filtered and returned numbers, and then blending the two linearly based on that estimate. When the planner has re-estimated the cost of the current plan, a discount would be applied for the percentage of each execnode completed (rows processed / estimated rows), and all candidate plans compared. If another candidate plan is now lower cost, the current plan would be terminated[1] by setting a flag instructing each execnode to return as if it had reached the end of the input, although still caching the node selectivity values, and the new plan started from scratch. The old plan is kept within the query planner candidate list, together with it's cached selectivity values. If at some point it again is cheaper, it is started from scratch too. > Even if we know the cardinality is overestimated, we have no idea whether= the cardinality of a =3D 3 or b =3D 40 is wrong or they just correlate The goal would be not to know which is wrong, but to try each, discarding it if it turns out worse than we estimated. Processing a few hundred rows of each of 5 plans is tiny compared to a scan of 1M rows... [1]: An improvement here (with much more code complexity) is to keep multiple partially executed plans around, so that whichever one is most promising can be worked on, but can be halted and resumed later as selectivity (and hence cost) estimates change. On Mon, Nov 13, 2017 at 8:06 PM, Arne Roland wrote: > Hello, > > I'd love to have some sort of dynamic query feedback, yet it's very compl= icated to do it right. I am not convinced that changing the plan during a s= ingle execution is the right way to do it, not only because it sounds intru= sive to do crazy things in the executor, but also because don't understand = why the new plan should be any better than the old one. Can you be more ela= borate how you'd want to go about it? > > In your example (which presumably just has a single relation), we have no= notion of whether the scan returns no rows because we were unlucky, becaus= e just the first few pages were empty of matching rows (which in my experie= nce happens more often), or because the cardinality estimation is wrong. Ev= en if the cardinality estimation is wrong, we have no notion of which predi= cate or predicate combination actually caused the misestimation. If the fir= st few pages where empty, the same might happen with every order (so also w= ith every available indexscan). Imagine a very simple seqscan plan of > select * from mytab where a =3D 3 and b =3D 40 limit 1 > Even if we know the cardinality is overestimated, we have no idea whether= the cardinality of a =3D 3 or b =3D 40 is wrong or they just correlate, so= there is no notion of which is actually the cheapest plan. Usual workaroun= d for most of these queries is to add an order by (which has the nice addit= ion of having a deterministic result) with an appropriate complex index, us= ually resulting in indexscans. > > While we actually know more after the first execution of a nodes like mat= erialize, sort or hash nodes, I rarely encounter materialize nodes in the w= ild. Consequently that is the place where the work is usually already done,= which is especially true with the hash node. Even though it still might be= more optimal to switch from a mergejoin to a hashjoin in some cases, I dou= bt that's worth any work (and even less the maintenance). > > Best regards > Arne Roland > > -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-ow= ner@postgresql.org] On Behalf Of Oliver Mattos > Sent: Monday, November 13, 2017 5:45 PM > To: pgsql-performance@postgresql.org > Subject: [PERFORM] Query planner gaining the ability to replanning after = start of query execution. > > I am interested in giving the query planner the ability to replan (or re-= rank plans) after query execution has begun, based on the progression of th= e query so far. > > Example use case: > > * A LIMIT 1 query is planned using an expensive scan which the planner e= xpects to return a large number of results, and to terminate > early. The reality is the query actually produces no results, and > the scan must run to completion, potentially taking thousands of times lo= nger than expected. > > * If this plans costs were adjusted mid-execution to reflect the fact th= at the scan is producing far fewer rows than expected, then another query p= lan might come out ahead, which would complete far faster. > > > Has this been done before? Are there any pitfalls to beware of? > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > > > > > -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-ow= ner@postgresql.org] On Behalf Of Oliver Mattos > Sent: Monday, November 13, 2017 5:45 PM > To: pgsql-performance@postgresql.org > Subject: [PERFORM] Query planner gaining the ability to replanning after = start of query execution. > > I am interested in giving the query planner the ability to replan (or re-= rank plans) after query execution has begun, based on the progression of th= e query so far. > > Example use case: > > * A LIMIT 1 query is planned using an expensive scan which the planner e= xpects to return a large number of results, and to terminate > early. The reality is the query actually produces no results, and > the scan must run to completion, potentially taking thousands of times lo= nger than expected. > > * If this plans costs were adjusted mid-execution to reflect the fact th= at the scan is producing far fewer rows than expected, then another query p= lan might come out ahead, which would complete far faster. > > > Has this been done before? Are there any pitfalls to beware of? > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > > > --=20 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance